Statistical Parsing Exposed
This book develops techniques and methodologies for examining the complex systems that are lexicalized statistical parsing models. The primary idea is treating the model as data, which is not a particular method but a paradigm and a research methodology. I argue that lexicalized statistical parsing models have become increasingly complex and therefore require thorough scrutiny, both to achieve the scientific aim of understanding what has been built thus far and to achieve the combined scientific and engineering goal of using that understanding for progress. In this book, I take a particular, dominant type of parsing model and perform a macro analysis to reveal its core (and design a software engine that modularizes the periphery), and, crucially, also perform a detailed analysis, which provides for the first time a window onto the efficacy of specific parameters. These analyses have not only yielded insight into the core model but have also enabled the identification of inefficiencies in the baseline model, such that those inefficiencies can be reduced to form a more compact model, exploited to find a better-estimated model with higher accuracy, or both.