Abstract
This course introduces the construction of annotated linguistic corpora to serve the dual purposes of theoretical linguistic analysis and machine learning for NLP. It does so through a detailed exploration of the design and early construction of the Brandeis-Simmons Corpus of English VP (Verb Phrase) Ellipsis, the first syntactically annotated ellipsis corpus consisting primarily of transcriptions of naturally occurring spoken dialogue, as opposed to constructed text from newswire, journalistic essays, or fiction.