Abstract
In this paper, we introduce a methodology for annotating compositional operations in natural language text and describe the Generative Lexicon Mark-up Language (GLML), a mark-up language inspired by the Generative Lexicon (GL) model, for identifying such operations. While most annotation systems capture surface relationships, GLML captures the “compositional history” of argument selection relative to the predicate. We provide a brief overview of GL before moving on to our proposed methodology for annotating with GLML. The paper describes three main tasks: (i) Argument Selection and Coercion Annotation for the SemEval-2010 competition; (ii) Qualia Selection in modification constructions; and (iii) Type Selection in modification constructions and verb-noun combinations involving dot objects. The first task is based on atomic semantic types, while the other two exploit more fine-grained meaning parameters encoded in the Qualia Structure roles. We explain what each task comprises and include the XML format for annotated sample sentences. We show that by identifying and subsequently annotating the typing and subtyping shifts in these constructions, we gain insight into the workings of the general mechanisms of composition.