Building a Broad Infrastructure for Uniform Meaning Representations

Julia Bonn; Matthew Buchholz; Jayeol Chun; Andrew Cowell; William Croft; Lukas Denk; Sijia Ge; Jan Hajic; Kenneth Lai; James H. Martin; Skatje Myers; Alexis Palmer; Martha Palmer; Claire Benet Post; James Pustejovsky; Kristine Stenzel; Haibo Sun; Zdenka Uresova; Rosa Vallejos; Jens E. L. Van Gysel; Meagan Vigus; Nianwen Xue; Jin Zhao

Conference proceeding

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 2537-2547

International Conference on Computational Linguistics Language Resources and Evaluation

01/01/2024

Handle: https://hdl.handle.net/10192/79011

Abstract

This paper reports the first release of the UMR (Uniform Meaning Representation) data set. UMR is a graph-based meaning representation formalism consisting of a sentence-level graph and a document-level graph. The sentence-level graph represents predicate-argument structures, named entities, word senses, and aspectuality of events, as well as person and number information for entities. The document-level graph represents coreferential, temporal, and modal relations that go beyond sentence boundaries. UMR is designed to capture the commonalities and differences across languages; this is done through the use of a common set of abstract concepts, relations, and attributes as well as concrete concepts derived from words from individual languages. This UMR release includes annotations for six languages (Arapaho, Chinese, English, Kukama, Navajo, Sanapana) that vary greatly in terms of their linguistic properties and resource availability. We also describe ongoing efforts to enlarge this data set and extend it to other genres and modalities, and briefly describe the available infrastructure (UMR annotation guidelines and tools) that others can use to create similar data sets.

Subjects: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Language & Linguistics; Linguistics; Science & Technology; Computer Science; Social Sciences; Technology

Details

Title: Building a Broad Infrastructure for Uniform Meaning Representations
Creators: Julia Bonn - University of Colorado Boulder
Matthew Buchholz - University of Colorado Boulder
Jayeol Chun - Brandeis University
Andrew Cowell - University of Colorado Boulder
William Croft - University of New Mexico
Lukas Denk - University of New Mexico
Sijia Ge - University of Colorado Boulder
Jan Hajic - Charles University
Kenneth Lai - Brandeis University
James H. Martin - University of Colorado Boulder
Skatje Myers - University of Colorado Boulder
Alexis Palmer - University of Colorado Boulder
Martha Palmer - University of Colorado Boulder
Claire Benet Post - University of Colorado Boulder
James Pustejovsky - Brandeis University
Kristine Stenzel - University of Colorado Boulder
Haibo Sun - Brandeis University
Zdenka Uresova
Rosa Vallejos - University of New Mexico
Jens E. L. Van Gysel - University of New Mexico
Meagan Vigus - University of New Mexico
Nianwen Xue - Brandeis University
Jin Zhao - Brandeis University
Contributors: N Calzolari (Editor)
M Y Kan (Editor)
Hoste (Editor)
A Lenci (Editor)
S Sakti (Editor)
N Xue (Editor)
Publication Details: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 2537-2547
Series: International Conference on Computational Linguistics Language Resources and Evaluation
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 11
Grant note: NSF_2213805; NSF_2213804; NSF_IIS 1764048; NSF_1763926; LUAUS23283 / CNS Division of National Science Foundation LM2018101; LM2023062 / MSMT CR; Ministry of Education, Youth & Sports - Czech Republic LM2018101; LM2023062 / Czech Ministry of Education, Youth and Sports (MSMT CR)
Identifiers: 9924586150701921
Academic Unit: Michtom School of Computer Science; Benjamin and Mae Volen National Center for Complex Systems; Interdepartmental Program in Linguistics and Computational Linguistics
Language: English
Resource Type: Conference proceeding
