Chinese Verb Tense? Using English Parallel Data to Map Tense onto Chinese and Subsequent Tense Classification

Elizabeth Baran

Thesis

Open access

Brandeis University

Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences

2013

Handle:

https://hdl.handle.net/10192/24533

Abstract

We explore time in Chinese by mapping tense information from a manually aligned English parallel corpus onto Chinese verbs. We construct a detailed mapping procedure that captures tense in English through combinations of word tokens and parts of speech and then transfers that information onto verbs in Chinese. We explore the resulting Chinese data set and discuss the strengths and weaknesses of this mapping technique. Using this Chinese data set, augmented with tense, we attempt to automatically predict the tense of each Chinese verb using a Conditional Random Fields algorithm along with a suite of linguistic features. We include an algorithm for extracting time expressions and associating them with verbs, and we integrate its output as a feature in our tense prediction algorithm. We achieve a 34% accuracy gain over our baseline as well as a much deeper understanding of how tense can transfer between English and Chinese in a translation environment.

Keywords: Chinese language processing; Time Expression Recognition; Verb Tense; Natural Language Processing

Files and links (1)

BaranThesis2013 (PDF, 938.47 kB, open access)

Details

Title: Chinese Verb Tense? Using English Parallel Data to Map Tense onto Chinese and Subsequent Tense Classification
Creators: Elizabeth Baran
Contributors: Nianwen Xue (Advisor)
Awarding Institution: Brandeis University, Graduate School of Arts and Sciences; Master of Arts (MA)
Theses and Dissertations: Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences
Publisher: Brandeis University
Grant note: Brandeis University, Graduate School of Arts and Sciences
Identifiers: 10192/24533; 9923879990001921
Academic Unit: Interdepartmental Program in Linguistics and Computational Linguistics
Language: English
Resource Type: Thesis
