Applying Machine Learning to Usage of Aspect Markers in Chinese Text

Russell Entrikin

Thesis

Open access

Brandeis University

Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences

2012

Handle:

https://hdl.handle.net/10192/74

Abstract

Keywords: Chinese; Mandarin; aspect markers; Machine Learning

One of the most difficult issues for learners of Chinese is understanding the way temporal information is marked in discourse. Like Indo-European languages, Chinese makes use of explicit temporal expressions, temporal adverbs, ordering of words and verb phrases, and pragmatics to communicate temporal relationships. However, Chinese lacks temporal inflection on verbs. An aspect marker may follow a verb, but crucially, these markers are considered optional in many contexts, and usage differs in different domains (e.g. written language, spoken language, official broadcast news). While all finite verbs in English are temporally marked in some way, the majority of verbs in most discourse will not be marked aspectually in Chinese. This lack of positive examples makes it extremely hard for learners to understand when aspect markers are licensed.

Here, we explore the viability of using corpus linguistics techniques to create a sort of “discourse grammar”-checker for Chinese text which learners of Chinese can use to find errors in their own usage of aspect markers. We use a corpus-based machine learning approach to train a classifier on the usage of aspect markers and attempt to use this classifier to correctly posit aspect markers in unseen text. We discuss the capabilities and limits of our system, and how the optional and subjective nature of the placement of aspect markers blurs the notion of hits vs. false positives vs. false negatives, making evaluation difficult. We also sketch an annotation schema which would support a Chinese discourse-based aspect marker checking tool.

Files and links (1)

russellentrikin-MA (PDF, 2.70 MB, Open Access)

Details

Title: Applying Machine Learning to Usage of Aspect Markers in Chinese Text
Creators: Russell Entrikin
Contributors: James Pustejovsky (Advisor)
Awarding Institution: Brandeis University, Graduate School of Arts and Sciences; Master of Arts (MA)
Theses and Dissertations: Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences
Publisher: Brandeis University
Grant note: Brandeis University, Graduate School of Arts and Sciences
Identifiers: 10192/74; 9923880086901921
Academic Unit: Michtom School of Computer Science
Language: English
Resource Type: Thesis
