Combining Classifiers for Chinese Word Segmentation

Nianwen Xue; Susan P Converse

Conference paper

Open access

SIGHAN '02: the first SIGHAN workshop on Chinese language processing, 1 (Taipei, Taiwan, 09/01/2002)

Subjects: Segmentation (Linguistics); Chinese Language or Literature; Computational Linguistics; Machine Learning

Abstract

In this paper we report results of a supervised machine-learning approach to Chinese word segmentation. First, a maximum entropy tagger is trained on manually annotated data to automatically label each character with a tag that indicates its position within a word. An error-driven transformation-based tagger is then trained to clean up the tagging inconsistencies of the first tagger. The tagged output is then converted into segmented text. The preliminary results show that this approach is competitive with other supervised machine-learning segmenters reported in previous studies.
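The conversion between segmented text and per-character position tags can be illustrated with a minimal sketch. The abstract does not state the tag inventory, so the four tags used here (B, M, E, S) and the function names `words_to_tags` and `tags_to_words` are assumptions made for illustration only, not the authors' implementation. The sketch covers just the two conversion steps (annotated words to tags for training, and tagged characters back to words); neither the maximum entropy tagger nor the transformation-based tagger is reproduced.

```python
from typing import List, Tuple

# Hypothetical tag set (an assumption, not taken from the abstract):
#   B = first character of a multi-character word
#   M = character in the middle of a word
#   E = last character of a multi-character word
#   S = character that forms a word on its own


def words_to_tags(words: List[str]) -> List[Tuple[str, str]]:
    """Turn a manually segmented sentence into (character, tag) pairs."""
    tagged = []
    for word in words:
        if len(word) == 1:
            tagged.append((word, "S"))
            continue
        for i, char in enumerate(word):
            if i == 0:
                tag = "B"
            elif i == len(word) - 1:
                tag = "E"
            else:
                tag = "M"
            tagged.append((char, tag))
    return tagged


def tags_to_words(tagged: List[Tuple[str, str]]) -> List[str]:
    """Turn (character, tag) pairs back into a list of words.

    Inconsistent tag sequences (for example B followed by B) are
    tolerated: a word-initial tag always closes any word still open.
    """
    words = []
    current = ""
    for char, tag in tagged:
        if tag in ("B", "S") and current:
            # A new word starts, so flush the unfinished one.
            words.append(current)
            current = ""
        current += char
        if tag in ("E", "S"):
            # The word ends at this character.
            words.append(current)
            current = ""
    if current:
        # Flush a trailing word that never received an end tag.
        words.append(current)
    return words


if __name__ == "__main__":
    sentence = ["上海", "计划", "发展", "了"]
    tagged = words_to_tags(sentence)
    print(tagged)
    print(tags_to_words(tagged))
```

In this sketch, `tags_to_words` recovers from inconsistent tag sequences by starting a new word whenever a word-initial tag appears. This is only a simple fallback and does not stand in for the error-driven cleanup step the abstract describes.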

Details

Title: Combining Classifiers for Chinese Word Segmentation
Creators: Nianwen Xue (Author) - Brandeis University, Michtom School of Computer Science; Susan P Converse (Author) - University of Pennsylvania
Conference: SIGHAN '02: the first SIGHAN workshop on Chinese language processing, 1 (Taipei, Taiwan, 09/01/2002)
Number of pages: 7
Identifiers: 9924148851201921
Academic Unit: Benjamin and Mae Volen National Center for Complex Systems; Interdepartmental Program in Linguistics and Computational Linguistics; Michtom School of Computer Science
Language: Chinese; English
Resource Type: Conference paper
