Recognizing Meaning in the Crowd: Building Word Sense Inventories on Amazon Mechanical Turk

Nicholas Botchan

doi:10.48617/etd.772

Thesis

Open access

Brandeis University

Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences

2012

DOI: https://doi.org/10.48617/etd.772

Handle: https://hdl.handle.net/10192/50

Abstract

Keywords: crowdsourcing; WSD; word sense; inventory

This thesis explores different strategies for constructing robust, inexpensive, and empirically derived word sense inventories and the corresponding sense-annotated corpus. All strategies explored rely on non-expert linguistic annotations collected through the Amazon Mechanical Turk crowdsourcing marketplace. Experiments using implementation strategies with different quality control mechanisms are reported in detail. Extensive system testing yielded multiple best practices that are required to obtain high-quality data given the challenge of using non-expert annotations. Results indicate that it is possible to obtain sense inventories that correlate with the gold standard and extend it in ways that may prove useful in a variety of other Natural Language Processing tasks.

Details

Title: Recognizing Meaning in the Crowd: Building Word Sense Inventories on Amazon Mechanical Turk
Creators: Nicholas Botchan
Contributors: Anna Rumshisky (Advisor)
Awarding Institution: Brandeis University, Graduate School of Arts and Sciences; Master of Arts (MA)
Theses and Dissertations: Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences
Publisher: Brandeis University
Grant note: Brandeis University, Graduate School of Arts and Sciences
Identifiers: 10192/50; 9923879934201921
Academic Unit: Interdepartmental Program in Linguistics and Computational Linguistics
Language: English
Resource Type: Thesis
