Identifying Technology Terms in Document Text

Olga Cherenina

doi:10.48617/etd.925

Thesis

Open access

Brandeis University

Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences

2013

DOI:

https://doi.org/10.48617/etd.925

Handle:

https://hdl.handle.net/10192/24530

Abstract

identifying technology terms

emergence

Finding technologies in the text of patents or other documents, such as medical articles, is a subtask of building a technology ontology. Building such an ontology was proposed by Brandeis scholars as part of a project aimed at classifying patents based on a notion of the availability of technologies relevant to the patent. A technology ontology is a database of technologies evaluated by their availability within a certain time frame, that is, their maturity. Identifying technology terms in the text of documents is an initial step necessary for building an ontology. The terms found in the text of a patent reflect the notion of a technology and constitute the basis for identifying technology maturity.

Here, we explore the efficiency of using natural language processing techniques to help identify technologies in patent text. We attempt to create and use a matcher that relies on lexical and syntactic features to look for technology terms. We address the problem of determining the concept of a technology, which is important for the task, and use an annotation to evaluate the matcher. Finally, we analyze the results and propose improvements to the system.

Files and links (1)

ChereninaThesis2013 (PDF, 930.07 kB, Open Access)

Details

Title: Identifying Technology Terms in Document Text
Creators: Olga Cherenina
Contributors: James Pustejovsky (Advisor)
Awarding Institution: Brandeis University, Graduate School of Arts and Sciences; Master of Arts (MA)
Theses and Dissertations: Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences
Publisher: Brandeis University
Grant note: Brandeis University, Graduate School of Arts and Sciences
Identifiers: 10192/24530; 9923879965501921
Academic Unit: Michtom School of Computer Science
Language: English
Resource Type: Thesis
