Identifying Adverse Drug Events in Twitter Data Using Semi-Supervised Bootstrapped Lexicons

Eric Benzschawel

doi:10.48617/etd.920

Thesis

Open access

Brandeis University

Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences

2016

DOI: https://doi.org/10.48617/etd.920

Handle: https://hdl.handle.net/10192/32253

Keywords: natural language processing; social media; twitter; clinical NLP; pharmacovigilance; bootstrapping; lexicon-based techniques; ADE; ADR; adverse effect

Abstract

Adverse drug event (ADE) detection serves as a primary quality-of-care benchmark in healthcare and plays a major role in pharmacovigilance. Social media data represent a largely untapped source of public clinical narratives that can be used to expand existing ADE tracking systems. Existing studies focus on annotating small amounts of data to handle non-standard language usage. This study presents a new application of semi-supervised lexicon bootstrapping to flag Twitter data for potential ADEs. To do this, a new corpus, sixteen times larger than the current largest publicly available dataset, was constructed and used to generate robust bootstrapped drug and medical event lexicons. These lexicons were applied to held-out data to flag tweets containing potential ADEs. Compared to recent studies of lexicon-based ADE detection in Twitter, this method achieved competitive F1 scores and offers a robust evaluation capable of identifying severe ADEs in the social media sphere, representing important new data points relevant to existing pharmacovigilance systems.
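
The abstract does not spell out the bootstrapping procedure, so the following is only a minimal sketch of the general idea, assuming a simple context-pattern approach: start from a few seed terms, collect the word contexts they appear in, and add unseen words that fill the same contexts. The grown drug and event lexicons are then used to flag held-out tweets that mention both. The function names (`tokenize`, `contexts`, `bootstrap`, `flag`), the seed terms, the parameters, and the toy tweets are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip common punctuation."""
    return [t.strip(".,!?#@").lower() for t in text.split() if t.strip(".,!?#@")]


def contexts(tokens):
    """Yield (left word, term, right word) for every token, padding the edges."""
    padded = ["<s>"] + tokens + ["</s>"]
    for i in range(1, len(padded) - 1):
        yield padded[i - 1], padded[i], padded[i + 1]


def bootstrap(corpus, seeds, rounds=2, per_round=2):
    """Grow a lexicon from seed terms using shared (left, right) contexts.

    Assumption: a real system would score patterns more carefully to limit
    semantic drift; this sketch only counts pattern matches.
    """
    lexicon = set(seeds)
    tokenized = [tokenize(t) for t in corpus]
    for _ in range(rounds):
        # 1. Collect the contexts in which known lexicon terms occur.
        patterns = Counter()
        for tokens in tokenized:
            for left, term, right in contexts(tokens):
                if term in lexicon:
                    patterns[(left, right)] += 1
        # 2. Score unknown words by how often they fill those contexts.
        candidates = Counter()
        for tokens in tokenized:
            for left, term, right in contexts(tokens):
                if term not in lexicon and (left, right) in patterns:
                    candidates[term] += patterns[(left, right)]
        if not candidates:
            break
        # 3. Promote the best-scoring candidates into the lexicon.
        lexicon.update(term for term, _ in candidates.most_common(per_round))
    return lexicon


def flag(tweet, drugs, events):
    """Flag a tweet as a potential ADE if it mentions a drug and an event."""
    tokens = set(tokenize(tweet))
    return bool(tokens & drugs) and bool(tokens & events)


# Hypothetical unlabeled tweets used to grow the lexicons.
unlabeled = [
    "took ibuprofen and now nausea all day",
    "took tylenol and now headache all day",
    "took advil and now dizziness all day",
    "coffee and donuts all day",
]

drugs = bootstrap(unlabeled, seeds={"ibuprofen"})
events = bootstrap(unlabeled, seeds={"nausea"})

# Hypothetical held-out tweets.
print(flag("advil gave me a headache", drugs, events))
print(flag("headache from studying all night", drugs, events))
```

On this toy data the drug lexicon grows from `ibuprofen` to include `tylenol` and `advil`, and the event lexicon grows from `nausea` to include `headache` and `dizziness`. Requiring both a drug and an event term is a deliberately coarse flagging rule: it favors recall, and flagged tweets would still need review to confirm an actual adverse event.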

Files and links (1)

BenzschawelThesis2016 (PDF, 26.95 MB, Open Access)

Metrics

417 File views/downloads

75 Record Views

Details

Title: Identifying Adverse Drug Events in Twitter Data Using Semi-Supervised Bootstrapped Lexicons
Creators: Eric Benzschawel
Contributors: Nianwen Xue (Advisor)
Awarding Institution: Brandeis University, Graduate School of Arts and Sciences; Master of Arts (MA)
Theses and Dissertations: Master of Arts (MA), Brandeis University, Graduate School of Arts and Sciences
Publisher: Brandeis University
Grant note: Brandeis University, Graduate School of Arts and Sciences
Identifiers: 10192/32253; 9923879964601921
Academic Unit: Interdepartmental Program in Linguistics and Computational Linguistics
Language: English
Resource Type: Thesis
