A corpus for mining drug-related knowledge from Twitter chatter: Language models and their utilities

In this data article, we present to the data science, natural language processing, and public health communities an unlabeled corpus and a set of language models. We collected the data from Twitter using drug names as keywords, including their common misspelled forms. Using this data, which is rich in...

Bibliographic Details
Main Authors: Abeed Sarker, Graciela Gonzalez
Format: Article
Language: English
Published: Elsevier 2017-02-01
Series: Data in Brief
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340916307168