Identification and characterization of tweets related to the 2015 Indiana HIV outbreak: A retrospective infoveillance study.

Introduction: From late 2014 through 2015, Scott County, Indiana faced an HIV outbreak triggered by opioid abuse and transition to injection drug use. Investigating the origins, risk factors, and responses related to this outbreak is critical to inform future surveillance, interve...

Bibliographic Details
Main Authors: Mingxiang Cai, Neal Shah, Jiawei Li, Wen-Hao Chen, Raphael E Cuomo, Nick Obradovich, Tim K Mackey
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0235150
id doaj-635646149ca14c3186429a373cb17684
record_format Article
spelling doaj-635646149ca14c3186429a373cb17684; 2021-03-04T12:46:40Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2020-01-01; 15; 8; e0235150; 10.1371/journal.pone.0235150; Identification and characterization of tweets related to the 2015 Indiana HIV outbreak: A retrospective infoveillance study.; Mingxiang Cai; Neal Shah; Jiawei Li; Wen-Hao Chen; Raphael E Cuomo; Nick Obradovich; Tim K Mackey; https://doi.org/10.1371/journal.pone.0235150
collection DOAJ
language English
format Article
sources DOAJ
author Mingxiang Cai
Neal Shah
Jiawei Li
Wen-Hao Chen
Raphael E Cuomo
Nick Obradovich
Tim K Mackey
title Identification and characterization of tweets related to the 2015 Indiana HIV outbreak: A retrospective infoveillance study.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2020-01-01
description Introduction: From late 2014 through 2015, Scott County, Indiana faced an HIV outbreak triggered by opioid abuse and transition to injection drug use. Investigating the origins, risk factors, and responses related to this outbreak is critical to inform future surveillance, interventions, and policymaking. In response, this retrospective infoveillance study uses natural language processing (NLP) to identify and characterize user-generated messages related to opioid abuse, heroin injection drug use, and HIV status among Twitter users in Indiana during the period of this HIV outbreak.
Materials and methods: Our study consisted of two phases: data collection and processing, and data analysis. Using Amazon Web Services EC2 instances, we collected tweets from the public Twitter API, filtered for messages geocoded to Indiana in the periods immediately before and after the outbreak. In the data analysis phase, we applied an unsupervised NLP machine learning approach, the Biterm Topic Model (BTM), to identify tweets related to opioid, heroin/injection, and HIV behavior, and then examined these messages for HIV risk-related topics that could be associated with the outbreak.
Results: More than 10 million geocoded tweets posted in Indiana in the periods immediately before and after the outbreak were collected for analysis. Using BTM, we identified 1350 tweets thought to be relevant to the outbreak, of which 358 were confirmed by human annotation. The most prevalent themes were self-reported abuse of illicit and prescription drugs, opioid use disorder, self-reported HIV status, and public sentiment regarding the outbreak. Geospatial analysis found that these messages clustered in population-dense areas outside the outbreak area, including Indianapolis and neighboring Clark County.
Discussion: This infoveillance study characterized the social media conversations of communities in Indiana before and after the 2015 HIV outbreak. The behavioral themes detected reflect discussion of risk factors for HIV transmission stemming from opioid and heroin abuse among priority populations. They also help identify community attitudes that could have motivated or detracted from the use of HIV prevention methods, as well as factors that can impede access to prevention services.
Conclusions: Infoveillance approaches, such as the analysis conducted in this study, represent a possible strategy for detecting a "signal" of the emergence of risk factors associated with an outbreak, though they may be limited in scope and generalizability. Our results, in conjunction with other forms of public health surveillance, can leverage the growing ubiquity of social media platforms to better detect opioid-related HIV risk knowledge, attitudes, and behavior, as well as inform future prevention efforts.
url https://doi.org/10.1371/journal.pone.0235150