Classification of Health-Related Social Media Posts: Evaluation of Post Content–Classifier Models and Analysis of User Demographics

Background: The increasing volume of health-related social media activity, where users connect, collaborate, and engage, has increased the significance of analyzing how people use health-related social media. Objective: The aim of this study was to classify the conten...

Bibliographic Details
Main Authors: Rivas, Ryan, Sadah, Shouq A, Guo, Yuhang, Hristidis, Vagelis
Format: Article
Language: English
Published: JMIR Publications 2020-04-01
Series: JMIR Public Health and Surveillance
Online Access: https://publichealth.jmir.org/2020/2/e14952
id doaj-391694f5e8ad4b41bd635d1a07c4d3fe
record_format Article
spelling doaj-391694f5e8ad4b41bd635d1a07c4d3fe; 2021-05-03T02:53:41Z; eng; JMIR Publications; JMIR Public Health and Surveillance; 2369-2960; 2020-04-01; 6; 2; e14952; 10.2196/14952; Classification of Health-Related Social Media Posts: Evaluation of Post Content–Classifier Models and Analysis of User Demographics; Rivas, Ryan; Sadah, Shouq A; Guo, Yuhang; Hristidis, Vagelis; https://publichealth.jmir.org/2020/2/e14952
collection DOAJ
language English
format Article
sources DOAJ
author Rivas, Ryan
Sadah, Shouq A
Guo, Yuhang
Hristidis, Vagelis
author_sort Rivas, Ryan
title Classification of Health-Related Social Media Posts: Evaluation of Post Content–Classifier Models and Analysis of User Demographics
publisher JMIR Publications
series JMIR Public Health and Surveillance
issn 2369-2960
publishDate 2020-04-01
description Background: The increasing volume of health-related social media activity, where users connect, collaborate, and engage, has increased the significance of analyzing how people use health-related social media. Objective: The aim of this study was to classify the content (eg, posts that share experiences and seek support) of users who write health-related social media posts and study the effect of user demographics on post content. Methods: We analyzed two different types of health-related social media: (1) health-related online forums (WebMD and DailyStrength) and (2) general online social networks (Twitter and Google+). We identified several categories of post content and built classifiers to automatically detect these categories. These classifiers were used to study the distribution of categories for various demographic groups. Results: We achieved an accuracy of at least 84% and a balanced accuracy of at least 0.81 for half of the post content categories in our experiments. In addition, 70.04% (4741/6769) of posts by male WebMD users asked for advice, and male users’ WebMD posts were more likely to ask for medical advice than female users’ posts. The majority of posts on DailyStrength shared experiences, regardless of the gender, age group, or location of their authors. Furthermore, health-related posts on Twitter and Google+ were used to share experiences less frequently than posts on WebMD and DailyStrength. Conclusions: We studied and analyzed the content of health-related social media posts. Our results can guide health advocates and researchers to better target patient populations based on the application type. Given a research question or an outreach goal, our results can be used to choose the best online forums to answer the question or disseminate a message.
url https://publichealth.jmir.org/2020/2/e14952