Evaluating Sampling Methods for Content Analysis of Twitter Data

Despite the existing evaluation of the sampling options for periodical media content, only a few empirical studies have examined whether probability sampling methods can be applicable to social media content other than simple random sampling. This article tests the efficiency of simple random sampli...


Bibliographic Details
Main Authors: Hwalbin Kim, S. Mo Jang, Sei-Hill Kim, Anan Wan
Format: Article
Language: English
Published: SAGE Publishing 2018-04-01
Series: Social Media + Society
Online Access: https://doi.org/10.1177/2056305118772836
Record ID: doaj-de8de83b9f3e4cb4aee3bf972d5f5968
Collection: DOAJ
ISSN: 2056-3051
Volume: 4
Author Affiliations: Hwalbin Kim (Hallym University, Republic of Korea); S. Mo Jang, Sei-Hill Kim, and Anan Wan (University of South Carolina, USA)

Description: Despite the existing evaluation of the sampling options for periodical media content, only a few empirical studies have examined whether probability sampling methods can be applicable to social media content other than simple random sampling. This article tests the efficiency of simple random sampling and constructed week sampling, by varying the sample size of Twitter content related to the 2014 South Carolina gubernatorial election. We examine how many weeks were needed to adequately represent 5 months of tweets. Our findings show that a simple random sampling is more efficient than a constructed week sampling in terms of obtaining a more efficient and representative sample of Twitter data. This study also suggests that it is necessary to produce a sufficient sample size when analyzing social media content.
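
The description contrasts simple random sampling with constructed week sampling. As a rough illustration only, and not the authors' procedure, the two approaches might be sketched in Python as follows. The sketch assumes each tweet is a hypothetical `(date, text)` tuple; the function names and the `n_weeks` parameter are illustrative and do not come from the article.

```python
import random
from collections import defaultdict
from datetime import date, timedelta


def simple_random_sample(items, k, rng):
    """Draw k items uniformly at random, without replacement."""
    return rng.sample(items, k)


def constructed_week_dates(start, end, n_weeks, rng):
    """Pick n_weeks random dates for each weekday (Mon..Sun) in [start, end].

    Each set of seven dates, one per weekday, forms one constructed week.
    """
    by_weekday = defaultdict(list)
    day = start
    while day <= end:
        by_weekday[day.weekday()].append(day)
        day += timedelta(days=1)
    chosen = []
    for weekday in range(7):
        # Sample without replacement so no date is picked twice.
        chosen.extend(rng.sample(by_weekday[weekday], n_weeks))
    return sorted(chosen)


def constructed_week_sample(items, start, end, n_weeks, rng):
    """Keep every (date, text) item posted on one of the selected dates."""
    dates = set(constructed_week_dates(start, end, n_weeks, rng))
    return [item for item in items if item[0] in dates]
```

In this sketch, a simple random sample fixes the number of tweets directly, whereas a constructed week sample fixes the number of days and takes whatever was posted on them, so its size depends on daily posting volume.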