Statistical Text Analysis for Social Science

What can text corpora tell us about society? How can automatic text analysis algorithms efficiently and reliably analyze the social processes revealed in language production? This work develops statistical text analyses of dynamic social and news media datasets to extract indicators of underlying so...

Bibliographic Details
Main Author:	O'Connor, Brendan T.
Format:	Others
Published:	Research Showcase @ CMU 2014
Subjects:	computational social science; natural language processing; text mining; quantitative text analysis; machine learning; probabilistic graphical models
Online Access:	http://repository.cmu.edu/dissertations/541 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1574&context=dissertations

id	ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-1574
record_format	oai_dc
spelling	ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-15742015-11-13T03:24:45Z Statistical Text Analysis for Social Science O'Connor, Brendan T. What can text corpora tell us about society? How can automatic text analysis algorithms efficiently and reliably analyze the social processes revealed in language production? This work develops statistical text analyses of dynamic social and news media datasets to extract indicators of underlying social phenomena, and to reveal how social factors guide linguistic production. This is illustrated through three case studies: first, examining whether sentiment expressed in social media can track opinion polls on economic and political topics (Chapter 3); second, analyzing how novel online slang terms can be very specific to geographic and demographic communities, and how these social factors affect their transmission over time (Chapters 4 and 5); and third, automatically extracting political events from news articles, to assist analyses of the interactions of international actors over time (Chapter 6). We demonstrate a variety of computational, linguistic, and statistical tools that are employed for these analyses, and also contribute MiTextExplorer, an interactive system for exploratory analysis of text data against document covariates, whose design was informed by the experience of researching these and other similar works (Chapter 2). These case studies illustrate recurring themes toward developing text analysis as a social science methodology: computational and statistical complexity, and domain knowledge and linguistic assumptions. 2014-08-01T07:00:00Z text application/pdf http://repository.cmu.edu/dissertations/541 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1574&context=dissertations Dissertations Research Showcase @ CMU computational social science natural language processing text mining quantitative text analysis machine learning probabilistic graphical models
collection	NDLTD
format	Others
sources	NDLTD
topic	computational social science; natural language processing; text mining; quantitative text analysis; machine learning; probabilistic graphical models
description	What can text corpora tell us about society? How can automatic text analysis algorithms efficiently and reliably analyze the social processes revealed in language production? This work develops statistical text analyses of dynamic social and news media datasets to extract indicators of underlying social phenomena, and to reveal how social factors guide linguistic production. This is illustrated through three case studies: first, examining whether sentiment expressed in social media can track opinion polls on economic and political topics (Chapter 3); second, analyzing how novel online slang terms can be very specific to geographic and demographic communities, and how these social factors affect their transmission over time (Chapters 4 and 5); and third, automatically extracting political events from news articles, to assist analyses of the interactions of international actors over time (Chapter 6). We demonstrate a variety of computational, linguistic, and statistical tools that are employed for these analyses, and also contribute MiTextExplorer, an interactive system for exploratory analysis of text data against document covariates, whose design was informed by the experience of researching these and other similar works (Chapter 2). These case studies illustrate recurring themes toward developing text analysis as a social science methodology: computational and statistical complexity, and domain knowledge and linguistic assumptions.
author	O'Connor, Brendan T.
title	Statistical Text Analysis for Social Science
title_sort	statistical text analysis for social science
publisher	Research Showcase @ CMU
publishDate	2014
url	http://repository.cmu.edu/dissertations/541 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1574&context=dissertations

Similar Items