Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study

Qualitative data-analysis methods provide thick, rich descriptions of subjects’ thoughts, feelings, and lived experiences but may be time-consuming, labor-intensive, or prone to bias. Natural language processing (NLP) is a machine learning technique from computer science that uses algorithms to anal...

Bibliographic Details
Main Authors: William Leeson, Adam Resnick, Daniel Alexander, John Rovers
Format: Article
Language: English
Published: SAGE Publishing 2019-11-01
Series: International Journal of Qualitative Methods
Online Access: https://doi.org/10.1177/1609406919887021
id doaj-f4f82df3e7c64e4fb3c5ae560650f806
record_format Article
spelling doaj-f4f82df3e7c64e4fb3c5ae560650f806
2020-11-25T03:36:02Z
eng
SAGE Publishing
International Journal of Qualitative Methods
1609-4069
2019-11-01
18
10.1177/1609406919887021
Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study
William Leeson (College of Arts & Sciences, Drake University, Des Moines, IA, USA)
Adam Resnick (College of Arts & Sciences, Drake University, Des Moines, IA, USA)
Daniel Alexander (College of Arts & Sciences, Drake University, Des Moines, IA, USA)
John Rovers (College of Pharmacy & Health Sciences, Drake University, Des Moines, IA, USA)
https://doi.org/10.1177/1609406919887021
collection DOAJ
language English
format Article
sources DOAJ
author William Leeson
Adam Resnick
Daniel Alexander
John Rovers
title Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study
publisher SAGE Publishing
series International Journal of Qualitative Methods
issn 1609-4069
publishDate 2019-11-01
description Qualitative data-analysis methods provide thick, rich descriptions of subjects’ thoughts, feelings, and lived experiences but may be time-consuming, labor-intensive, or prone to bias. Natural language processing (NLP) is a machine learning technique from computer science that uses algorithms to analyze textual data. NLP allows processing of large amounts of data almost instantaneously. As researchers become conversant with NLP, it is becoming more frequently employed outside of computer science and shows promise as a tool to analyze qualitative data in public health. This is a proof of concept paper to evaluate the potential of NLP to analyze qualitative data. Specifically, we ask if NLP can support conventional qualitative analysis, and if so, what its role is. We compared a qualitative method of open coding with two forms of NLP, Topic Modeling, and Word2Vec to analyze transcripts from interviews conducted in rural Belize querying men about their health needs. All three methods returned a series of terms that captured ideas and concepts in subjects’ responses to interview questions. Open coding returned 5–10 words or short phrases for each question. Topic Modeling returned a series of word-probability pairs that quantified how well a word captured the topic of a response. Word2Vec returned a list of words for each interview question ordered by which words were predicted to best capture the meaning of the passage. For most interview questions, all three methods returned conceptually similar results. NLP may be a useful adjunct to qualitative analysis. NLP may be performed after data have undergone open coding as a check on the accuracy of the codes. Alternatively, researchers can perform NLP prior to open coding and use the results to guide their creation of their codebook.
url https://doi.org/10.1177/1609406919887021