Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study
Qualitative data-analysis methods provide thick, rich descriptions of subjects’ thoughts, feelings, and lived experiences but may be time-consuming, labor-intensive, or prone to bias. Natural language processing (NLP) is a machine learning technique from computer science that uses algorithms to anal...
Main Authors: | William Leeson, Adam Resnick, Daniel Alexander, John Rovers |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing 2019-11-01 |
Series: | International Journal of Qualitative Methods |
Online Access: | https://doi.org/10.1177/1609406919887021 |
id |
doaj-f4f82df3e7c64e4fb3c5ae560650f806 |
---|---|
record_format |
Article |
spelling |
doaj-f4f82df3e7c64e4fb3c5ae560650f806
2020-11-25T03:36:02Z
eng
SAGE Publishing
International Journal of Qualitative Methods
1609-4069
2019-11-01
18
10.1177/1609406919887021
Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study
William Leeson (0), Adam Resnick (1), Daniel Alexander (2), John Rovers (3)
(0) College of Arts & Sciences, Drake University, Des Moines, IA, USA
(1) College of Arts & Sciences, Drake University, Des Moines, IA, USA
(2) College of Arts & Sciences, Drake University, Des Moines, IA, USA
(3) College of Pharmacy & Health Sciences, Drake University, Des Moines, IA, USA
Qualitative data-analysis methods provide thick, rich descriptions of subjects’ thoughts, feelings, and lived experiences but may be time-consuming, labor-intensive, or prone to bias. Natural language processing (NLP) is a machine learning technique from computer science that uses algorithms to analyze textual data. NLP allows processing of large amounts of data almost instantaneously. As researchers become conversant with NLP, it is becoming more frequently employed outside of computer science and shows promise as a tool to analyze qualitative data in public health. This is a proof of concept paper to evaluate the potential of NLP to analyze qualitative data. Specifically, we ask if NLP can support conventional qualitative analysis, and if so, what its role is. We compared a qualitative method of open coding with two forms of NLP, Topic Modeling and Word2Vec, to analyze transcripts from interviews conducted in rural Belize querying men about their health needs. All three methods returned a series of terms that captured ideas and concepts in subjects’ responses to interview questions. Open coding returned 5–10 words or short phrases for each question. Topic Modeling returned a series of word-probability pairs that quantified how well a word captured the topic of a response. Word2Vec returned a list of words for each interview question ordered by which words were predicted to best capture the meaning of the passage. For most interview questions, all three methods returned conceptually similar results. NLP may be a useful adjunct to qualitative analysis. NLP may be performed after data have undergone open coding as a check on the accuracy of the codes. Alternatively, researchers can perform NLP prior to open coding and use the results to guide the creation of their codebook.
https://doi.org/10.1177/1609406919887021 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
William Leeson Adam Resnick Daniel Alexander John Rovers |
author_sort |
William Leeson |
title |
Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study |
publisher |
SAGE Publishing |
series |
International Journal of Qualitative Methods |
issn |
1609-4069 |
publishDate |
2019-11-01 |
description |
Qualitative data-analysis methods provide thick, rich descriptions of subjects’ thoughts, feelings, and lived experiences but may be time-consuming, labor-intensive, or prone to bias. Natural language processing (NLP) is a machine learning technique from computer science that uses algorithms to analyze textual data. NLP allows processing of large amounts of data almost instantaneously. As researchers become conversant with NLP, it is becoming more frequently employed outside of computer science and shows promise as a tool to analyze qualitative data in public health. This is a proof of concept paper to evaluate the potential of NLP to analyze qualitative data. Specifically, we ask if NLP can support conventional qualitative analysis, and if so, what its role is. We compared a qualitative method of open coding with two forms of NLP, Topic Modeling and Word2Vec, to analyze transcripts from interviews conducted in rural Belize querying men about their health needs. All three methods returned a series of terms that captured ideas and concepts in subjects’ responses to interview questions. Open coding returned 5–10 words or short phrases for each question. Topic Modeling returned a series of word-probability pairs that quantified how well a word captured the topic of a response. Word2Vec returned a list of words for each interview question ordered by which words were predicted to best capture the meaning of the passage. For most interview questions, all three methods returned conceptually similar results. NLP may be a useful adjunct to qualitative analysis. NLP may be performed after data have undergone open coding as a check on the accuracy of the codes. Alternatively, researchers can perform NLP prior to open coding and use the results to guide the creation of their codebook. |
url |
https://doi.org/10.1177/1609406919887021 |
_version_ |
1724551712852672512 |
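
The abstract in this record describes Topic Modeling as returning word-probability pairs for each response. The sketch below illustrates what such output can look like. It is not the authors' implementation: the record does not say which algorithm or library the study used, so this assumes latent Dirichlet allocation fitted by collapsed Gibbs sampling, written with the Python standard library only. The function name `lda_gibbs`, its parameters, and the toy token lists are all hypothetical and are not drawn from the Belize interview transcripts.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns, for each topic, a list of
    (word, probability) pairs sorted from most to least probable.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables updated during sampling.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-topic word distributions; each sums to 1 over the vocabulary.
    topics = []
    for k in range(n_topics):
        pairs = [
            (w, (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta))
            for w in vocab
        ]
        topics.append(sorted(pairs, key=lambda p: p[1], reverse=True))
    return topics


# Invented toy responses, already tokenized (NOT data from the study).
docs = [
    ["clinic", "distance", "cost", "clinic", "transport"],
    ["cost", "medicine", "clinic", "distance"],
    ["work", "farm", "injury", "work", "back"],
    ["injury", "farm", "back", "pain", "work"],
]

topics = lda_gibbs(docs)
for k, pairs in enumerate(topics):
    print(f"topic {k}:", [(w, round(p, 3)) for w, p in pairs[:3]])
```

Each printed pair gives a word and its estimated probability under one topic, which is the general shape of output the abstract attributes to Topic Modeling. A researcher could compare the highest-ranked words with the open codes for the same question, in the spirit of the check the abstract proposes.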