An exploration of text mining of narrative reports of injury incidents to assess risk

A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme d...

Bibliographic Details
Main Authors: Passmore David, Chae Chungil, Kustikova Yulia, Baker Rose, Yim Jeong-Ha
Format: Article
Language: English
Published: EDP Sciences 2018-01-01
Series: MATEC Web of Conferences
Online Access: https://doi.org/10.1051/matecconf/201825106020
id doaj-8553ed17432342e2aeedfc77c8101999
record_format Article
spelling doaj-8553ed17432342e2aeedfc77c8101999
2021-02-02T00:20:25Z
eng
EDP Sciences
MATEC Web of Conferences
2261-236X
2018-01-01
251
06020
10.1051/matecconf/201825106020
matecconf_ipicse2018_06020
An exploration of text mining of narrative reports of injury incidents to assess risk
Passmore David [0]
Chae Chungil [1]
Kustikova Yulia [2]
Baker Rose [3]
Yim Jeong-Ha [4]
[0] Penn State University, Workforce Education and Development
[1] Penn State University, Applied Cognitive Science Lab
[2] Federal State Educational Institution of Higher Education, National Research Moscow State University of Civil Engineering
[3] University of North Texas, Learning Technologies
[4] University of Georgia, Lifelong Education, Administration, and Policy
A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme describing primarily injury incidents resulting in strains and sprains of musculoskeletal systems, revealed differences in topic emphasis by the location of the mine property at which injuries occurred, the degree of injury, and the year of injury occurrence. Text narratives clustered around this topic refer most frequently to surface or other locations rather than underground locations that resulted in disability and that, also, increased secularly over time. The modeling success enjoyed in this exploratory effort suggests that additional topic mining of these injury text narratives is justified, especially using a broad set of covariates to explain variations in topic emphasis and for comparison of surface mining injuries with injuries occurring during site preparation for construction.
https://doi.org/10.1051/matecconf/201825106020
collection DOAJ
language English
format Article
sources DOAJ
author Passmore David
Chae Chungil
Kustikova Yulia
Baker Rose
Yim Jeong-Ha
author_sort Passmore David
title An exploration of text mining of narrative reports of injury incidents to assess risk
title_sort exploration of text mining of narrative reports of injury incidents to assess risk
publisher EDP Sciences
series MATEC Web of Conferences
issn 2261-236X
publishDate 2018-01-01
description A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme describing primarily injury incidents resulting in strains and sprains of musculoskeletal systems, revealed differences in topic emphasis by the location of the mine property at which injuries occurred, the degree of injury, and the year of injury occurrence. Text narratives clustered around this topic refer most frequently to surface or other locations rather than underground locations that resulted in disability and that, also, increased secularly over time. The modeling success enjoyed in this exploratory effort suggests that additional topic mining of these injury text narratives is justified, especially using a broad set of covariates to explain variations in topic emphasis and for comparison of surface mining injuries with injuries occurring during site preparation for construction.
url https://doi.org/10.1051/matecconf/201825106020