An exploration of text mining of narrative reports of injury incidents to assess risk
A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme d...
Main Authors: | Passmore David, Chae Chungil, Kustikova Yulia, Baker Rose, Yim Jeong-Ha |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2018-01-01 |
Series: | MATEC Web of Conferences |
Online Access: | https://doi.org/10.1051/matecconf/201825106020 |
id |
doaj-8553ed17432342e2aeedfc77c8101999 |
---|---|
record_format |
Article |
spelling |
doaj-8553ed17432342e2aeedfc77c8101999; 2021-02-02T00:20:25Z; eng; EDP Sciences; MATEC Web of Conferences; 2261-236X; 2018-01-01; 251; 06020; 10.1051/matecconf/201825106020; matecconf_ipicse2018_06020; An exploration of text mining of narrative reports of injury incidents to assess risk; Authors: Passmore David [0], Chae Chungil [1], Kustikova Yulia [2], Baker Rose [3], Yim Jeong-Ha [4]; Affiliations: [0] Penn State University, Workforce Education and Development; [1] Penn State University, Applied Cognitive Science Lab; [2] Federal State Educational Institution of Higher Education, National Research Moscow State University of Civil Engineering; [3] University of North Texas, Learning Technologies; [4] University of Georgia, Lifelong Education, Administration, and Policy; A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme describing primarily injury incidents resulting in strains and sprains of musculoskeletal systems, revealed differences in topic emphasis by the location of the mine property at which injuries occurred, the degree of injury, and the year of injury occurrence. Text narratives clustered around this topic refer most frequently to surface or other locations rather than underground locations that resulted in disability and that, also, increased secularly over time. The modeling success enjoyed in this exploratory effort suggests that additional topic mining of these injury text narratives is justified, especially using a broad set of covariates to explain variations in topic emphasis and for comparison of surface mining injuries with injuries occurring during site preparation for construction.; https://doi.org/10.1051/matecconf/201825106020 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Passmore David Chae Chungil Kustikova Yulia Baker Rose Yim Jeong-Ha |
title |
An exploration of text mining of narrative reports of injury incidents to assess risk |
publisher |
EDP Sciences |
series |
MATEC Web of Conferences |
issn |
2261-236X |
publishDate |
2018-01-01 |
description |
A topic model was explored using unsupervised machine learning to summarize free-text narrative reports of 77,215 injuries that occurred in coal mines in the USA between 2000 and 2015. Latent Dirichlet Allocation modeling processes identified six topics from the free-text data. One topic, a theme describing primarily injury incidents resulting in strains and sprains of musculoskeletal systems, revealed differences in topic emphasis by the location of the mine property at which injuries occurred, the degree of injury, and the year of injury occurrence. Text narratives clustered around this topic refer most frequently to surface or other locations rather than underground locations that resulted in disability and that, also, increased secularly over time. The modeling success enjoyed in this exploratory effort suggests that additional topic mining of these injury text narratives is justified, especially using a broad set of covariates to explain variations in topic emphasis and for comparison of surface mining injuries with injuries occurring during site preparation for construction. |
url |
https://doi.org/10.1051/matecconf/201825106020 |
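
The Latent Dirichlet Allocation step named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which software, sampler, or hyperparameters were used. The collapsed Gibbs sampler, the toy narratives, the stopword list, and the values of `alpha`, `beta`, and `n_iter` below are all assumptions made for illustration; only the choice of six topics comes from the record.

```python
import random

# Hypothetical toy narratives; the actual injury reports are not reproduced here.
NARRATIVES = [
    "employee strained back lifting cable",
    "employee sprained ankle stepping off loader",
    "rock fell from roof and struck miner on hand",
    "miner strained shoulder pulling cable",
    "piece of roof rock fell striking employee on arm",
    "employee twisted knee stepping down from truck",
]
# Assumed stopword list, chosen only to suit the toy narratives.
STOPWORDS = {"and", "from", "of", "off", "on", "the", "down"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def fit_lda(docs, n_topics=6, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc_topic counts, topic_word counts, vocabulary).
    alpha, beta, and n_iter are assumed values, not taken from the record.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]              # tokens per (doc, topic)
    topic_word = [[0] * n_vocab for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                            # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def topic_proportions(doc_topic, alpha=0.1):
    """Smoothed per-document topic shares; each row sums to 1."""
    n_topics = len(doc_topic[0])
    return [
        [(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
        for row in doc_topic
    ]


docs = [tokenize(text) for text in NARRATIVES]
doc_topic, topic_word, vocab = fit_lda(docs, n_topics=6)
theta = topic_proportions(doc_topic)

# Print the three highest-count words of each topic.
for k, counts in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda v: -counts[v])[:3]
    print(k, [vocab[v] for v in top])
```

The rows of `theta` are per-narrative topic shares. Differences in topic emphasis by mine location, degree of injury, or year, as the description reports, would be examined by comparing such shares across groups of narratives; that comparison is not shown here.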