Effectiveness of Natural Language Processing Based Machine Learning in Analyzing Incident Narratives at a Mine

To achieve the goal of preventing serious injuries and fatalities, it is important for a mine site to analyze site-specific mine safety data. The advances in natural language processing (NLP) create an opportunity to develop machine learning (ML) tools to automate analysis of mine health and safety...

Bibliographic Details
Main Authors:	Rajive Ganguli, Preston Miller, Rambabu Pothina
Format:	Article
Language:	English
Published:	MDPI AG 2021-07-01
Series:	Minerals
Subjects:	mine safety and health; accidents; narratives; machine learning; natural language processing; random forest classification
Online Access:	https://www.mdpi.com/2075-163X/11/7/776

id	doaj-76e0743f90e6437098554817071a21b5
record_format	Article
spelling	doaj-76e0743f90e6437098554817071a21b5 | 2021-07-23T13:56:01Z | eng | MDPI AG | Minerals | 2075-163X | 2021-07-01 | 11 | 7 | 776 | 776 | 10.3390/min11070776 | Effectiveness of Natural Language Processing Based Machine Learning in Analyzing Incident Narratives at a Mine | Rajive Ganguli (0), Preston Miller (1), Rambabu Pothina (2) | (0) Department of Mining Engineering, University of Utah, Salt Lake City, UT 84112, USA; (1) Teck Red Dog Operations, Anchorage, AK 99503, USA; (2) Department of Mining Engineering, University of Utah, Salt Lake City, UT 84112, USA | To achieve the goal of preventing serious injuries and fatalities, it is important for a mine site to analyze site-specific mine safety data. The advances in natural language processing (NLP) create an opportunity to develop machine learning (ML) tools to automate analysis of mine health and safety management systems (HSMS) data without requiring experts at every mine site. As a demonstration, nine random forest (RF) models were developed to classify narratives from the Mine Safety and Health Administration (MSHA) database into nine accident types. MSHA accident categories are quite descriptive and are, thus, a proxy for high-level understanding of the incidents. A single model developed to classify narratives into a single category was more effective than a single model that classified narratives into different categories. The developed models were then applied to narratives taken from a mine HSMS (non-MSHA), to classify them into MSHA accident categories. About two-thirds of the non-MSHA narratives were automatically classified by the RF models. The automatically classified narratives were then evaluated manually. The evaluation showed an accuracy of 96% for automated classifications. The near-perfect classification of non-MSHA narratives by MSHA-based machine learning models demonstrates that NLP can be a powerful tool to analyze HSMS data. | https://www.mdpi.com/2075-163X/11/7/776 | mine safety and health; accidents; narratives; machine learning; natural language processing; random forest classification
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Rajive Ganguli, Preston Miller, Rambabu Pothina
author_facet	Rajive Ganguli, Preston Miller, Rambabu Pothina
author_sort	Rajive Ganguli
title	Effectiveness of Natural Language Processing Based Machine Learning in Analyzing Incident Narratives at a Mine
title_sort	effectiveness of natural language processing based machine learning in analyzing incident narratives at a mine
publisher	MDPI AG
series	Minerals
issn	2075-163X
publishDate	2021-07-01
description	To achieve the goal of preventing serious injuries and fatalities, it is important for a mine site to analyze site-specific mine safety data. The advances in natural language processing (NLP) create an opportunity to develop machine learning (ML) tools to automate analysis of mine health and safety management systems (HSMS) data without requiring experts at every mine site. As a demonstration, nine random forest (RF) models were developed to classify narratives from the Mine Safety and Health Administration (MSHA) database into nine accident types. MSHA accident categories are quite descriptive and are, thus, a proxy for high-level understanding of the incidents. A single model developed to classify narratives into a single category was more effective than a single model that classified narratives into different categories. The developed models were then applied to narratives taken from a mine HSMS (non-MSHA), to classify them into MSHA accident categories. About two-thirds of the non-MSHA narratives were automatically classified by the RF models. The automatically classified narratives were then evaluated manually. The evaluation showed an accuracy of 96% for automated classifications. The near-perfect classification of non-MSHA narratives by MSHA-based machine learning models demonstrates that NLP can be a powerful tool to analyze HSMS data.
topic	mine safety and health; accidents; narratives; machine learning; natural language processing; random forest classification
url	https://www.mdpi.com/2075-163X/11/7/776

Similar Items