Text Classification in Clinical Practice Guidelines Using Machine-Learning Assisted Pattern-Based Approach

Clinical Practice Guidelines (CPGs) aim to optimize patient care by assisting physicians during the decision-making process. However, guideline adherence is highly affected by its unstructured format and aggregation of background information with disease-specific information. The objective of our st...

Bibliographic Details
Main Authors: Musarrat Hussain, Jamil Hussain, Taqdir Ali, Syed Imran Ali, Hafiz Syed Muhammad Bilal, Sungyoung Lee, Taechoong Chung
Format: Article
Language: English
Published: MDPI AG 2021-04-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/11/8/3296
id doaj-8de46fd16cbe49b08dc8f5b820d09bfd
record_format Article
spelling doaj-8de46fd16cbe49b08dc8f5b820d09bfd
Record updated: 2021-04-07T23:00:04Z
Language: eng
Publisher: MDPI AG
Journal: Applied Sciences (ISSN 2076-3417)
Published: 2021-04-01
Volume: 11
Pages: 3296-3296
DOI: 10.3390/app11083296
Title: Text Classification in Clinical Practice Guidelines Using Machine-Learning Assisted Pattern-Based Approach
Authors and affiliations (in the order listed in the record):
Musarrat Hussain: Department of Computer Science and Engineering, Kyung Hee University, Global Campus, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea
Jamil Hussain: Department of Data Science, Sejong University, Sejong 30019, Korea
Taqdir Ali: Department of Computer Science and Engineering, Kyung Hee University, Global Campus, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea
Syed Imran Ali: Department of Computer Science and Engineering, Kyung Hee University, Global Campus, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea
Hafiz Syed Muhammad Bilal: Department of Computer Science and Engineering, Kyung Hee University, Global Campus, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea
Sungyoung Lee: Department of Computer Science and Engineering, Kyung Hee University, Global Campus, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea
Taechoong Chung: Department of Computer Science and Engineering, Kyung Hee University, Global Campus, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea
collection DOAJ
language English
format Article
sources DOAJ
author Musarrat Hussain
Jamil Hussain
Taqdir Ali
Syed Imran Ali
Hafiz Syed Muhammad Bilal
Sungyoung Lee
Taechoong Chung
title Text Classification in Clinical Practice Guidelines Using Machine-Learning Assisted Pattern-Based Approach
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-04-01
description Clinical Practice Guidelines (CPGs) aim to optimize patient care by assisting physicians during the decision-making process. However, adherence to guidelines is strongly affected by their unstructured format and by the mixing of background information with disease-specific information. The objective of our study is to extract disease-specific information from CPGs in order to improve adherence. In this research, we propose a semi-automatic mechanism for extracting disease-specific information from CPGs using pattern-matching techniques. We apply supervised and unsupervised machine-learning algorithms to a CPG to extract a list of salient terms that help distinguish recommendation sentences (RS) from non-recommendation sentences (NRS). Simultaneously, a group of experts analyzes the same CPG and extracts the initial patterns, termed “Heuristic Patterns”, using a group decision-making method, the nominal group technique (NGT). We then provide the list of salient terms to the experts, who refine their extracted patterns in light of those terms. The extracted heuristic patterns depend on specific terms and suffer from the specialization problem due to synonymy and polysemy. Therefore, we generalize the heuristic patterns to part-of-speech (POS) patterns and Unified Medical Language System (UMLS) patterns, which lets the proposed method generalize to all types of CPGs. We evaluated the initial extracted patterns on asthma, rhinosinusitis, and hypertension guidelines, obtaining accuracies of 76.92%, 84.63%, and 89.16%, respectively. With the refined, machine-learning-assisted patterns, the accuracies increased to 78.89%, 85.32%, and 92.07%, respectively. Our system assists physicians by locating disease-specific information in CPGs, which enhances physicians’ performance and reduces CPG processing time. Additionally, it is beneficial for annotating CPG content.
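The pattern-matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not list the actual heuristic patterns, so the three regular expressions below, the function names `classify` and `accuracy`, and the toy sentences are all assumptions chosen for illustration. The sketch labels a sentence as RS when any pattern matches and as NRS otherwise, then measures accuracy against hand-assigned labels.

```python
import re

# Hypothetical heuristic patterns (assumed for illustration only; the
# patterns used in the paper are not given in this record).
HEURISTIC_PATTERNS = [
    re.compile(r"\b(should|must)\s+(not\s+)?be\b", re.IGNORECASE),
    re.compile(r"\b(is|are)\s+(not\s+)?recommended\b", re.IGNORECASE),
    re.compile(r"\bwe\s+(recommend|suggest)\b", re.IGNORECASE),
]


def classify(sentence: str) -> str:
    """Return "RS" if any heuristic pattern matches the sentence, else "NRS"."""
    if any(pattern.search(sentence) for pattern in HEURISTIC_PATTERNS):
        return "RS"
    return "NRS"


def accuracy(labelled: list[tuple[str, str]]) -> float:
    """Fraction of (sentence, gold_label) pairs that classify() labels correctly."""
    correct = sum(classify(sentence) == gold for sentence, gold in labelled)
    return correct / len(labelled)


if __name__ == "__main__":
    # Invented toy sentences, not taken from any real guideline.
    sample = [
        ("Inhaled corticosteroids should be offered as first-line therapy.", "RS"),
        ("Asthma affects people of all ages.", "NRS"),
        ("Routine imaging is not recommended for uncomplicated cases.", "RS"),
    ]
    for sentence, gold in sample:
        print(f"{classify(sentence):3s} (gold {gold}): {sentence}")
    print(f"accuracy = {accuracy(sample):.2f}")
```

Term-specific patterns like these miss synonyms, which is the specialization problem the abstract mentions; generalizing them to POS or UMLS patterns would replace the literal words with tag or concept classes.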
topic recommendation statements identification
guideline processing
pattern extraction
information extraction
clinical text mining
url https://www.mdpi.com/2076-3417/11/8/3296