Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system
Abstract. Background: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease. ...
Main Authors: | Sordo Margarita, Weiss Scott, Goryachev Sergey, Zeng Qing T, Murphy Shawn N, Lazarus Ross |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2006-07-01 |
Series: | BMC Medical Informatics and Decision Making |
Online Access: | http://www.biomedcentral.com/1472-6947/6/30 |
id |
doaj-a9ff2298177240d290173f2eb25c2318 |
---|---|
record_format |
Article |
spelling |
doaj-a9ff2298177240d290173f2eb25c2318 2020-11-24T21:53:58Z eng BMC BMC Medical Informatics and Decision Making 1472-6947 2006-07-01 6 1 30 10.1186/1472-6947-6-30 Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system Sordo Margarita Weiss Scott Goryachev Sergey Zeng Qing T Murphy Shawn N Lazarus Ross Abstract. Background: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease. Methods: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard. Results: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded. Conclusion: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks. http://www.biomedcentral.com/1472-6947/6/30 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Sordo Margarita Weiss Scott Goryachev Sergey Zeng Qing T Murphy Shawn N Lazarus Ross |
title |
Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system |
publisher |
BMC |
series |
BMC Medical Informatics and Decision Making |
issn |
1472-6947 |
publishDate |
2006-07-01 |
description |
Abstract. Background: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease. Methods: The principal diagnosis, co-morbidity and smoking status extracted by HITEx from a set of 150 discharge summaries were compared to an expert-generated gold standard. Results: The accuracy of HITEx was 82% for principal diagnosis, 87% for co-morbidity, and 90% for smoking status extraction, when cases labeled "Insufficient Data" by the gold standard were excluded. Conclusion: We consider the results promising, given the complexity of the discharge summaries and the extraction tasks. |
url |
http://www.biomedcentral.com/1472-6947/6/30 |