Detecting clinically relevant new information in clinical notes across specialties and settings

Abstract Background: Automated methods for identifying clinically relevant new versus redundant information in electronic health record (EHR) clinical notes are useful for clinicians and researchers involved in patient care and clinical research, respectively. We evaluated methods to automatically ide...

Bibliographic Details
Main Authors: Rui Zhang, Serguei V. S. Pakhomov, Elliot G. Arsoniadis, Janet T. Lee, Yan Wang, Genevieve B. Melton
Format: Article
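Language: English
Published: BMC 2017-07-01
Series: BMC Medical Informatics and Decision Making
Subjects: Natural language processing; Electronic health records; Statistical language models; Semantic similarity; New information; Redundancy
Online Access: http://link.springer.com/article/10.1186/s12911-017-0464-y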
id doaj-5572d86515bc4cd5a448071ee826bfc8
record_format Article
spelling doaj-5572d86515bc4cd5a448071ee826bfc8
2020-11-24T23:12:20Z
eng
BMC
BMC Medical Informatics and Decision Making
1472-6947
2017-07-01
17
S2
15
22
10.1186/s12911-017-0464-y
Detecting clinically relevant new information in clinical notes across specialties and settings
Rui Zhang 0
Serguei V. S. Pakhomov 1
Elliot G. Arsoniadis 2
Janet T. Lee 3
Yan Wang 4
Genevieve B. Melton 5
Institute for Health Informatics, University of Minnesota
Institute for Health Informatics, University of Minnesota
Institute for Health Informatics, University of Minnesota
Department of Surgery, University of Minnesota
Institute for Health Informatics, University of Minnesota
Institute for Health Informatics, University of Minnesota
http://link.springer.com/article/10.1186/s12911-017-0464-y
Natural language processing
Electronic health records
Statistical language models
Semantic similarity
New information
Redundancy
collection DOAJ
language English
format Article
sources DOAJ
author Rui Zhang
Serguei V. S. Pakhomov
Elliot G. Arsoniadis
Janet T. Lee
Yan Wang
Genevieve B. Melton
title Detecting clinically relevant new information in clinical notes across specialties and settings
publisher BMC
series BMC Medical Informatics and Decision Making
issn 1472-6947
publishDate 2017-07-01
description Abstract Background: Automated methods for identifying clinically relevant new versus redundant information in electronic health record (EHR) clinical notes are useful for clinicians and researchers involved in patient care and clinical research, respectively. We evaluated methods to automatically identify clinically relevant new information in clinical notes, and compared the quantity of redundant information across specialties and clinical settings. Methods: Statistical language models augmented with semantic similarity measures were evaluated as a means to detect and quantify clinically relevant new and redundant information over longitudinal clinical notes for a given patient. A corpus of 591 progress notes over 40 inpatient admissions was annotated for new information longitudinally by physicians to generate a reference standard. Note redundancy across specialties was evaluated on 71,021 outpatient notes and 64,695 inpatient notes from 500 solid organ transplant patients (April 2015 through August 2015). Results: Our best method achieved 0.87 recall, 0.62 precision, and 0.72 F-measure. Adding semantic similarity metrics to the baseline improved recall but otherwise resulted in similar performance. While outpatient and inpatient notes had similarly high levels of redundancy (61% and 68%, respectively), redundancy differed by author specialty, with mean redundancy of 75%, 66%, 57%, and 55% observed in pediatric, internal medicine, psychiatry, and surgical notes, respectively. Conclusions: Automated techniques with statistical language models for detecting redundant versus clinically relevant new information in clinical notes do not improve with the addition of semantic similarity measures. While levels of redundancy seem relatively similar in the inpatient and ambulatory settings at Fairview Health Services, clinical note redundancy appears to vary significantly across medical specialties.
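As a quick consistency check on the figures reported in the description, the F-measure can be recomputed from the stated precision and recall. This is a minimal sketch that assumes the balanced form (beta = 1, the harmonic mean of precision and recall); the record does not say which F-measure variant the authors used, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean (assumed here)."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Values quoted in the abstract for the best-performing method.
reported_precision = 0.62
reported_recall = 0.87

print(f"{f_measure(reported_precision, reported_recall):.2f}")  # prints 0.72
```

Under that assumption the recomputed value agrees with the reported 0.72 F-measure.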
topic Natural language processing
Electronic health records
Statistical language models
Semantic similarity
New information
Redundancy
url http://link.springer.com/article/10.1186/s12911-017-0464-y