Appreciation of structured and unstructured content to aid decision making : from web scraping to ontologies and data dictionaries in healthcare

A systematic approach to the extraction of data from disparate data sources is proposed. The World Wide Web is a most diverse dataset; identifying ways in which this large database provides means for data quality verification with concepts such as data lineage and provenance allows to follow the sam...


Bibliographic Details
Main Author: Michalakidis, Georgios
Other Authors: Krause, Paul J.
Published: University of Surrey 2016
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.698632
id ndltd-bl.uk-oai-ethos.bl.uk-698632
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-698632
2018-05-12T03:25:44Z
Appreciation of structured and unstructured content to aid decision making : from web scraping to ontologies and data dictionaries in healthcare
Michalakidis, Georgios
Krause, Paul J.
2016
610.285
University of Surrey
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.698632
http://epubs.surrey.ac.uk/812261/
Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 610.285
spellingShingle 610.285
Michalakidis, Georgios
Appreciation of structured and unstructured content to aid decision making : from web scraping to ontologies and data dictionaries in healthcare
description A systematic approach to the extraction of data from disparate data sources is proposed. The World Wide Web is a most diverse dataset; identifying ways in which this large database provides means for data quality verification, with concepts such as data lineage and provenance, allows the same approach to be followed as a means to aid decision-making in sensitive domains such as healthcare. Through lessons learned from research in the UK and internationally, we conclude that emphasis on interoperable and model-based support of data syndication can enhance data quality, an issue that remains current (American Hospital Association, 2015), with data barriers in healthcare arising from governance concerns. To improve on the above, we start by proposing a system for solution-orientated reporting of errors associated with the extraction of routinely collected clinical data. We then explore key concepts to assess the readiness of data for research and define an ontology-driven approach to create data dictionaries for quality improvement in healthcare. Finally, we apply this research to enable consistent data recording across a health system, allowing service quality comparisons.
Work deriving from this research and built by the author, commissioned and aided by the UK NHS, the University of Surrey and Green Cross Medical, particularly in creating and testing software systems in real-world scenarios, has facilitated: quality improvement in healthcare data extraction from GP practices in the UK; a state-of-the-art system for Web-enabling Hospital Episode Statistics (HES) data for dermatology; and, finally, an online system designed to enable cancer Multi-Disciplinary Teams (MDTs) to self-assess and receive feedback on how their team performs against the standards set out in ‘The Characteristics of an Effective MDT’ provided by NHS IQ, formerly part of the National Cancer Action Team (NCAT), which in 2016 won the Quality in Care Programme’s “Digital Innovation in the Treatment of Cancer” award. Further experimentation shows that the proposed methods have the potential to be applicable in other sectors such as the investment sector (an initial investigation took place in the early stages of this research), but it is suggested that this potential be explored further.
author2 Krause, Paul J.
author_facet Krause, Paul J.
Michalakidis, Georgios
author Michalakidis, Georgios
author_sort Michalakidis, Georgios
title Appreciation of structured and unstructured content to aid decision making : from web scraping to ontologies and data dictionaries in healthcare
title_sort appreciation of structured and unstructured content to aid decision making : from web scraping to ontologies and data dictionaries in healthcare
publisher University of Surrey
publishDate 2016
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.698632