Amplifying Domain Expertise in Medical Data Pipelines
Main Author: | Rahman, Protiva |
---|---|
Language: | English |
Published: | The Ohio State University / OhioLINK, 2020 |
Subjects: | Computer Science; Microbiology |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=osu159823123519264 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-osu159823123519264 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-osu159823123519264 2021-10-16T05:25:16Z Amplifying Domain Expertise in Medical Data Pipelines Rahman, Protiva Computer Science Microbiology
Digitization of medical documents has increased the availability of data for analysis, prompting many domains to adopt data-driven decision-making. However, going from data to decisions involves a pipeline that can be broken into three stages: collection, cleaning, and analysis. The specialized nature of certain datasets, especially in the medical field, requires domain expertise at every pipeline stage. Domain experts are individuals who are experts in the data domain but are not necessarily trained in computational fields, and their requirements differ from those of other end-users. In part one of this dissertation, we motivate the need for a separate class of systems that amplify expertise. To this end, we present a framework for amplifying expertise whose dimensions include summarization, guidance, interaction, and acceleration. We demonstrate that expertise can be amplified by employing one or more of these dimensions at every pipeline stage.
Amplification during data collection involves accelerating domain expert data entry by optimizing the form interface to reduce input effort. This is addressed by our system, TRANSFORMER, in part two. TRANSFORMER models the cost of human input as a weighted sum of the interactions required to fill the form, then optimizes that cost by leveraging the schema and data of the form’s database. Our results show that the transformed forms are 50% quicker to complete than the original ones, effectively accelerating expert input.
In part three, we address expertise amplification at the data cleaning stage. Filling in unreported values can be tedious if experts are unable to interact effectively with the data. To address this, we present ICARUS, which guides experts by showing informative subsets for interactive updates. It uses the database structure to generalize the expert’s edits into rules, thus accelerating data augmentation. ICARUS summarizes the impact of a rule before it is applied. Using ICARUS, experts were able to fill in, on average, 56,000 values in just 148 edits, a task that required weeks without it. However, the subjective nature of experts’ rules often requires multiple experts to come to a consensus, which involves removing conflicts and redundancies between rules. The complexity of the rules and data calls for informative visual summaries. This is tackled by DELPHI, an interactive decision consolidation system. We conducted a design study to find an effective rule representation for experts. DELPHI summarizes rule relationships and their impact on the data. It allows experts to interactively edit the rule set and accelerates their task by automatically removing redundant rules.
In part four, we address amplification in data analysis through DEEDEE, which aids experts in making treatment guidelines. DEEDEE builds a decision tree to classify patients based on their antibiotic susceptibility. It guides the expert by highlighting interesting nodes, summarizes the data at each node, and supports linked interaction so that the expert can see correlated attributes. Experts can accept paths in the tree as guideline recommendations, thus accelerating their task.
Through case studies and empirical evaluation, we show how applying our framework amplifies domain expertise throughout the data pipeline.
2020 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu159823123519264 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
collection |
NDLTD |
language |
English |
sources |
NDLTD |
topic |
Computer Science Microbiology |
author |
Rahman, Protiva |
title |
Amplifying Domain Expertise in Medical Data Pipelines |
publisher |
The Ohio State University / OhioLINK |
publishDate |
2020 |
url |
http://rave.ohiolink.edu/etdc/view?acc_num=osu159823123519264 |