Model-based Biomarker Detection and Systematic Analysis in Translational Science

This dissertation is concerned with the application of mathematical modeling and statistical signal processing to the rapidly expanding fields of proteomics and genomics. The research is guided by a translational goal that drives the problem formalization and experimental design, and leads to opt...


Bibliographic Details
Main Author: Sun, Youting
Other Authors: Dougherty, Edward R.
Format: Others
Language: en_US
Published: 2012
Subjects: Mathematical modeling; biomarker detection; genetic and proteomic signal processing
Online Access: http://hdl.handle.net/1969.1/ETD-TAMU-2012-05-10914
id ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2012-05-10914
record_format oai_dc
spelling ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2012-05-10914; 2013-01-08T10:43:58Z; Model-based Biomarker Detection and Systematic Analysis in Translational Science; Sun, Youting; Mathematical modeling; biomarker detection; genetic and proteomic signal processing; Dougherty, Edward R.; Braga-Neto, Ulisses; 2012-07-16T15:58:04Z; 2012-07-16T20:28:25Z; 2012-07-16T15:58:04Z; 2012-05; 2012-07-16; May 2012; thesis; text; application/pdf; http://hdl.handle.net/1969.1/ETD-TAMU-2012-05-10914; en_US
collection NDLTD
language en_US
format Others
sources NDLTD
topic Mathematical modeling
biomarker detection
genetic and proteomic signal processing
spellingShingle Mathematical modeling
biomarker detection
genetic and proteomic signal processing
Sun, Youting
Model-based Biomarker Detection and Systematic Analysis in Translational Science
description This dissertation is concerned with the application of mathematical modeling and statistical signal processing to the rapidly expanding fields of proteomics and genomics. The research is guided by a translational goal that drives the problem formalization and experimental design, and leads to optimization, prediction, and control of the underlying system. The dissertation comprises three interconnected subjects. In the first part, two Bayesian peptide detection algorithms are proposed to optimize the feature extraction step, which is the most fundamental step in mass spectrometry-based proteomics. The algorithms are designed to tackle data processing challenges that are not satisfactorily addressed by existing methods. In contrast to most existing methods, the proposed algorithms perform deisotoping and deconvolution of mass spectra simultaneously, which enables better identification of weak peptide signals. Unlike greedy template-matching algorithms, the proposed methods can handle complex spectra in which features overlap. The proposed methods achieve better sensitivity and accuracy than many popular software packages such as msInspect. In the second part, we model and assess the entire mass spectrometry-based proteomic data analysis pipeline. Different modules are identified and analyzed, resulting in a framework that captures key factors in system performance. The effects of various model parameters on protein identification rates, quantification errors, differential expression results, and classification performance are examined. The proposed pipeline model can be used to aid experimental design, pinpoint critical bottlenecks, optimize the workflow, and predict biomarker discovery results. Finally, the same system methodology is extended to analyze the workflow in DNA microarray experiments. A model-based approach is developed to explore the relationship among microarray data properties, missing value imputation, and sample classification in a complicated data analysis pipeline. The situations in which missing value imputation is suitable are identified, and recommendations regarding imputation are provided. In addition, a peaking phenomenon related to the missing value rate is uncovered.
author2 Dougherty, Edward R.
author_facet Dougherty, Edward R.
Sun, Youting
author Sun, Youting
author_sort Sun, Youting
title Model-based Biomarker Detection and Systematic Analysis in Translational Science
title_short Model-based Biomarker Detection and Systematic Analysis in Translational Science
title_full Model-based Biomarker Detection and Systematic Analysis in Translational Science
title_fullStr Model-based Biomarker Detection and Systematic Analysis in Translational Science
title_full_unstemmed Model-based Biomarker Detection and Systematic Analysis in Translational Science
title_sort model-based biomarker detection and systematic analysis in translational science
publishDate 2012
url http://hdl.handle.net/1969.1/ETD-TAMU-2012-05-10914
work_keys_str_mv AT sunyouting modelbasedbiomarkerdetectionandsystematicanalysisintranslationalscience
_version_ 1716505515341643776