A Bayesian network approach to feature selection in mass spectrometry data
One of the key goals of current cancer research is the identification of biologic molecules that allow non-invasive detection of existing cancers or cancer precursors. One way to begin this process of biomarker discovery is by using time-of-flight mass spectroscopy to identify proteins or other mole...
Main Author: | Kuschner, Karl W. |
---|---|
Format: | Others |
Language: | English |
Published: | W&M ScholarWorks 2009 |
Subjects: | Bioinformatics Mathematics |
Online Access: | https://scholarworks.wm.edu/etd/1539623543 https://scholarworks.wm.edu/cgi/viewcontent.cgi?article=3334&context=etd |
id |
ndltd-wm.edu-oai-scholarworks.wm.edu-etd-3334 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-wm.edu-oai-scholarworks.wm.edu-etd-3334 2019-05-16T03:33:48Z A Bayesian network approach to feature selection in mass spectrometry data Kuschner, Karl W. One of the key goals of current cancer research is the identification of biologic molecules that allow non-invasive detection of existing cancers or cancer precursors. One way to begin this process of biomarker discovery is by using time-of-flight mass spectroscopy to identify proteins or other molecules in tissue or serum that correlate to certain cancers. However, there are many difficulties associated with the output of such experiments. The distribution of protein abundances in a population is unknown, the mass spectroscopy measurements have high variability, and high correlations between variables cause problems with popular methods of data mining. To mitigate these issues, Bayesian inductive methods, combined with non-model-dependent information theory scoring, are used to find feature sets and build classifiers for mass spectroscopy data from blood serum. Such methods show improvement over existing measures, and naturally incorporate measurement uncertainties. Resulting Bayesian network models are applied to three blood serum data sets: one artificially generated, one from a 2004 leukemia study, and another from a 2007 prostate cancer study. Feature sets obtained appear to show sufficient stability under cross-validation to provide not only biomarker candidates but also families of features for further biochemical analysis. 2009-01-01T08:00:00Z text application/pdf https://scholarworks.wm.edu/etd/1539623543 https://scholarworks.wm.edu/cgi/viewcontent.cgi?article=3334&context=etd © The Author Dissertations, Theses, and Masters Projects English W&M ScholarWorks Bioinformatics Mathematics |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Bioinformatics Mathematics |
description |
One of the key goals of current cancer research is the identification of biologic molecules that allow non-invasive detection of existing cancers or cancer precursors. One way to begin this process of biomarker discovery is by using time-of-flight mass spectroscopy to identify proteins or other molecules in tissue or serum that correlate to certain cancers. However, there are many difficulties associated with the output of such experiments. The distribution of protein abundances in a population is unknown, the mass spectroscopy measurements have high variability, and high correlations between variables cause problems with popular methods of data mining. To mitigate these issues, Bayesian inductive methods, combined with non-model-dependent information theory scoring, are used to find feature sets and build classifiers for mass spectroscopy data from blood serum. Such methods show improvement over existing measures, and naturally incorporate measurement uncertainties. Resulting Bayesian network models are applied to three blood serum data sets: one artificially generated, one from a 2004 leukemia study, and another from a 2007 prostate cancer study. Feature sets obtained appear to show sufficient stability under cross-validation to provide not only biomarker candidates but also families of features for further biochemical analysis. |
author |
Kuschner, Karl W. |
title |
A Bayesian network approach to feature selection in mass spectrometry data |
publisher |
W&M ScholarWorks |
publishDate |
2009 |
url |
https://scholarworks.wm.edu/etd/1539623543 https://scholarworks.wm.edu/cgi/viewcontent.cgi?article=3334&context=etd |