Finding the Most Predictive Data Source in Biological Data

Classification can predict unknown protein functions from known function information. In some cases, multiple data sets are available for classification, and prediction is only part of the problem: knowing which source is most reliable for prediction is also relevant. Our goal...

Bibliographic Details
Main Author:	Chakraborty, Ushashi
Format:	Others
Published:	North Dakota State University 2017
Subjects:	Data mining; Cell cycle; Yeast; Science > Methodology
Online Access:	https://hdl.handle.net/10365/26567

id	ndltd-ndsu.edu-oai-library.ndsu.edu-10365-26567
record_format	oai_dc
spelling	ndltd-ndsu.edu-oai-library.ndsu.edu-10365-26567 2021-09-28T17:11:34Z 2017-10-12T14:35:30Z 2017-10-12T14:35:30Z 2013 text/thesis https://hdl.handle.net/10365/26567 NDSU Policy 190.6.2 https://www.ndsu.edu/fileadmin/policy/190.pdf application/pdf North Dakota State University
collection	NDLTD
format	Others
sources	NDLTD
topic	Data mining; Cell cycle; Yeast; Science -- Methodology
description	Classification can predict unknown protein functions from known function information. In some cases, multiple data sets are available for classification, and prediction is only part of the problem: knowing which source is most reliable for prediction is also relevant. Our goal is to develop classification techniques that identify the most predictive of the multiple data sets available in this project. Our proposed algorithm combines existing classification techniques, such as linear and quadratic classification, with statistical relevance measures, such as posterior and log p analysis, to find the data set expected to give the best prediction. We apply the algorithm to experimental readings taken during the yeast cell cycle, where it predicts which genes participate in cell-cycle regulation and which type of experiment provides evidence of cell-cycle involvement for a particular gene.
author	Chakraborty, Ushashi
title	Finding the Most Predictive Data Source in Biological Data
publisher	North Dakota State University
publishDate	2017
url	https://hdl.handle.net/10365/26567

Similar Items