Distributed Approach for Peptide Identification

A crucial step in protein identification is peptide identification. The Peptide Spectrum Match (PSM) data set is enormous, so processing it on a single machine is time-consuming. PSMs are ranked by a cross-correlation, a statistical score, or a probability that the match between the...

Bibliographic Details
Main Author:	Vedanbhatla, Naga V K Abhinav
Format:	Others
Published:	TopSCHOLAR® 2015
Subjects:	C-Ranker machine learning Analytical Chemistry Computer Engineering Computer Sciences
Online Access:	http://digitalcommons.wku.edu/theses/1546 http://digitalcommons.wku.edu/cgi/viewcontent.cgi?article=2550&context=theses

id	ndltd-WKU-oai-digitalcommons.wku.edu-theses-2550
record_format	oai_dc
spelling	ndltd-WKU-oai-digitalcommons.wku.edu-theses-2550 2015-12-12T04:56:35Z Distributed Approach for Peptide Identification Vedanbhatla, Naga V K Abhinav 2015-10-01T07:00:00Z text application/pdf http://digitalcommons.wku.edu/theses/1546 http://digitalcommons.wku.edu/cgi/viewcontent.cgi?article=2550&context=theses Masters Theses & Specialist Projects TopSCHOLAR® C-Ranker machine learning Analytical Chemistry Computer Engineering Computer Sciences
collection	NDLTD
format	Others
sources	NDLTD
topic	C-Ranker machine learning Analytical Chemistry Computer Engineering Computer Sciences
spellingShingle	C-Ranker machine learning Analytical Chemistry Computer Engineering Computer Sciences Vedanbhatla, Naga V K Abhinav Distributed Approach for Peptide Identification
description	A crucial step in protein identification is peptide identification. The Peptide Spectrum Match (PSM) data set is enormous, so processing it on a single machine is time-consuming. PSMs are ranked by a cross-correlation, a statistical score, or a probability that the match between the experimental and theoretical spectra is correct and unique. This procedure takes a long time to execute, so there is a demand for better performance when handling large peptide data sets. Developing an appropriate distributed framework is expected to reduce the processing time. The designed framework uses a peptide-processing algorithm named C-Ranker, which takes peptide data as input and identifies the correct PSMs. The framework has two steps: execute the C-Ranker algorithm on servers specified by the user, and compare the correct PSM data generated by the distributed approach with that generated by the normal execution of C-Ranker. The objective of this framework is to process large peptide data sets using a distributed approach. Because the solution calls for parallel execution, it was implemented in Java. The results show that distributed C-Ranker executes in less time than the conventional centralized C-Ranker application: an overall reduction in execution time of about 66.67% is observed with this approach. In addition, average memory usage is reduced when the distributed system runs C-Ranker on multiple servers. A significant benefit that may be overlooked is that distributed C-Ranker can be used to solve extraordinarily large problems without incurring the expense of a powerful computer or a supercomputer. This approach is also compared with an Apache Hadoop framework for peptide identification with respect to cost, execution time, and flexibility.
author	Vedanbhatla, Naga V K Abhinav
author_facet	Vedanbhatla, Naga V K Abhinav
author_sort	Vedanbhatla, Naga V K Abhinav
title	Distributed Approach for Peptide Identification
title_short	Distributed Approach for Peptide Identification
title_full	Distributed Approach for Peptide Identification
title_fullStr	Distributed Approach for Peptide Identification
title_full_unstemmed	Distributed Approach for Peptide Identification
title_sort	distributed approach for peptide identification
publisher	TopSCHOLAR®
publishDate	2015
url	http://digitalcommons.wku.edu/theses/1546 http://digitalcommons.wku.edu/cgi/viewcontent.cgi?article=2550&context=theses
work_keys_str_mv	AT vedanbhatlanagavkabhinav distributedapproachforpeptideidentification
_version_	1718148701131636736

Similar Items