UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences

With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is...

Bibliographic Details
Main Authors: Pu-Feng Du, Wei Zhao, Yang-Yang Miao, Le-Yi Wei, Likun Wang
Format: Article
Language: English
Published: MDPI AG 2017-11-01
Series: International Journal of Molecular Sciences
Subjects: pseudo-amino acid compositions; pseudo-k nucleotide compositions; extensible software
Online Access: https://www.mdpi.com/1422-0067/18/11/2400
id doaj-6139f60dd07a4059963970572085a079
record_format Article
spelling doaj-6139f60dd07a4059963970572085a079
2020-11-25T00:53:00Z
eng
MDPI AG
International Journal of Molecular Sciences
1422-0067
2017-11-01
18
11
2400
10.3390/ijms18112400
ijms18112400
UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
Pu-Feng Du [0]
Wei Zhao [1]
Yang-Yang Miao [2]
Le-Yi Wei [3]
Likun Wang [4]
[0] School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
[1] School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
[2] School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
[3] School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
[4] Institute of Systems Biomedicine, Beijing Key Laboratory of Tumor Systems Biology, Department of Pathology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences of varying lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program is needed. In this paper, we propose UltraPse as a universal software platform for this problem. UltraPse not only generates various existing sequence representation modes but also simplifies future programming work in developing novel representation modes. UltraPse is designed to be highly extensible: it allows users to define their own representation modes, physicochemical properties, or even types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
https://www.mdpi.com/1422-0067/18/11/2400
pseudo-amino acid compositions
pseudo-k nucleotide compositions
extensible software
collection DOAJ
language English
format Article
sources DOAJ
author Pu-Feng Du
Wei Zhao
Yang-Yang Miao
Le-Yi Wei
Likun Wang
title UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
publisher MDPI AG
series International Journal of Molecular Sciences
issn 1422-0067
publishDate 2017-11-01
description With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences of varying lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program is needed. In this paper, we propose UltraPse as a universal software platform for this problem. UltraPse not only generates various existing sequence representation modes but also simplifies future programming work in developing novel representation modes. UltraPse is designed to be highly extensible: it allows users to define their own representation modes, physicochemical properties, or even types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
topic pseudo-amino acid compositions
pseudo-k nucleotide compositions
extensible software
url https://www.mdpi.com/1422-0067/18/11/2400
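
The description above turns on one idea: sequences of different lengths must be mapped to numerical vectors of a single fixed length. As a rough illustration only, the sketch below computes a plain k-mer (k-nucleotide) composition in Python. It is not UltraPse's code or interface; the function name `kmer_composition` and its parameters are hypothetical, and the simple composition shown here omits the sequence-order terms that pseudo-k nucleotide compositions add.

```python
from itertools import product


def kmer_composition(sequence, k=2, alphabet="ACGT"):
    """Return normalized k-mer frequencies as a fixed-length list.

    Hypothetical helper for illustration; not part of UltraPse.
    The result always has len(alphabet) ** k entries, whatever the
    length of the input sequence.
    """
    # Enumerate every possible k-mer in a fixed order (AA, AC, AG, ...).
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    sequence = sequence.upper()
    total = 0
    # Slide a window of width k along the sequence and count each k-mer.
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with symbols outside the alphabet
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid windows: return all zeros.
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


short_vec = kmer_composition("ACGTAC")
long_vec = kmer_composition("ACGTACGTTTGACCA" * 10)
print(len(short_vec), len(long_vec))  # prints: 16 16
```

A 6-nucleotide sequence and a 150-nucleotide sequence both yield 16 values for k=2, which is the property that lets fixed-input prediction algorithms consume them.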