Probabilistic Latent Semantic Analysis Applied to Whole Bacterial Genomes Identifies Common Genomic Features

The spread of drug resistance amongst clinically-important bacteria is a serious, and growing, problem [1]. However, the analysis of entire genomes requires considerable computational effort, usually including the assembly of the genome and subsequent identification of genes known to be important in...


Bibliographic Details
Main Authors:	Rusakovica J., Hallinan J., Wipat A., Zuliani P.
Format:	Article
Language:	English
Published:	De Gruyter 2014-06-01
Series:	Journal of Integrative Bioinformatics
Online Access:	https://doi.org/10.1515/jib-2014-243

id	doaj-70ce5cbb110c42bab4dc81f80e1aa040
record_format	Article
spelling	doaj-70ce5cbb110c42bab4dc81f80e1aa040; 2021-09-06T19:40:31Z; eng; De Gruyter; Journal of Integrative Bioinformatics; 1613-4516; 2014-06-01; 11; 2; 93; 105; 10.1515/jib-2014-243; jib-2014-243; Probabilistic Latent Semantic Analysis Applied to Whole Bacterial Genomes Identifies Common Genomic Features; Rusakovica J.; Hallinan J.; Wipat A.; Zuliani P.; School of Computing Science, and Centre for Synthetic Biology and Bioexploitation, Newcastle University, Newcastle upon Tyne, NE1 7RU, United Kingdom of Great Britain and Northern Ireland (affiliation of all four authors); https://doi.org/10.1515/jib-2014-243
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Rusakovica J. Hallinan J. Wipat A. Zuliani P.
title	Probabilistic Latent Semantic Analysis Applied to Whole Bacterial Genomes Identifies Common Genomic Features
publisher	De Gruyter
series	Journal of Integrative Bioinformatics
issn	1613-4516
publishDate	2014-06-01
description	The spread of drug resistance amongst clinically-important bacteria is a serious, and growing, problem [1]. However, the analysis of entire genomes requires considerable computational effort, usually including the assembly of the genome and subsequent identification of genes known to be important in pathology. An alternative approach is to use computational algorithms to identify genomic differences between pathogenic and non-pathogenic bacteria, even without knowing the biological meaning of those differences. To cope with the high dimensionality of such data, a range of techniques for dimensionality reduction has been developed. One such approach is latent-variable modelling [2]. In latent-variable models, dimensionality reduction is achieved by representing high-dimensional data by a few hidden, or latent, variables, which are not directly observed but are inferred from the observed variables present in the model. Probabilistic Latent Semantic Analysis (PLSA) is an extension of Latent Semantic Analysis (LSA) [3]. PLSA is based on a mixture decomposition derived from a latent class model. The main objective of the algorithm, as in LSA, is to represent high-dimensional co-occurrence information in a lower-dimensional form in order to discover the hidden semantic structure of the data, but within a probabilistic framework.
url	https://doi.org/10.1515/jib-2014-243
_version_	1717768372241825792
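
The description above characterises PLSA as a mixture decomposition of co-occurrence data over a few latent classes. As a rough illustration only, and not the authors' implementation, the following sketch fits the standard symmetric PLSA model P(d, w) = sum_z P(z) P(d|z) P(w|z) by expectation-maximisation using only the Python standard library. Treating genomes as "documents" and genomic features as "words" is an assumption made here for illustration; the function names `plsa` and `log_likelihood` and the toy count matrix are hypothetical and do not come from the record.

```python
import math
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit symmetric PLSA by EM on a document-by-word count matrix.

    counts is a list of rows (one per document), each a list of word counts.
    Returns (p_z, p_d_z, p_w_z) with p_d_z[z][d] = P(d|z), p_w_z[z][w] = P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalise(values):
        total = sum(values)
        return [v / total for v in values]

    # Random strictly positive initialisation breaks the symmetry between classes.
    p_z = [1.0 / n_topics] * n_topics
    p_d_z = [normalise([rng.random() + 0.1 for _ in range(n_docs)])
             for _ in range(n_topics)]
    p_w_z = [normalise([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_z = [0.0] * n_topics
        acc_d = [[0.0] * n_docs for _ in range(n_topics)]
        acc_w = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z | d, w) for this co-occurrence cell.
                joint = [p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                         for z in range(n_topics)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_topics):
                    weight = n * joint[z] / total
                    acc_z[z] += weight
                    acc_d[z][d] += weight
                    acc_w[z][w] += weight
        # M-step: re-estimate parameters from the expected counts.
        grand_total = sum(acc_z)
        for z in range(n_topics):
            if acc_z[z] > 0.0:
                p_d_z[z] = [v / acc_z[z] for v in acc_d[z]]
                p_w_z[z] = [v / acc_z[z] for v in acc_w[z]]
        p_z = [v / grand_total for v in acc_z]
    return p_z, p_d_z, p_w_z


def log_likelihood(counts, p_z, p_d_z, p_w_z):
    """Log-likelihood of the observed counts under the fitted model."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p = sum(p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                        for z in range(len(p_z)))
                ll += n * math.log(p)
    return ll


# Hypothetical toy data: 4 "genomes" (rows) by 4 "features" (columns).
toy_counts = [[5, 4, 0, 0], [6, 3, 0, 0], [0, 0, 4, 5], [0, 0, 3, 6]]
p_z, p_d_z, p_w_z = plsa(toy_counts, n_topics=2)
print("P(z):", [round(p, 3) for p in p_z])
```

Each latent class is summarised by two distributions, one over documents and one over words, so a count matrix with many columns is described by only a handful of classes. EM finds a local optimum only, so results can depend on the random seed.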


Similar Items