Bioinformatics analyses of alternative splicing, EST-based and machine learning-based prediction

Bibliographic Details
Main Author: Xia, Jing
Language: en_US
Published: Kansas State University 2008
Online Access: http://hdl.handle.net/2097/1113
Description
Summary: Master of Science === Department of Computing and Information Sciences === William H. Hsu === Alternative splicing is a mechanism for generating different gene transcripts (called isoforms) from the same genomic sequence. Finding alternative splicing events experimentally is both expensive and time-consuming. Computational methods in general, and EST analysis and machine learning algorithms in particular, can complement experimental methods in identifying alternative splicing events. In this thesis, I first identify alternatively spliced exons by analyzing EST-genome alignments. Next, I explore the predictive power of a rich set of features that have been experimentally shown to affect alternative splicing. I use these features to build support vector machine (SVM) classifiers that distinguish alternatively spliced exons from constitutive exons. My results show that simple linear SVM classifiers built from a rich set of features give results comparable to those of more sophisticated SVM classifiers that use more basic sequence features. Finally, I use feature selection methods to computationally identify the most informative features for this prediction problem.
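
The first step named in the summary, finding alternatively spliced exons from EST-genome alignments, can be illustrated with a minimal sketch. This record does not describe the thesis's actual procedure, so everything below is an assumption made for illustration: the function names (`introns_of`, `classify_exons`), the toy coordinates, and the simple rule that an exon counts as alternatively spliced when an intron from any aligned EST spans it entirely (exon skipping only).

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive genomic coordinates


def introns_of(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive aligned blocks of one EST."""
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def classify_exons(est_alignments: Dict[str, List[Interval]]) -> Dict[Interval, str]:
    """Label each observed exon (hypothetical rule, exon skipping only).

    An exon is 'alternative' if an intron of any EST contains it entirely,
    otherwise 'constitutive' with respect to the ESTs supplied.
    """
    all_introns = [
        intron
        for exons in est_alignments.values()
        for intron in introns_of(exons)
    ]
    labels: Dict[Interval, str] = {}
    for exons in est_alignments.values():
        for exon_start, exon_end in exons:
            skipped = any(
                intron_start <= exon_start and exon_end <= intron_end
                for intron_start, intron_end in all_introns
            )
            labels[(exon_start, exon_end)] = "alternative" if skipped else "constitutive"
    return labels


# Toy alignments (made-up coordinates): est_b skips the middle exon of est_a.
toy_alignments = {
    "est_a": [(100, 200), (300, 350), (500, 600)],
    "est_b": [(100, 200), (500, 600)],
}
labels = classify_exons(toy_alignments)
```

Exons labeled this way could then serve as the two classes for a classifier such as the linear SVM mentioned in the summary. A real pipeline would also need to handle alignment noise, alternative 5' and 3' splice sites, and retained introns, which this sketch ignores.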