Summary: | Master of Science === Department of Computing and Information Sciences === William H. Hsu === Alternative splicing is a mechanism for generating different gene transcripts (called isoforms)
from the same genomic sequence. Finding alternative splicing events experimentally
is both expensive and time-consuming. Computational methods in general, and expressed sequence tag (EST) analysis
and machine learning algorithms in particular, can be used to complement experimental
methods in identifying alternative splicing events. In this thesis, I first identify
alternatively spliced exons by analyzing EST-genome alignments. Next, I explore the
predictive power of a rich set of features that have been experimentally shown to affect alternative
splicing. I use these features to build support vector machine (SVM) classifiers for
distinguishing between alternatively spliced exons and constitutive exons. My results show
that simple, linear SVM classifiers built from a rich set of features give results comparable to
those of more sophisticated SVM classifiers that use more basic sequence features. Finally,
I use feature selection methods to identify computationally the most informative features
for the prediction problem considered.
|