Development of gene-finding algorithms for fungal genomes : dealing with small datasets and leveraging comparative genomics

Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (leaves 60-62). A computer program called FUNSCAN was developed which identifies protein coding regions in fungal genomes. Gene str...


Bibliographic Details
Main Author: Lazarovici, Allan, 1979-
Other Authors: Christopher Burge.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2006
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/29681
id ndltd-MIT-oai-dspace.mit.edu-1721.1-29681
record_format oai_dc
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-29681 2019-05-02T16:24:56Z Development of gene-finding algorithms for fungal genomes : dealing with small datasets and leveraging comparative genomics Lazarovici, Allan, 1979- Christopher Burge. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (leaves 60-62). A computer program called FUNSCAN was developed which identifies protein coding regions in fungal genomes. Gene structural and compositional properties are modeled using a Hidden Markov Model. Separate training and testing sets for FUNSCAN were obtained by aligning cDNAs from an organism to their genomic loci, generating a 'gold standard' set of annotated genes. The performance of FUNSCAN is competitive with other computer programs designed to identify protein coding regions in fungal genomes. A technique called 'Training Set Augmentation' is described which can be used to train FUNSCAN when only a small training set of genes is available. Techniques that combine alignment algorithms with FUNSCAN to identify novel genes are also discussed and explored. by Allan Lazarovici. M.Eng. and S.B. 2006-03-24T16:14:44Z 2006-03-24T16:14:44Z 2003 2003 Thesis http://hdl.handle.net/1721.1/29681 53843099 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 62 leaves 2572412 bytes 2572221 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology
collection NDLTD
language English
format Others
sources NDLTD
topic Electrical Engineering and Computer Science.
description Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (leaves 60-62). A computer program called FUNSCAN was developed which identifies protein coding regions in fungal genomes. Gene structural and compositional properties are modeled using a Hidden Markov Model. Separate training and testing sets for FUNSCAN were obtained by aligning cDNAs from an organism to their genomic loci, generating a 'gold standard' set of annotated genes. The performance of FUNSCAN is competitive with other computer programs designed to identify protein coding regions in fungal genomes. A technique called 'Training Set Augmentation' is described which can be used to train FUNSCAN when only a small training set of genes is available. Techniques that combine alignment algorithms with FUNSCAN to identify novel genes are also discussed and explored. by Allan Lazarovici. M.Eng. and S.B.
author2 Christopher Burge.
author Lazarovici, Allan, 1979-
title Development of gene-finding algorithms for fungal genomes : dealing with small datasets and leveraging comparative genomics
publisher Massachusetts Institute of Technology
publishDate 2006
url http://hdl.handle.net/1721.1/29681