Testing for the Consecutive Ones Property and Interval Graphs on Noisy Data─ Application to Physical Mapping and Sequence Assembly

Doctoral thesis === National Chiao Tung University === Department of Computer and Information Science === 91 === The consecutive ones property and interval graphs are two fundamental mathematical models for physical mapping and clone assembly. A (0,1)-matrix satisfies the consecutive ones property (COP) for the rows if there exists a column permutation such that the...


Bibliographic Details
Main Authors: Wei-Fu Lu, 呂威甫
Other Authors: Ruei-Chuan Chang
Format: Others
Language: en_US
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/61974029553973019649
id ndltd-TW-091NCTU0394030
record_format oai_dc
spelling ndltd-TW-091NCTU0394030 2016-06-22T04:14:06Z http://ndltd.ncl.edu.tw/handle/61974029553973019649 Testing for the Consecutive Ones Property and Interval Graphs on Noisy Data─ Application to Physical Mapping and Sequence Assembly 雜訊資料之連續一性質與區間圖辨識演算法─實體圖譜與序列組合之應用 Wei-Fu Lu 呂威甫 Doctoral thesis, National Chiao Tung University, Department of Computer and Information Science, 91 Ruei-Chuan Chang Wen-Lian Hsu 張瑞川 許聞廉 2003 thesis 0 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral thesis === National Chiao Tung University === Department of Computer and Information Science === 91 === The consecutive ones property and interval graphs are two fundamental mathematical models for physical mapping and clone assembly. A (0,1)-matrix satisfies the consecutive ones property (COP) for the rows if there exists a column permutation such that the ones in each row of the resulting matrix are consecutive. An interval graph is the intersection graph of a collection of intervals. Booth and Lueker (1976) used PQ-trees to test for the consecutive ones property and to recognize interval graphs in linear time. Their algorithm has a serious drawback: the data must be error-free. Laboratory work, however, is never flawless. Because a single error can cause map construction to fail, traditional recognition algorithms can hardly be applied to noisy data, and no straightforward extension of them overcomes this drawback. A different philosophy of algorithm design is therefore necessary. In this thesis, we use clustering techniques to maintain a stable local structure of consecutive ones matrices and interval graphs in the presence of errors. We do not set any “global” objective to optimize. Rather, our algorithms try to maintain the local monotone structure, that is, to minimize the deviation from the local monotone property. Under moderate assumptions, the algorithm can accommodate four types of errors: false negatives, false positives, non-unique probes, and chimeric clones. When some local data is too noisy, the algorithm is likely to detect this and suggest additional lab work to reduce the ambiguity in that region. A distinctive feature of our algorithm is that, rather than forcing all probes or clones to be included and ordered in the final arrangement, it deletes some noisy information. It can therefore produce more than one contig; the gaps are created mostly by noisy data.
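The definition of the consecutive ones property in the abstract can be illustrated with a small sketch. It is a brute-force check over all column permutations, written only to make the definition concrete; it is neither the linear-time PQ-tree algorithm of Booth and Lueker nor the clustering method of the thesis. The function names and the toy clone-by-probe matrices are assumptions made for this example.

```python
from itertools import permutations


def ones_are_consecutive(row):
    """Return True if the 1s in `row` form one contiguous block (or there are none)."""
    positions = [i for i, value in enumerate(row) if value == 1]
    return not positions or positions[-1] - positions[0] + 1 == len(positions)


def find_cop_ordering(matrix):
    """Return a column order under which every row has consecutive ones, else None.

    Exhaustive search: exponential in the number of columns, so only
    suitable for tiny illustrative matrices.
    """
    n_cols = len(matrix[0]) if matrix else 0
    for order in permutations(range(n_cols)):
        if all(ones_are_consecutive([row[c] for c in order]) for row in matrix):
            return order
    return None


# Hypothetical clone-by-probe matrix: rows are clones, columns are probes.
clean = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]

# Same data with one false negative: the middle entry of the last row is lost.
noisy = [row[:] for row in clean]
noisy[2][1] = 0

print(find_cop_ordering(clean))
print(find_cop_ordering(noisy))
```

On this toy input the clean matrix admits an ordering, while the matrix with a single dropped entry admits none, which is the failure mode the abstract attributes to error-free recognition algorithms.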
author2 Ruei-Chuan Chang
author_facet Ruei-Chuan Chang
Wei-Fu Lu
呂威甫
author Wei-Fu Lu
呂威甫
title Testing for the Consecutive Ones Property and Interval Graphs on Noisy Data─ Application to Physical Mapping and Sequence Assembly
publishDate 2003
url http://ndltd.ncl.edu.tw/handle/61974029553973019649