Joint base-calling of Two DNA Sequences with Factor Graphs

Automated estimation of DNA base-sequences is an important step in genomics and in many other emerging fields in biological and medical sciences. Current automated sequencers process single strands only. To improve the utility of existing technologies, we propose to mix two independent strands prior...

Bibliographic Details
Main Authors: Shi, Xiaomeng (Contributor), Lun, Desmond S. (Author), Medard, Muriel (Contributor), Koetter, Ralf (Author), Meldrim, James C. (Author), Barry, Andrew James (Contributor)
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers, 2011-04-01T16:36:51Z.
Online Access: http://hdl.handle.net/1721.1/62009
LEADER 01895 am a22002773u 4500
001 62009
042 |a dc 
100 1 0 |a Shi, Xiaomeng  |e author 
100 1 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
100 1 0 |a Medard, Muriel  |e contributor 
100 1 0 |a Shi, Xiaomeng  |e contributor 
100 1 0 |a Barry, Andrew James  |e contributor 
700 1 0 |a Lun, Desmond S.  |e author 
700 1 0 |a Medard, Muriel  |e author 
700 1 0 |a Koetter, Ralf  |e author 
700 1 0 |a Meldrim, James C.  |e author 
700 1 0 |a Barry, Andrew James  |e author 
245 0 0 |a Joint base-calling of Two DNA Sequences with Factor Graphs 
260 |b Institute of Electrical and Electronics Engineers,   |c 2011-04-01T16:36:51Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/62009 
520 |a Automated estimation of DNA base sequences is an important step in genomics and in many other emerging fields in the biological and medical sciences. Current automated sequencers process single strands only. To improve the utility of existing technologies, we propose mixing two independent strands prior to electrophoresis and base-calling them jointly by applying the sum-product algorithm on factor graphs. We first present a statistical model for DNA sequencing data and examine the model parameters. We then propose a practical heuristic for estimating the peaks, which are separated into two source sequences (major and minor) by passing messages on a factor graph. Simulation results show that joint base-calling can provide valid, though less accurate, results for the minor sequence. The algorithm presented provides a basis for future investigation of joint sequencing techniques. 
520 |a National Science Foundation (U.S.) (Grant CCR-0325496) 
546 |a en_US 
655 7 |a Article 
773 |t IEEE transactions on information theory