Trainable high resolution melt curve machine learning classifier for large-scale reliable genotyping of sequence variants.
High resolution melt (HRM) is gaining considerable popularity as a simple and robust method for genotyping sequence variants. However, accurate genotyping of an unknown sample for which a large number of possible variants may exist will require an automated HRM curve identification method capable of...
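Main Authors: | Pornpat Athamanolap, Vishwa Parekh, Stephanie I Fraley, Vatsal Agarwal, Dong J Shin, Michael A Jacobs, Tza-Huei Wang, Samuel Yang |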
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2014-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC4183555?pdf=render |
id |
doaj-d7649fa0c06545d5a0c9201004aa0fc9 |
---|---|
record_format |
Article |
spelling |
doaj-d7649fa0c06545d5a0c9201004aa0fc9; 2020-11-25T01:55:53Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2014-01-01; 9; 9; e109094; 10.1371/journal.pone.0109094; Trainable high resolution melt curve machine learning classifier for large-scale reliable genotyping of sequence variants.; Pornpat Athamanolap, Vishwa Parekh, Stephanie I Fraley, Vatsal Agarwal, Dong J Shin, Michael A Jacobs, Tza-Huei Wang, Samuel Yang; High resolution melt (HRM) is gaining considerable popularity as a simple and robust method for genotyping sequence variants. However, accurate genotyping of an unknown sample for which a large number of possible variants may exist will require an automated HRM curve identification method capable of comparing unknowns against a large cohort of known sequence variants. Herein, we describe a new method for automated HRM curve classification based on machine learning methods and learned tolerance for reaction condition deviations. We tested this method in silico through multiple cross-validations using curves generated from 9 different simulated experimental conditions to classify 92 known serotypes of Streptococcus pneumoniae and demonstrated over 99% accuracy with 8 training curves per serotype. In vitro verification of the algorithm was tested using sequence variants of a cancer-related gene and demonstrated 100% accuracy with 3 training curves per sequence variant. The machine learning algorithm enabled reliable, scalable, and automated HRM genotyping analysis with broad potential clinical and epidemiological applications.; http://europepmc.org/articles/PMC4183555?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Pornpat Athamanolap, Vishwa Parekh, Stephanie I Fraley, Vatsal Agarwal, Dong J Shin, Michael A Jacobs, Tza-Huei Wang, Samuel Yang |
author_sort |
Pornpat Athamanolap |
title |
Trainable high resolution melt curve machine learning classifier for large-scale reliable genotyping of sequence variants. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2014-01-01 |
description |
High resolution melt (HRM) is gaining considerable popularity as a simple and robust method for genotyping sequence variants. However, accurate genotyping of an unknown sample for which a large number of possible variants may exist will require an automated HRM curve identification method capable of comparing unknowns against a large cohort of known sequence variants. Herein, we describe a new method for automated HRM curve classification based on machine learning methods and learned tolerance for reaction condition deviations. We tested this method in silico through multiple cross-validations using curves generated from 9 different simulated experimental conditions to classify 92 known serotypes of Streptococcus pneumoniae and demonstrated over 99% accuracy with 8 training curves per serotype. In vitro verification of the algorithm was tested using sequence variants of a cancer-related gene and demonstrated 100% accuracy with 3 training curves per sequence variant. The machine learning algorithm enabled reliable, scalable, and automated HRM genotyping analysis with broad potential clinical and epidemiological applications. |
url |
http://europepmc.org/articles/PMC4183555?pdf=render |
_version_ |
1724982816653967360 |