Few amino acid positions in <it>rpoB </it>are associated with most of the rifampin resistance in <it>Mycobacterium tuberculosis</it>

<p>Abstract</p> <p>Background</p> <p>Mutations in <it>rpoB</it>, the gene encoding the <it>β </it>subunit of DNA-dependent RNA polymerase, are associated with rifampin resistance in <it>Mycobacterium tuberculosis</it>. Several studies...


Bibliographic Details
Main Authors: Segal Mark R, Cummings Michael P
Format: Article
Language: English
Published: BMC 2004-09-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/5/137
id doaj-96fcd083769543b6bbed08dee5b75a0b
record_format Article
spelling doaj-96fcd083769543b6bbed08dee5b75a0b 2020-11-25T00:04:46Z eng BMC BMC Bioinformatics 1471-2105 2004-09-01 5 1 137 10.1186/1471-2105-5-137 http://www.biomedcentral.com/1471-2105/5/137
collection DOAJ
language English
format Article
sources DOAJ
author Segal Mark R
Cummings Michael P
title Few amino acid positions in <it>rpoB </it>are associated with most of the rifampin resistance in <it>Mycobacterium tuberculosis</it>
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2004-09-01
description <p>Abstract</p> <p>Background</p> <p>Mutations in <it>rpoB</it>, the gene encoding the <it>β </it>subunit of DNA-dependent RNA polymerase, are associated with rifampin resistance in <it>Mycobacterium tuberculosis</it>. Several studies have been conducted in which the minimum inhibitory concentration (MIC, which is defined as the minimum concentration of the antibiotic in a given culture medium below which bacterial growth is not inhibited) of rifampin has been measured and partial DNA sequences have been determined for <it>rpoB </it>in different isolates of <it>M. tuberculosis</it>. However, no model has been constructed to predict rifampin resistance based on sequence information alone. Such a model might provide the basis for quantifying rifampin resistance status based exclusively on DNA sequence data and thus eliminate the requirements for time-consuming culturing and antibiotic testing of clinical isolates.</p> <p>Results</p> <p>Sequence data for amino acid positions 511–533 of <it>rpoB </it>and associated MIC of rifampin for different isolates of <it>M. tuberculosis </it>were taken from studies examining rifampin resistance in clinical samples from New York City and throughout Japan. We used tree-based statistical methods and random forests to generate models of the relationships between <it>rpoB </it>amino acid sequence and rifampin resistance. The proportion of variance explained by a relatively simple tree-based cross-validated regression model involving two amino acid positions (526 and 531) is 0.679. The first partition in the data, based on position 531, results in groups that differ one hundredfold in mean MIC (1.596 <it>μg/ml </it>and 159.676 <it>μg/ml</it>). The subsequent partition based on position 526, the most variable in this region, results in a >354-fold difference in MIC. 
When considered as a classification problem (susceptible or resistant), a cross-validated tree-based model correctly classified most (0.884) of the observations and was very similar to the regression model. Random forest analysis of the MIC data as a continuous variable, a regression problem, produced a model that explained 0.861 of the variance. The random forest analysis of the MIC data as discrete classes produced a model that correctly classified 0.942 of the observations with sensitivity of 0.958 and specificity of 0.885.</p> <p>Conclusions</p> <p>Highly accurate regression and classification models of rifampin resistance can be made based on this short sequence region. Models may be better with improved (and consistent) measurements of MIC and more sequence data.</p>
url http://www.biomedcentral.com/1471-2105/5/137