Detecting key structural features within highly recombined genes.

Many microorganisms exhibit high levels of intragenic recombination following horizontal gene transfer events. Furthermore, many microbial genes are subject to strong diversifying selection as part of the pathogenic process. A multiple sequence alignment is an essential starting point for many of th...

Bibliographic Details
Main Authors: John E Wertz, Karen F McGregor, Debra E Bessen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2007-01-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC1782043?pdf=render
id doaj-f705880e4a4d48629fdb067fd88fa9fe
record_format Article
spelling doaj-f705880e4a4d48629fdb067fd88fa9fe 2020-11-24T21:51:04Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2007-01-01 3 1 e14 10.1371/journal.pcbi.0030014 Detecting key structural features within highly recombined genes. John E Wertz Karen F McGregor Debra E Bessen http://europepmc.org/articles/PMC1782043?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author John E Wertz
Karen F McGregor
Debra E Bessen
author_sort John E Wertz
title Detecting key structural features within highly recombined genes.
title_sort detecting key structural features within highly recombined genes.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2007-01-01
description Many microorganisms exhibit high levels of intragenic recombination following horizontal gene transfer events. Furthermore, many microbial genes are subject to strong diversifying selection as part of the pathogenic process. A multiple sequence alignment is an essential starting point for many of the tools that provide fundamental insights on gene structure and evolution, such as phylogenetics; however, an accurate alignment is not always possible to attain. In this study, a new analytic approach was developed in order to better quantify the genetic organization of highly diversified genes whose alleles do not align. This BLAST-based method, denoted BLAST Miner, employs an iterative process that places short segments of highly similar sequence into discrete datasets that are designated "modules." The relative positions of modules along the length of the genes, and their frequency of occurrence, are used to identify sequence duplications, insertions, and rearrangements. Partial alleles of sof from Streptococcus pyogenes, encoding a surface protein under host immune selection, were analyzed for module content. High-frequency Modules 6 and 13 were identified and examined in depth. Nucleotide sequences corresponding to both modules contain numerous duplications and inverted repeats, whereby many codons form palindromic pairs. Combined with evidence for a strong codon usage bias, data suggest that Module 6 and 13 sequences are under selection to preserve their nucleic acid secondary structure. The concentration of overlapping tandem and inverted repeats within a small region of DNA is highly suggestive of a mechanistic role for Module 6 and 13 sequences in promoting aberrant recombination. Analysis of pbp2X alleles from Streptococcus pneumoniae, encoding cell wall enzymes that confer antibiotic resistance, supports the broad applicability of this tool in deciphering the genetic organization of highly recombined genes. BLAST Miner shares with phylogenetics the important predictive quality that leads to the generation of testable hypotheses based on sequence data.
url http://europepmc.org/articles/PMC1782043?pdf=render
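
The description above mentions inverted repeats in which many codons form palindromic pairs. The record does not give the authors' exact definition or the BLAST Miner implementation, so the following is only a minimal sketch under one assumed reading: two in-frame codons count as a palindromic pair when one is the reverse complement of the other (for example GAA and TTC). The function names `reverse_complement` and `palindromic_codon_pairs` are hypothetical and are not taken from the article.

```python
# Illustrative sketch only: an assumed definition of "palindromic codon pairs",
# not the method used by BLAST Miner or by the article's authors.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string (A, C, G, T)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def palindromic_codon_pairs(seq: str, frame: int = 0):
    """List pairs of in-frame codons where the later codon is the
    reverse complement of the earlier one.

    Each codon is reported as (start_offset, codon_string).
    """
    seq = seq.upper()
    # Split the sequence into complete codons starting at the given frame.
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    pairs = []
    for a in range(len(codons)):
        target = reverse_complement(codons[a][1])
        for b in range(a + 1, len(codons)):
            if codons[b][1] == target:
                pairs.append((codons[a], codons[b]))
    return pairs


if __name__ == "__main__":
    # GAATTC reads the same on both strands; its two codons pair up.
    print(palindromic_codon_pairs("GAATTC"))
```

The quadratic pairwise scan is chosen for clarity; it is adequate for the short module-length segments the description refers to, but a dictionary keyed by codon would scale better on long sequences.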