Summary: | Although immunoglobulins are critical for immune responses and human survival, the diversity of the immunoglobulin heavy chain gene (IGH) is poorly understood and has been characterized mostly by serological methods. Moreover, this genomic region is not well covered in genomic databases and genome-wide association studies because of particularities that make its analysis technically difficult. As a result, the IGH gene has never been systematically sequenced across populations. Here, we present a comprehensive characterization of the diversity of the IGHG1, IGHG2, and IGHG3 gene segments, which encode the constant regions of the most abundant circulating immunoglobulins: IgG1, IgG2, and IgG3, respectively. We used Sanger sequencing to analyze 357 individuals from seven Brazilian population samples: five Amerindian, one Japanese-descendant, and one Euro-descendant. We discovered 28 novel IGHG alleles and found evidence that some of them may have originated through gene conversion between common alleles of different gene segments. The rate of synonymous substitutions was significantly higher than the rate of non-synonymous substitutions for IGHG1 and IGHG2 (p = 0.01 and 0.03, respectively), consistent with purifying selection. Fay and Wu's test yielded significantly negative values for most populations (p < 0.001), suggesting that positive selection at a nearby site, possibly in the segments that encode the Ig variable regions, may be shaping IGHG variation through hitchhiking of linked variants. This study shows that variation in the IGH gene is largely underestimated. Exploring its nucleotide diversity in populations may therefore provide valuable information for understanding its evolution and its impact on disease, as well as for vaccine research.
|