Summary: | Abstract
Background Common variants have explained less heritability than expected for complex diseases, which has led to interest in less-common variants and in more powerful approaches to the analysis of whole-genome scans. Because their low frequency limits statistical power, less-common variants are best analyzed using SNP-set methods such as gene-set or pathway-based analyses. However, there is as yet no clear consensus on how to narrow down to potential risk variants following set-based analyses. We used a stepwise, telescoping approach to analyze common- and rare-variant data from the Illumina Metabochip array to assess genomic association with colorectal cancer (CRC) in the Japanese sub-population of the Multiethnic Cohort (676 cases, 7180 controls). We started with pathway analysis of SNPs in genes and pathways with known mechanistic roles in CRC, then focused on genes within the pathways that showed association with CRC, and finally assessed individual SNPs within the genes that showed association. Pathway SNPs downloaded from the dbSNP database were cross-matched with Metabochip SNPs and analyzed using the logistic kernel machine regression approach (the logistic SNP-set kernel-machine association test, or sequence kernel association test; SKAT) and related methods.
Results The TGF-β and WNT pathways were associated with all CRC, and the WNT pathway was associated with colon cancer. The individual genes showing the strongest associations were TGFBR2 in the TGF-β pathway and SMAD7, which is involved in both the TGF-β and WNT pathways. As partial validation of our approach, a known CRC risk variant in SMAD7 (rs11874392) was associated with CRC risk in our data. We also detected two novel candidate CRC risk variants (rs13075948 and rs17025857) in TGFBR2, a gene known to be associated with CRC risk.
Conclusions A stepwise, telescoping approach identified potentially novel risk variants associated with colorectal cancer, so it may be a useful method for following up on the results of set-based SNP analyses. Further work is required to assess the statistical characteristics of the approach, and additional applications should help clarify its utility.
|
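The stepwise narrowing described in the abstract (pathway, then gene, then individual SNP) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `cross_match` and `telescope`, the nested `hierarchy` layout, and the `alpha` threshold of 0.05 are all assumptions made for illustration. `set_test` is a placeholder for a set-based test such as SKAT, and `snp_test` for a single-SNP association test; both are supplied by the caller and return a p-value.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical layout: pathway name -> gene name -> list of SNP identifiers.
Hierarchy = Dict[str, Dict[str, List[str]]]


def cross_match(pathway_snps: Sequence[str], array_snps: Sequence[str]) -> List[str]:
    """Keep only the pathway SNPs that are also present on the genotyping array."""
    on_array = set(array_snps)
    return [snp for snp in pathway_snps if snp in on_array]


def telescope(
    hierarchy: Hierarchy,
    set_test: Callable[[List[str]], float],  # placeholder for a SNP-set test (e.g. SKAT)
    snp_test: Callable[[str], float],        # placeholder for a single-SNP test
    alpha: float = 0.05,                     # assumed threshold, not taken from the source
) -> List[Tuple[str, str, str]]:
    """Return (pathway, gene, SNP) triples that pass at all three levels."""
    hits: List[Tuple[str, str, str]] = []
    for pathway, genes in hierarchy.items():
        # Level 1: test all SNPs in the pathway as one set.
        pathway_snps = [snp for snps in genes.values() for snp in snps]
        if set_test(pathway_snps) >= alpha:
            continue
        for gene, snps in genes.items():
            # Level 2: test each gene within an associated pathway.
            if set_test(snps) >= alpha:
                continue
            # Level 3: test individual SNPs within an associated gene.
            for snp in snps:
                if snp_test(snp) < alpha:
                    hits.append((pathway, gene, snp))
    return hits
```

The sketch applies the same threshold at every level and makes no correction for multiple testing or for the sequential selection, which is part of the statistical characterization the conclusions leave open.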