An Efficient Approach to Screening Epigenome-Wide Data
Screening cytosine-phosphate-guanine dinucleotide (CpG) DNA methylation sites in association with some covariate(s) is desired due to high dimensionality. We incorporate surrogate variable analyses (SVAs) into (ordinary or robust) linear regressions and utilize training and testing samples for neste...
Main Authors: | Meredith A. Ray, Xin Tong, Gabrielle A. Lockett, Hongmei Zhang, Wilfried J. J. Karmaus |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2016-01-01 |
Series: | BioMed Research International |
Online Access: | http://dx.doi.org/10.1155/2016/2615348 |
id |
doaj-639b9fd25ede45309d73589824554feb |
---|---|
record_format |
Article |
spelling |
doaj-639b9fd25ede45309d73589824554feb; 2020-11-25T00:29:25Z; eng; Hindawi Limited; BioMed Research International; 2314-6133; 2314-6141; 2016-01-01; 2016; 10.1155/2016/2615348; 2615348; An Efficient Approach to Screening Epigenome-Wide Data; Meredith A. Ray [0], Xin Tong [1], Gabrielle A. Lockett [2], Hongmei Zhang [3], Wilfried J. J. Karmaus [4]; Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Zach Curlin Street, Memphis, TN 38152, USA; Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 915 Green Street, Columbia, SC 29208, USA; Human Development and Health, Faculty of Medicine, University of Southampton, 801 South Academic Block Tremona Road, Southampton SO16 6YD, UK; Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Zach Curlin Street, Memphis, TN 38152, USA; Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Zach Curlin Street, Memphis, TN 38152, USA; Screening cytosine-phosphate-guanine dinucleotide (CpG) DNA methylation sites in association with some covariate(s) is desired due to high dimensionality. We incorporate surrogate variable analyses (SVAs) into (ordinary or robust) linear regressions and utilize training and testing samples for nested validation to screen CpG sites. SVA accounts for variation in methylation not explained by the specified covariate(s) and adjusts for confounding effects. To make it easier for users, this screening method is built into a user-friendly R package, ttScreening, with efficient algorithms implemented. Various simulations were performed to examine the robustness and sensitivity of the method compared with the classical approaches for controlling multiple testing: the false discovery rate-based (FDR-based) and the Bonferroni-based methods. The proposed approach in general performs better and has the potential to control both type I and type II errors. We applied ttScreening to 383,998 CpG sites in association with maternal smoking, one of the leading factors for cancer risk.; http://dx.doi.org/10.1155/2016/2615348 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Meredith A. Ray Xin Tong Gabrielle A. Lockett Hongmei Zhang Wilfried J. J. Karmaus |
spellingShingle |
Meredith A. Ray Xin Tong Gabrielle A. Lockett Hongmei Zhang Wilfried J. J. Karmaus An Efficient Approach to Screening Epigenome-Wide Data BioMed Research International |
author_facet |
Meredith A. Ray Xin Tong Gabrielle A. Lockett Hongmei Zhang Wilfried J. J. Karmaus |
author_sort |
Meredith A. Ray |
title |
An Efficient Approach to Screening Epigenome-Wide Data |
title_short |
An Efficient Approach to Screening Epigenome-Wide Data |
title_full |
An Efficient Approach to Screening Epigenome-Wide Data |
title_fullStr |
An Efficient Approach to Screening Epigenome-Wide Data |
title_full_unstemmed |
An Efficient Approach to Screening Epigenome-Wide Data |
title_sort |
efficient approach to screening epigenome-wide data |
publisher |
Hindawi Limited |
series |
BioMed Research International |
issn |
2314-6133 2314-6141 |
publishDate |
2016-01-01 |
description |
Screening cytosine-phosphate-guanine dinucleotide (CpG) DNA methylation sites in association with some covariate(s) is desired due to high dimensionality. We incorporate surrogate variable analyses (SVAs) into (ordinary or robust) linear regressions and utilize training and testing samples for nested validation to screen CpG sites. SVA accounts for variation in methylation not explained by the specified covariate(s) and adjusts for confounding effects. To make it easier for users, this screening method is built into a user-friendly R package, ttScreening, with efficient algorithms implemented. Various simulations were performed to examine the robustness and sensitivity of the method compared with the classical approaches for controlling multiple testing: the false discovery rate-based (FDR-based) and the Bonferroni-based methods. The proposed approach in general performs better and has the potential to control both type I and type II errors. We applied ttScreening to 383,998 CpG sites in association with maternal smoking, one of the leading factors for cancer risk. |
url |
http://dx.doi.org/10.1155/2016/2615348 |
work_keys_str_mv |
AT mereditharay anefficientapproachtoscreeningepigenomewidedata AT xintong anefficientapproachtoscreeningepigenomewidedata AT gabriellealockett anefficientapproachtoscreeningepigenomewidedata AT hongmeizhang anefficientapproachtoscreeningepigenomewidedata AT wilfriedjjkarmaus anefficientapproachtoscreeningepigenomewidedata AT mereditharay efficientapproachtoscreeningepigenomewidedata AT xintong efficientapproachtoscreeningepigenomewidedata AT gabriellealockett efficientapproachtoscreeningepigenomewidedata AT hongmeizhang efficientapproachtoscreeningepigenomewidedata AT wilfriedjjkarmaus efficientapproachtoscreeningepigenomewidedata |
_version_ |
1725331414862266368 |
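
The abstract in this record compares the proposed screening against two classical multiple-testing corrections, the Bonferroni-based and FDR-based methods. As a point of reference only, the sketch below shows how those two baselines are commonly applied to a list of per-site p-values. It assumes the Benjamini-Hochberg step-up procedure as the FDR method (the abstract does not name a specific one), the function names and the example p-values are hypothetical, and none of this reproduces the ttScreening package or its training/testing scheme.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis when its p-value is at or below alpha / m,
    where m is the number of tests (here, CpG sites)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Finds the largest rank k such that the k-th smallest p-value is
    at or below k * alpha / m, then rejects the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_rank = rank
    reject = [False] * m
    for idx in order[:max_rank]:
        reject[idx] = True
    return reject


# Hypothetical p-values for six sites, for illustration only.
p_values = [0.001, 0.008, 0.02, 0.03, 0.20, 0.60]
print("Bonferroni rejections:", sum(bonferroni_reject(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg_reject(p_values)))
```

On these illustrative values the Bonferroni threshold is stricter than the Benjamini-Hochberg one, which is the usual trade-off between the two: fewer false positives at the cost of more missed associations.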