An Efficient Approach to Screening Epigenome-Wide Data

Screening cytosine-phosphate-guanine dinucleotide (CpG) DNA methylation sites in association with some covariate(s) is desired due to high dimensionality. We incorporate surrogate variable analyses (SVAs) into (ordinary or robust) linear regressions and utilize training and testing samples for neste...

Bibliographic Details
Main Authors: Meredith A. Ray, Xin Tong, Gabrielle A. Lockett, Hongmei Zhang, Wilfried J. J. Karmaus
Format: Article
Language: English
Published: Hindawi Limited 2016-01-01
Series: BioMed Research International
Online Access: http://dx.doi.org/10.1155/2016/2615348
id doaj-639b9fd25ede45309d73589824554feb
record_format Article
spelling doaj-639b9fd25ede45309d73589824554feb
2020-11-25T00:29:25Z
eng
Hindawi Limited
BioMed Research International
2314-6133
2314-6141
2016-01-01
2016
10.1155/2016/2615348
2615348
An Efficient Approach to Screening Epigenome-Wide Data
Meredith A. Ray (0)
Xin Tong (1)
Gabrielle A. Lockett (2)
Hongmei Zhang (3)
Wilfried J. J. Karmaus (4)
Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Zach Curlin Street, Memphis, TN 38152, USA
Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 915 Green Street, Columbia, SC 29208, USA
Human Development and Health, Faculty of Medicine, University of Southampton, 801 South Academic Block Tremona Road, Southampton SO16 6YD, UK
Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Zach Curlin Street, Memphis, TN 38152, USA
Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Zach Curlin Street, Memphis, TN 38152, USA
http://dx.doi.org/10.1155/2016/2615348
collection DOAJ
language English
format Article
sources DOAJ
author Meredith A. Ray
Xin Tong
Gabrielle A. Lockett
Hongmei Zhang
Wilfried J. J. Karmaus
title An Efficient Approach to Screening Epigenome-Wide Data
publisher Hindawi Limited
series BioMed Research International
issn 2314-6133
2314-6141
publishDate 2016-01-01
description Screening cytosine-phosphate-guanine dinucleotide (CpG) DNA methylation sites in association with some covariate(s) is desired due to high dimensionality. We incorporate surrogate variable analyses (SVAs) into (ordinary or robust) linear regressions and utilize training and testing samples for nested validation to screen CpG sites. SVA accounts for variation in the methylation that is not explained by the specified covariate(s) and adjusts for confounding effects. To make the method easier for users, it is built into a user-friendly R package, ttScreening, with efficient algorithms implemented. Various simulations were run to examine the robustness and sensitivity of the method compared with the classical approaches to controlling for multiple testing: the false discovery rate-based (FDR-based) and the Bonferroni-based methods. The proposed approach in general performs better and has the potential to control both type I and type II errors. We applied ttScreening to 383,998 CpG sites in association with maternal smoking, one of the leading factors for cancer risk.
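The training-and-testing idea in the description can be illustrated with a minimal Python sketch. It is not the ttScreening implementation or its API: the function names, the number of iterations, the split fraction, both significance levels, and the 50% selection cutoff are all assumptions chosen for illustration. The surrogate variable step is omitted entirely, the regression is ordinary least squares on a single covariate, and p-values use a normal approximation to the t distribution. The data are simulated, with only the first three sites given a true effect.

```python
import math
import random


def slope_p_value(x, y):
    """Two-sided p-value for the slope of y ~ x (normal approximation)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    rss = sum((b - my - beta * (a - mx)) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2))


def tt_screen(x, sites, iterations=50, train_frac=2 / 3,
              alpha_train=0.05, alpha_test=0.05, cutoff=0.5, seed=0):
    """Return indices of sites that pass training AND testing often enough.

    All default values here are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(x)
    n_train = int(n * train_frac)
    passes = [0] * len(sites)
    for _ in range(iterations):
        idx = list(range(n))
        rng.shuffle(idx)  # new random split on every iteration
        train, test = idx[:n_train], idx[n_train:]
        x_tr = [x[i] for i in train]
        x_te = [x[i] for i in test]
        for j, y in enumerate(sites):
            # Step 1: the site must look associated in the training sample.
            if slope_p_value(x_tr, [y[i] for i in train]) >= alpha_train:
                continue
            # Step 2: the association must be confirmed in the testing sample.
            if slope_p_value(x_te, [y[i] for i in test]) < alpha_test:
                passes[j] += 1
    return [j for j, c in enumerate(passes) if c / iterations >= cutoff]


# Simulated data: 90 subjects, 40 sites, the first 3 truly associated.
data_rng = random.Random(1)
n, p = 90, 40
x = [data_rng.gauss(0, 1) for _ in range(n)]
sites = []
for j in range(p):
    effect = 1.5 if j < 3 else 0.0
    sites.append([effect * xi + data_rng.gauss(0, 1) for xi in x])

selected = tt_screen(x, sites)
print(selected)
```

Requiring a site to pass in both subsamples across repeated random splits replaces a single multiplicity-corrected threshold, which is the contrast with FDR-based and Bonferroni-based screening that the description draws.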
url http://dx.doi.org/10.1155/2016/2615348