A Method for Generating New Datasets Based on Copy Number for Cancer Analysis

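New data sources for the analysis of cancer data are rapidly supplementing the large number of gene-expression markers used for current methods of analysis. Significant among these new sources are copy number variation (CNV) datasets, which typically enumerate several hundred thousand CNVs distributed throughout the genome. Several useful algorithms allow systems-level analyses of such datasets. However, these rich data sources have not yet been analyzed as deeply as gene-expression data. To address this issue, the extensive toolsets used for analyzing expression data in cancerous and noncancerous tissue (e.g., gene set enrichment analysis and phenotype prediction) could be redirected to extract a great deal of predictive information from CNV data, in particular those derived from cancers. Here we present a software package capable of preprocessing standard Agilent copy number datasets into a form to which essentially all expression analysis tools can be applied. We illustrate the use of this toolset in predicting the survival time of patients with ovarian cancer or glioblastoma multiforme and also provide an analysis of gene- and pathway-level deletions in these two types of cancer.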


Bibliographic Details
Main Authors: Shinuk Kim, Mark Kon, Hyunsik Kang
Format: Article
Language: English
Published: Hindawi Limited 2015-01-01
Series: BioMed Research International
Online Access: http://dx.doi.org/10.1155/2015/467514
id doaj-b4bdcbe7e670438bb42074e0d2cdab79
record_format Article
spelling doaj-b4bdcbe7e670438bb42074e0d2cdab79 2020-11-24T21:06:09Z eng Hindawi Limited BioMed Research International 2314-6133 2314-6141 2015-01-01 2015 10.1155/2015/467514 467514
A Method for Generating New Datasets Based on Copy Number for Cancer Analysis
Shinuk Kim (0), Mark Kon (1), Hyunsik Kang (2)
(0) College of Liberal Arts, Sangmyung University, Cheonan, Chungnam 330-720, Republic of Korea
(1) Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
(2) College of Sport Science, Sungkyunkwan University, Suwon 440-746, Republic of Korea
http://dx.doi.org/10.1155/2015/467514
collection DOAJ
language English
format Article
sources DOAJ
author Shinuk Kim
Mark Kon
Hyunsik Kang
publisher Hindawi Limited
series BioMed Research International
issn 2314-6133
2314-6141
publishDate 2015-01-01
description New data sources for the analysis of cancer data are rapidly supplementing the large number of gene-expression markers used for current methods of analysis. Significant among these new sources are copy number variation (CNV) datasets, which typically enumerate several hundred thousand CNVs distributed throughout the genome. Several useful algorithms allow systems-level analyses of such datasets. However, these rich data sources have not yet been analyzed as deeply as gene-expression data. To address this issue, the extensive toolsets used for analyzing expression data in cancerous and noncancerous tissue (e.g., gene set enrichment analysis and phenotype prediction) could be redirected to extract a great deal of predictive information from CNV data, in particular those derived from cancers. Here we present a software package capable of preprocessing standard Agilent copy number datasets into a form to which essentially all expression analysis tools can be applied. We illustrate the use of this toolset in predicting the survival time of patients with ovarian cancer or glioblastoma multiforme and also provide an analysis of gene- and pathway-level deletions in these two types of cancer.
url http://dx.doi.org/10.1155/2015/467514