Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data

Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several m...

Bibliographic Details
Main Authors: Keita Mori, Tomonori Oura, Hisashi Noma, Shigeyuki Matsui
Format: Article
Language: English
Published: Hindawi Limited 2013-01-01
Series: Computational and Mathematical Methods in Medicine
Online Access: http://dx.doi.org/10.1155/2013/693901
id doaj-5756f04e664e4448935b6962c17e6265
record_format Article
spelling doaj-5756f04e664e4448935b6962c17e6265
2020-11-24T22:49:02Z
eng
Hindawi Limited
Computational and Mathematical Methods in Medicine
1748-670X
1748-6718
2013-01-01
2013
10.1155/2013/693901
693901
Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data
Keita Mori [0]
Tomonori Oura [1]
Hisashi Noma [2]
Shigeyuki Matsui [3]
[0] Department of Statistical Science, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
[1] Asia-Pacific Statistical Sciences, Lilly Research Laboratories Development Center of Excellence Asia Pacific, Eli Lilly Japan K. K., Sannomiya Plaza Building, 7-1-5 Isogamidori, Chuo-ku, Kobe, Hyogo 651-0086, Japan
[2] Department of Data Science, The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
[3] Department of Statistical Science, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
collection DOAJ
language English
format Article
sources DOAJ
author Keita Mori
Tomonori Oura
Hisashi Noma
Shigeyuki Matsui
title Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data
publisher Hindawi Limited
series Computational and Mathematical Methods in Medicine
issn 1748-670X
1748-6718
publishDate 2013-01-01
description Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several methods have been proposed in the context of multiple testing. Such cancer outlier analyses will generally suffer from a serious lack of power, compared with the standard multiple testing setting where common activation of genes across all cancer samples is supposed. In this paper, we consider information sharing across genes and cancer samples, via a parametric normal mixture modeling of gene expression levels of cancer samples across genes after a standardization using the reference, normal sample data. A gene-based statistic for gene selection is developed on the basis of a posterior probability of cancer outlier for each cancer sample. Some efficiency improvement by using our method was demonstrated, even under settings with misspecified, heavy-tailed t-distributions. An application to a real dataset from hematologic malignancies is provided.
url http://dx.doi.org/10.1155/2013/693901