Enhancing the Lasso Approach for Developing a Survival Prediction Model Based on Gene Expression Data

In the past decade, researchers in oncology have sought to develop survival prediction models using gene expression data. The least absolute shrinkage and selection operator (lasso) has been widely used to select genes that are truly correlated with a patient’s survival. The lasso selects genes for pred...

Bibliographic Details
Main Authors: Shuhei Kaneko, Akihiro Hirakawa, Chikuma Hamada
Format: Article
Language: English
Published: Hindawi Limited 2015-01-01
Series: Computational and Mathematical Methods in Medicine
Online Access: http://dx.doi.org/10.1155/2015/259474
Author Affiliations:
Shuhei Kaneko: Department of Management Science, Graduate School of Engineering, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan
Akihiro Hirakawa: Biostatistics and Bioinformatics Section, Center for Advanced Medicine and Clinical Research, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya 466-8560, Japan
Chikuma Hamada: Department of Management Science, Graduate School of Engineering, Tokyo University of Science, 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan
ISSN: 1748-670X, 1748-6718
Description: In the past decade, researchers in oncology have sought to develop survival prediction models using gene expression data. The least absolute shrinkage and selection operator (lasso) has been widely used to select genes that are truly correlated with a patient’s survival. The lasso selects genes for prediction by shrinking a large number of coefficients of the candidate genes towards zero based on a tuning parameter that is often determined by cross-validation (CV). However, this method can pass over (or fail to identify) true positive genes (i.e., it produces false negatives) in certain instances, because the lasso tends to favor the development of a simple prediction model. Here, we attempt to monitor the identification of false negatives by developing a method, which assumes a mixture distribution for the lasso estimates, for estimating the number of true positive (TP) genes for a series of values of the tuning parameter. We performed a simulation study to examine the method's precision in estimating the number of TP genes. Additionally, we applied our method to a real gene expression dataset and found that it was able to identify genes correlated with survival that a CV method was unable to detect.
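
The shrinkage mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses an ordinary squared-error lasso rather than a survival (Cox) model, fits it by plain coordinate descent, and runs on simulated data. All function names, the simulated design, and the grid of tuning parameter values are assumptions made for illustration only. It shows how the number of selected (nonzero) coefficients changes over a series of tuning parameter values, which is why a large value can drop genes that are truly associated with the outcome.

```python
import random


def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Illustrative squared-error lasso (an assumption; the article concerns survival models).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes its own contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_b
    return beta


def count_selected(beta, tol=1e-10):
    """Number of coefficients the lasso left nonzero (the 'selected genes')."""
    return sum(1 for b in beta if abs(b) > tol)


def simulate(n=60, p=20, n_true=3, seed=0):
    """Hypothetical data: the first n_true 'genes' truly affect the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    true_beta = [2.0] * n_true + [0.0] * (p - n_true)
    y = [sum(X[i][j] * true_beta[j] for j in range(p)) + rng.gauss(0, 1) for i in range(n)]
    return X, y, true_beta


X, y, true_beta = simulate()
for lam in [0.01, 0.1, 0.5, 1.0, 5.0]:  # assumed grid of tuning parameter values
    beta = lasso_coordinate_descent(X, y, lam)
    kept_true = sum(1 for b, t in zip(beta, true_beta) if t != 0.0 and abs(b) > 1e-10)
    print(f"lambda={lam}: selected={count_selected(beta)}, true genes kept={kept_true}")
```

In practice the tuning parameter would be chosen by cross-validation on a held-out criterion; the loop above only tabulates the selection counts along the grid so that the trade-off between model simplicity and missed true genes is visible.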