Iterative Variable Selection for High-Dimensional Data: Prediction of Pathological Response in Triple-Negative Breast Cancer
Over the last decade, regularized regression methods have offered alternatives for performing multi-marker analysis and feature selection in a whole genome context. The process of defining a list of genes that will characterize an expression profile remains unclear. It currently relies upon advanced...
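Main Authors: | Juan C. Laria, M. Carmen Aguilera-Morillo, Enrique Álvarez, Rosa E. Lillo, Sara López-Taruella, María del Monte-Millán, Antonio C. Picornell, Miguel Martín, Juan Romo |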
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-01-01 |
Series: | Mathematics |
Subjects: | variable selection; high dimension; regularization; classification; sparse-group lasso |
Online Access: | https://www.mdpi.com/2227-7390/9/3/222 |
id |
doaj-ffd2cb097a2a4861bef04e28e3c4f182 |
---|---|
record_format |
Article |
spelling |
doaj-ffd2cb097a2a4861bef04e28e3c4f182; 2021-01-24T00:02:05Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2021-01-01; volume 9, article 222; DOI 10.3390/math9030222
Authors: Juan C. Laria (1); M. Carmen Aguilera-Morillo (1); Enrique Álvarez (2); Rosa E. Lillo (1); Sara López-Taruella (2); María del Monte-Millán (2); Antonio C. Picornell (2); Miguel Martín (2); Juan Romo (1)
Affiliations: (1) UC3M-BS Santander Big Data Institute, 28903 Getafe, Spain; (2) Department of Medical Oncology, Hospital General Universitario Gregorio Marañón, Instituto de Investigación Sanitaria Gregorio Marañón, 28007 Madrid, Spain |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Juan C. Laria; M. Carmen Aguilera-Morillo; Enrique Álvarez; Rosa E. Lillo; Sara López-Taruella; María del Monte-Millán; Antonio C. Picornell; Miguel Martín; Juan Romo |
title |
Iterative Variable Selection for High-Dimensional Data: Prediction of Pathological Response in Triple-Negative Breast Cancer |
publisher |
MDPI AG |
series |
Mathematics |
issn |
2227-7390 |
publishDate |
2021-01-01 |
description |
Over the last decade, regularized regression methods have offered alternatives for performing multi-marker analysis and feature selection in a whole genome context. The process of defining a list of genes that will characterize an expression profile remains unclear. It currently relies upon advanced statistics and can use an agnostic point of view or include some a priori knowledge, but overfitting remains a problem. This paper introduces a methodology to deal with the variable selection and model estimation problems in the high-dimensional set-up, which can be particularly useful in the whole genome context. Results are validated using simulated data and a real dataset from a triple-negative breast cancer study. |
topic |
variable selection; high dimension; regularization; classification; sparse-group lasso |
url |
https://www.mdpi.com/2227-7390/9/3/222 |