Twiner: correlation-based regularization for identifying common cancer gene signatures
Abstract Background Breast and prostate cancers are typical examples of hormone-dependent cancers, showing remarkable similarities at the hormone-related signaling pathways level, and exhibiting a high tropism to bone. While the identification of genes playing a specific role in each cancer type bri...
Main Authors: | Marta B. Lopes, Sandra Casimiro, Susana Vinga |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-06-01 |
Series: | BMC Bioinformatics |
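Subjects: | Gene network, Sparse logistic regression, Breast invasive carcinoma, Triple-negative breast cancer, Prostate adenocarcinoma |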
Online Access: | http://link.springer.com/article/10.1186/s12859-019-2937-8 |
id |
doaj-9132430b25c24db58bf75556cab7c4ff |
---|---|
record_format |
Article |
spelling |
doaj-9132430b25c24db58bf75556cab7c4ff 2020-11-25T03:12:43Z eng BMC BMC Bioinformatics 1471-2105 2019-06-01 201115 10.1186/s12859-019-2937-8 Twiner: correlation-based regularization for identifying common cancer gene signatures Marta B. Lopes 0 Sandra Casimiro 1 Susana Vinga 2 Instituto de Telecomunicações, Instituto Superior Técnico, Universidade de Lisboa; Luis Costa Lab, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa; INESC-ID, Instituto Superior Técnico, Universidade de Lisboa Abstract. Background: Breast and prostate cancers are typical examples of hormone-dependent cancers, showing remarkable similarities at the hormone-related signaling pathways level, and exhibiting a high tropism to bone. While the identification of genes playing a specific role in each cancer type brings invaluable insights for gene therapy research by targeting disease-specific cell functions not accounted for so far, identifying a gene signature common to breast and prostate cancers could unravel new targets to tackle shared hormone-dependent disease features, like bone relapse. This would potentially allow the development of new targeted therapies directed to genes regulating both cancer types, with a consequent positive impact on cancer management and health economics. Results: We address the challenge of extracting gene signatures from transcriptomic data of prostate adenocarcinoma (PRAD) and breast invasive carcinoma (BRCA) samples, particularly estrogen positive (ER+), and androgen positive (AR+) triple-negative breast cancer (TNBC), using sparse logistic regression. The introduction of gene network information based on the distances between BRCA and PRAD correlation matrices is investigated, through the proposed twin networks recovery (twiner) penalty, as a strategy to ensure that gene features similarly correlated in the two diseases are penalized less during the feature selection procedure. Conclusions: Our analysis led to the identification of genes that show a similar correlation pattern in BRCA and PRAD transcriptomic data, and are selected as key players in the classification of breast and prostate samples into ER+ BRCA/AR+ TNBC/PRAD tumor and normal tissues, and are also associated with survival time distributions. The results obtained are supported by the literature and are expected to unveil the similarities between the diseases, disclose common disease biomarkers, and help in the definition of new strategies for more effective therapies. http://link.springer.com/article/10.1186/s12859-019-2937-8 Gene network; Sparse logistic regression; Breast invasive carcinoma; Triple-negative breast cancer; Prostate adenocarcinoma |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Marta B. Lopes Sandra Casimiro Susana Vinga |
spellingShingle |
Marta B. Lopes Sandra Casimiro Susana Vinga Twiner: correlation-based regularization for identifying common cancer gene signatures BMC Bioinformatics Gene network Sparse logistic regression Breast invasive carcinoma Triple-negative breast cancer Prostate adenocarcinoma |
author_facet |
Marta B. Lopes Sandra Casimiro Susana Vinga |
author_sort |
Marta B. Lopes |
title |
Twiner: correlation-based regularization for identifying common cancer gene signatures |
title_short |
Twiner: correlation-based regularization for identifying common cancer gene signatures |
title_full |
Twiner: correlation-based regularization for identifying common cancer gene signatures |
title_fullStr |
Twiner: correlation-based regularization for identifying common cancer gene signatures |
title_full_unstemmed |
Twiner: correlation-based regularization for identifying common cancer gene signatures |
title_sort |
twiner: correlation-based regularization for identifying common cancer gene signatures |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2019-06-01 |
description |
Abstract. Background: Breast and prostate cancers are typical examples of hormone-dependent cancers, showing remarkable similarities at the hormone-related signaling pathways level, and exhibiting a high tropism to bone. While the identification of genes playing a specific role in each cancer type brings invaluable insights for gene therapy research by targeting disease-specific cell functions not accounted for so far, identifying a gene signature common to breast and prostate cancers could unravel new targets to tackle shared hormone-dependent disease features, like bone relapse. This would potentially allow the development of new targeted therapies directed to genes regulating both cancer types, with a consequent positive impact on cancer management and health economics. Results: We address the challenge of extracting gene signatures from transcriptomic data of prostate adenocarcinoma (PRAD) and breast invasive carcinoma (BRCA) samples, particularly estrogen positive (ER+), and androgen positive (AR+) triple-negative breast cancer (TNBC), using sparse logistic regression. The introduction of gene network information based on the distances between BRCA and PRAD correlation matrices is investigated, through the proposed twin networks recovery (twiner) penalty, as a strategy to ensure that gene features similarly correlated in the two diseases are penalized less during the feature selection procedure. Conclusions: Our analysis led to the identification of genes that show a similar correlation pattern in BRCA and PRAD transcriptomic data, and are selected as key players in the classification of breast and prostate samples into ER+ BRCA/AR+ TNBC/PRAD tumor and normal tissues, and are also associated with survival time distributions. The results obtained are supported by the literature and are expected to unveil the similarities between the diseases, disclose common disease biomarkers, and help in the definition of new strategies for more effective therapies. |
topic |
Gene network Sparse logistic regression Breast invasive carcinoma Triple-negative breast cancer Prostate adenocarcinoma |
url |
http://link.springer.com/article/10.1186/s12859-019-2937-8 |
work_keys_str_mv |
AT martablopes twinercorrelationbasedregularizationforidentifyingcommoncancergenesignatures AT sandracasimiro twinercorrelationbasedregularizationforidentifyingcommoncancergenesignatures AT susanavinga twinercorrelationbasedregularizationforidentifyingcommoncancergenesignatures |
_version_ |
1724648953439322112 |
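
The description in this record says the twiner penalty draws on distances between the BRCA and PRAD correlation matrices, so that genes correlated similarly in both diseases are penalized less. A minimal sketch of that idea follows, under stated assumptions: the record does not give the distance measure or the normalization, so the angular distance between matching correlation-matrix columns, the division by the largest distance, the toy data, and the names `twiner_like_weights` and `toy_dataset` are all illustrative choices rather than the authors' implementation.

```python
import numpy as np


def twiner_like_weights(x_a, x_b, eps=1e-12):
    """Per-gene penalty weights in [0, 1] from two samples-by-genes matrices.

    Hypothetical helper: assumes both matrices share the same gene columns
    and that no gene is constant (a constant column has no correlation).
    """
    # Gene-by-gene correlation matrix of each dataset (genes are columns).
    corr_a = np.corrcoef(x_a, rowvar=False)
    corr_b = np.corrcoef(x_b, rowvar=False)

    # Cosine similarity between column j of corr_a and column j of corr_b.
    dot = np.sum(corr_a * corr_b, axis=0)
    norms = np.linalg.norm(corr_a, axis=0) * np.linalg.norm(corr_b, axis=0)
    cos = np.clip(dot / np.maximum(norms, eps), -1.0, 1.0)

    # Assumed distance: the angle between the two correlation profiles.
    dist = np.arccos(cos)

    # Assumed normalization: the most dissimilar gene gets weight 1.
    return dist / max(dist.max(), eps)


rng = np.random.default_rng(0)


def toy_dataset(sign, n=200):
    """Four toy genes: 0 and 1 co-vary; 2 and 3 co-vary with the given sign."""
    g = rng.normal(size=(n, 2))
    noise = 0.3 * rng.normal(size=(n, 2))
    return np.column_stack([
        g[:, 0],
        g[:, 0] + noise[:, 0],
        g[:, 1],
        sign * g[:, 1] + noise[:, 1],
    ])


x_a = toy_dataset(+1.0)  # genes 2 and 3 positively correlated
x_b = toy_dataset(-1.0)  # genes 2 and 3 negatively correlated
weights = twiner_like_weights(x_a, x_b)
print(weights)  # genes 0 and 1 should receive the smaller weights
```

Weights of this kind would typically be supplied as per-feature penalty factors to an L1- or elastic-net-penalized logistic regression, so that genes with low weights are shrunk less during feature selection.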