Twiner: correlation-based regularization for identifying common cancer gene signatures

Abstract. Background: Breast and prostate cancers are typical examples of hormone-dependent cancers, showing remarkable similarities at the hormone-related signaling pathways level, and exhibiting a high tropism to bone. While the identification of genes playing a specific role in each cancer type bri...

Bibliographic Details
Main Authors: Marta B. Lopes, Sandra Casimiro, Susana Vinga
Format: Article
Language: English
Published: BMC 2019-06-01
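Series: BMC Bioinformatics
Subjects: Gene network; Sparse logistic regression; Breast invasive carcinoma; Triple-negative breast cancer; Prostate adenocarcinoma
Online Access: http://link.springer.com/article/10.1186/s12859-019-2937-8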
id doaj-9132430b25c24db58bf75556cab7c4ff
record_format Article
spelling doaj-9132430b25c24db58bf75556cab7c4ff
2020-11-25T03:12:43Z
eng
BMC
BMC Bioinformatics
1471-2105
2019-06-01
vol. 20, no. 1, pp. 1-15
10.1186/s12859-019-2937-8
Twiner: correlation-based regularization for identifying common cancer gene signatures
Marta B. Lopes (Instituto de Telecomunicações, Instituto Superior Técnico, Universidade de Lisboa)
Sandra Casimiro (Luis Costa Lab, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa)
Susana Vinga (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa)
collection DOAJ
language English
format Article
sources DOAJ
author Marta B. Lopes
Sandra Casimiro
Susana Vinga
title Twiner: correlation-based regularization for identifying common cancer gene signatures
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-06-01
description Abstract. Background: Breast and prostate cancers are typical examples of hormone-dependent cancers, showing remarkable similarities at the level of hormone-related signaling pathways and exhibiting a high tropism to bone. While identifying genes that play a specific role in each cancer type brings invaluable insights for gene therapy research by targeting disease-specific cell functions not accounted for so far, identifying a gene signature common to breast and prostate cancers could reveal new targets to tackle shared hormone-dependent disease features, such as bone relapse. This would potentially allow the development of new targeted therapies directed at genes regulating both cancer types, with a consequent positive impact on cancer management and health economics. Results: We address the challenge of extracting gene signatures from transcriptomic data of prostate adenocarcinoma (PRAD) and breast invasive carcinoma (BRCA) samples, particularly estrogen positive (ER+) and androgen positive (AR+) triple-negative breast cancer (TNBC), using sparse logistic regression. We investigate the introduction of gene network information, based on the distances between the BRCA and PRAD correlation matrices, through the proposed twin networks recovery (twiner) penalty, a strategy that ensures gene features similarly correlated in the two diseases are less penalized during feature selection. Conclusions: Our analysis led to the identification of genes that show a similar correlation pattern in BRCA and PRAD transcriptomic data, are selected as key players in the classification of breast and prostate samples into ER+ BRCA/AR+ TNBC/PRAD tumor and normal tissues, and are also associated with survival time distributions. The results obtained are supported by the literature and are expected to unveil the similarities between the diseases, disclose common disease biomarkers, and help in the definition of new strategies for more effective therapies.
topic Gene network
Sparse logistic regression
Breast invasive carcinoma
Triple-negative breast cancer
Prostate adenocarcinoma
url http://link.springer.com/article/10.1186/s12859-019-2937-8
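
Note on the twiner penalty: the description above says that genes with similar correlation patterns in the two diseases are penalized less, with weights derived from distances between the BRCA and PRAD correlation matrices. A minimal Python sketch of that idea follows. It rests on assumptions not stated in this record: the distance is taken as the angle between matching columns of the two correlation matrices, weights are scaled to [0, 1], and the penalty is shown as a plain weighted L1 term. The names `twiner_weights` and `weighted_l1` are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def twiner_weights(x_a, x_b):
    """Per-gene penalty weights from two samples-by-genes expression matrices.

    Assumption: the per-gene distance is the angle between column j of the
    correlation matrix of x_a and column j of the correlation matrix of x_b.
    """
    corr_a = np.corrcoef(x_a, rowvar=False)  # genes-by-genes, dataset A
    corr_b = np.corrcoef(x_b, rowvar=False)  # genes-by-genes, dataset B

    # Cosine similarity between matching columns of the two matrices.
    num = np.sum(corr_a * corr_b, axis=0)
    den = np.linalg.norm(corr_a, axis=0) * np.linalg.norm(corr_b, axis=0)
    angle = np.arccos(np.clip(num / den, -1.0, 1.0))

    # Scale so the most dissimilar gene gets weight 1 (largest penalty).
    peak = angle.max()
    return angle / peak if peak > 0 else np.zeros_like(angle)


def weighted_l1(beta, weights, lam):
    """Weighted L1 penalty: genes with small weights are penalized less."""
    return lam * np.sum(weights * np.abs(beta))


# Illustrative use on synthetic data (not real BRCA/PRAD expression values).
rng = np.random.default_rng(0)
x_a = rng.normal(size=(50, 6))  # 50 samples, 6 genes
x_b = rng.normal(size=(40, 6))  # 40 samples, same 6 genes
w = twiner_weights(x_a, x_b)
penalty = weighted_l1(np.ones(6), w, lam=0.1)
```

In a sparse logistic regression these weights would multiply each coefficient's penalty term, so a gene whose correlation profile is nearly the same in both datasets is more likely to survive feature selection.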