A Brain-Inspired Hyperdimensional Computing Approach for Classifying Massive DNA Methylation Data of Cancer
Recent advances in cancer genomics have put the spotlight on DNA methylation, an epigenetic modification that regulates the functioning of the genome and whose alterations play an important role in tumorigenesis and tumor suppression. Because of the high dimensionality and the enormous amou...
Main Authors: | Fabio Cumbo, Eleonora Cappelli, Emanuel Weitschek |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-09-01 |
Series: | Algorithms |
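Subjects: | algorithms in biology, bioinformatics, machine learning, classification, hyperdimensional computing, cancer |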
Online Access: | https://www.mdpi.com/1999-4893/13/9/233 |
id |
doaj-cabeef4dcd5d440089ebc8193066c0ad |
---|---|
record_format |
Article |
spelling |
doaj-cabeef4dcd5d440089ebc8193066c0ad; 2020-11-25T03:37:38Z; eng; MDPI AG; Algorithms; 1999-4893; 2020-09-01; 13; 233; 233; 10.3390/a13090233; A Brain-Inspired Hyperdimensional Computing Approach for Classifying Massive DNA Methylation Data of Cancer; Fabio Cumbo [0]; Eleonora Cappelli [1]; Emanuel Weitschek [2]; Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Via Sommarive 9, 38123 Povo Trento, Italy; Department of Engineering, University of Roma Tre, Via della Vasca Navale 79/81, 00146 Rome, Italy; Department of Engineering, Uninettuno University, Corso Vittorio Emanuele II 39, 00186 Rome, Italy; Recent advances in cancer genomics have put the spotlight on DNA methylation, an epigenetic modification that regulates the functioning of the genome and whose alterations play an important role in tumorigenesis and tumor suppression. Because of the high dimensionality and the enormous amount of genomic data produced by the latest advances in Next Generation Sequencing, it is very challenging to make effective use of DNA methylation data in diagnostic applications, e.g., in distinguishing healthy from diseased samples. Additionally, state-of-the-art techniques are neither fast enough to produce reliable results rapidly nor efficient in managing such massive amounts of data. For this reason, we propose HD-classifier, an in-memory, cognitive-based hyperdimensional (HD) supervised machine learning algorithm for classifying tumor vs. non-tumor samples through the analysis of their DNA methylation data. The approach takes inspiration from how the human brain is able to remember and distinguish simple and complex concepts by adopting hypervectors rather than single numerical values. As in the brain, this allows complex patterns to be encoded, which makes the whole architecture robust to failures and mistakes, even with noisy data.
We design and develop an algorithm and a software tool that perform supervised classification with the HD approach. To demonstrate the validity of our algorithm, we conduct experiments on three DNA methylation datasets of different types of cancer, i.e., Breast Invasive Carcinoma (BRCA), Kidney renal papillary cell carcinoma (KIRP), and Thyroid carcinoma (THCA). We obtain outstanding results in terms of accuracy and computational time with a small amount of computational resources. Furthermore, we validate our approach by comparing it (i) to BIGBIOCL, a software tool based on Random Forest for classifying big omics datasets in distributed computing environments, (ii) to Support Vector Machine (SVM), and (iii) to Decision Tree state-of-the-art classification methods. Finally, we freely release both the datasets and the software on GitHub.; https://www.mdpi.com/1999-4893/13/9/233; algorithms in biology; bioinformatics; machine learning; classification; hyperdimensional computing; cancer |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Fabio Cumbo, Eleonora Cappelli, Emanuel Weitschek |
title |
A Brain-Inspired Hyperdimensional Computing Approach for Classifying Massive DNA Methylation Data of Cancer |
publisher |
MDPI AG |
series |
Algorithms |
issn |
1999-4893 |
publishDate |
2020-09-01 |
description |
Recent advances in cancer genomics have put the spotlight on DNA methylation, an epigenetic modification that regulates the functioning of the genome and whose alterations play an important role in tumorigenesis and tumor suppression. Because of the high dimensionality and the enormous amount of genomic data produced by the latest advances in Next Generation Sequencing, it is very challenging to make effective use of DNA methylation data in diagnostic applications, e.g., in distinguishing healthy from diseased samples. Additionally, state-of-the-art techniques are neither fast enough to produce reliable results rapidly nor efficient in managing such massive amounts of data. For this reason, we propose HD-classifier, an in-memory, cognitive-based hyperdimensional (HD) supervised machine learning algorithm for classifying tumor vs. non-tumor samples through the analysis of their DNA methylation data. The approach takes inspiration from how the human brain is able to remember and distinguish simple and complex concepts by adopting hypervectors rather than single numerical values. As in the brain, this allows complex patterns to be encoded, which makes the whole architecture robust to failures and mistakes, even with noisy data. We design and develop an algorithm and a software tool that perform supervised classification with the HD approach. To demonstrate the validity of our algorithm, we conduct experiments on three DNA methylation datasets of different types of cancer, i.e., Breast Invasive Carcinoma (BRCA), Kidney renal papillary cell carcinoma (KIRP), and Thyroid carcinoma (THCA). We obtain outstanding results in terms of accuracy and computational time with a small amount of computational resources.
Furthermore, we validate our approach by comparing it (i) to BIGBIOCL, a software tool based on Random Forest for classifying big omics datasets in distributed computing environments, (ii) to Support Vector Machine (SVM), and (iii) to Decision Tree state-of-the-art classification methods. Finally, we freely release both the datasets and the software on GitHub. |
topic |
algorithms in biology, bioinformatics, machine learning, classification, hyperdimensional computing, cancer |
url |
https://www.mdpi.com/1999-4893/13/9/233 |
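
The classification idea summarized in the description can be illustrated with a short Python sketch. This is a generic hyperdimensional classification example, not the authors' HD-classifier: the dimensionality `DIM`, the number of quantization levels `LEVELS`, the bind-and-sum encoding, and all function names are assumptions made for illustration, and the inputs are assumed to be methylation beta values in [0, 1].

```python
import numpy as np

DIM = 10_000   # hypervector dimensionality (assumed value)
LEVELS = 10    # quantization levels for values in [0, 1] (assumed value)


def make_item_memory(n_features, rng):
    """Draw one random bipolar (+1/-1) hypervector per feature and per level."""
    feature_hvs = rng.choice([-1, 1], size=(n_features, DIM))
    level_hvs = rng.choice([-1, 1], size=(LEVELS, DIM))
    return feature_hvs, level_hvs


def encode(sample, feature_hvs, level_hvs):
    """Bind each feature hypervector to its level hypervector, then bundle by summing."""
    values = np.asarray(sample, dtype=float)
    # Map each value in [0, 1] to a level index in 0 .. LEVELS - 1.
    levels = np.minimum((values * LEVELS).astype(int), LEVELS - 1)
    bound = feature_hvs * level_hvs[levels]   # element-wise binding
    return bound.sum(axis=0)                  # bundling


def train(samples, labels, feature_hvs, level_hvs):
    """Build one prototype hypervector per class by summing its encoded samples."""
    prototypes = {}
    for sample, label in zip(samples, labels):
        hv = encode(sample, feature_hvs, level_hvs)
        prototypes[label] = prototypes.get(label, 0) + hv
    return prototypes


def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def classify(sample, prototypes, feature_hvs, level_hvs):
    """Return the label whose prototype is most similar to the encoded sample."""
    hv = encode(sample, feature_hvs, level_hvs)
    return max(prototypes, key=lambda label: cosine(hv, prototypes[label]))
```

Because information is spread across thousands of components, a few corrupted values change the encoded hypervector only slightly, which is the robustness property the description refers to. In this sketch the level hypervectors are drawn independently; correlated level hypervectors are a common alternative when neighboring value ranges should remain similar.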