Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data

Background: The classification of diffuse large B-cell lymphomas into Germinal Center (GCB) and non-GC subtypes defines disease subgroups that differ in both gene expression and prognosis. Given their clinical significance, several classification algorithms have been designed, some...


Bibliographic Details
Main Author: Carlos Bruno Tavares Da Costa
Format: Article
Language: English
Published: Wolters Kluwer Medknow Publications 2018-01-01
Series: Journal of Pathology Informatics
Subjects: Cell of origin; immunohistochemistry; lymphoma; machine learning; prognostic
Online Access: http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2018;volume=9;issue=1;spage=21;epage=21;aulast=Costa
id doaj-5aa7f816ca8c4da7b41e45387fdd6a1a
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Carlos Bruno Tavares Da Costa
title Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data
publisher Wolters Kluwer Medknow Publications
series Journal of Pathology Informatics
issn 2153-3539
publishDate 2018-01-01
description Background: The classification of diffuse large B-cell lymphomas into Germinal Center (GCB) and non-GC subtypes defines disease subgroups that differ in both gene expression and prognosis. Given their clinical significance, several classification algorithms have been designed, some making use of widely available immunohistochemical techniques. Despite their high concordance with gene expression profiles (GEP) and their prognostic value, these algorithms were based on technical and biological assumptions, leaving room to improve classification performance. Methods: To overcome this limitation, a new algorithm was obtained by analyzing a previously published dataset of 475 patients with an automatic classification tree method. Results: The resulting algorithm correctly classifies 91.6% of the cases when compared to GEP, with a receiver operating characteristic (ROC) area under the curve of 0.934. Noteworthy features of this algorithm include the capability to classify GEP-unclassifiable cases and a significant prognostic value, both in terms of overall survival (60 months for non-GC vs not reached for GCB, P = 0.007) and progression-free survival (61.9 months vs not reached, P = 0.017). Conclusion: By using a machine learning classification method that avoids most prior assumptions, the novel algorithm obtained is accurate and maintains relevant features for clinical implementation.
topic Cell of origin
immunohistochemistry
lymphoma
machine learning
prognostic
url http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2018;volume=9;issue=1;spage=21;epage=21;aulast=Costa
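
The abstract reports two performance figures against the GEP reference: the share of correctly classified cases and the ROC area under the curve. The sketch below shows how such figures are commonly computed. It is not the author's code; the function names, the 0.5 cutoff, and the toy labels and scores are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def concordance(reference: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of cases where the predicted subtype matches the reference."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


def roc_auc(reference: Sequence[str], scores: Sequence[float],
            positive: str = "GCB") -> float:
    """ROC AUC computed as the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for r, s in zip(reference, scores) if r == positive]
    neg = [s for r, s in zip(reference, scores) if r != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: reference (GEP) labels and classifier scores for GCB.
gep_labels = ["GCB", "GCB", "non-GC", "non-GC", "GCB", "non-GC"]
tree_scores = [0.9, 0.8, 0.3, 0.6, 0.4, 0.1]

# Hypothetical cutoff of 0.5 turns scores into predicted subtypes.
tree_labels = ["GCB" if s >= 0.5 else "non-GC" for s in tree_scores]

print(f"concordance with reference: {concordance(gep_labels, tree_labels):.3f}")
print(f"ROC AUC: {roc_auc(gep_labels, tree_scores):.3f}")
```

The pairwise formulation of the AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based computation would be preferable for much larger datasets.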