Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data
Background: The classification of diffuse large B-cell lymphomas into Germinal Center (GCB) and non-GC subtypes defines disease subgroups which are different both in terms of gene expression and prognosis. Given their clinical significance, several classification algorithms have been designed, some...
Main Author: | Carlos Bruno Tavares Da Costa |
---|---|
Format: | Article |
Language: | English |
Published: | Wolters Kluwer Medknow Publications, 2018-01-01 |
Series: | Journal of Pathology Informatics |
Subjects: | Cell of origin; immunohistochemistry; lymphoma; machine learning; prognostic |
Online Access: | http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2018;volume=9;issue=1;spage=21;epage=21;aulast=Costa |
id |
doaj-5aa7f816ca8c4da7b41e45387fdd6a1a |
---|---|
record_format |
Article |
spelling |
doaj-5aa7f816ca8c4da7b41e45387fdd6a1a 2020-11-24T22:03:05Z eng Wolters Kluwer Medknow Publications Journal of Pathology Informatics 2153-3539 2153-3539 2018-01-01 9 1 21 21 10.4103/jpi.jpi_14_18 Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data Carlos Bruno Tavares Da Costa Background: The classification of diffuse large B-cell lymphomas into germinal center (GCB) and non-GC subtypes defines disease subgroups that differ in both gene expression and prognosis. Given the clinical significance of these subtypes, several classification algorithms have been designed, some making use of widely available immunohistochemical techniques. Despite their high concordance with gene expression profiles (GEP) and their prognostic value, these algorithms rest on technical and biological assumptions, leaving room to improve classification performance. Methods: To overcome this limitation, a new algorithm was derived by analyzing a previously published dataset of 475 patients with an automatic classification tree method. Results: The resulting algorithm correctly classifies 91.6% of the cases when compared with GEP, with a receiver operating characteristic (ROC) area under the curve of 0.934. Noteworthy features of this algorithm include the ability to classify GEP-unclassifiable cases and significant prognostic value, in terms of both overall survival (60 months for non-GC vs. not reached for GCB, P = 0.007) and progression-free survival (61.9 months vs. not reached, P = 0.017). Conclusion: Built with a machine learning classification method that avoids most prior assumptions, the novel algorithm is accurate and retains features relevant for clinical implementation. http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2018;volume=9;issue=1;spage=21;epage=21;aulast=Costa Cell of origin immunohistochemistry lymphoma machine learning prognostic |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Carlos Bruno Tavares Da Costa |
spellingShingle |
Carlos Bruno Tavares Da Costa Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data Journal of Pathology Informatics Cell of origin immunohistochemistry lymphoma machine learning prognostic |
author_facet |
Carlos Bruno Tavares Da Costa |
author_sort |
Carlos Bruno Tavares Da Costa |
title |
Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data |
title_short |
Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data |
title_full |
Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data |
title_fullStr |
Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data |
title_full_unstemmed |
Machine learning provides an accurate classification of diffuse large B-cell lymphoma from immunohistochemical data |
title_sort |
machine learning provides an accurate classification of diffuse large b-cell lymphoma from immunohistochemical data |
publisher |
Wolters Kluwer Medknow Publications |
series |
Journal of Pathology Informatics |
issn |
2153-3539 2153-3539 |
publishDate |
2018-01-01 |
description |
Background: The classification of diffuse large B-cell lymphomas into germinal center (GCB) and non-GC subtypes defines disease subgroups that differ in both gene expression and prognosis. Given the clinical significance of these subtypes, several classification algorithms have been designed, some making use of widely available immunohistochemical techniques. Despite their high concordance with gene expression profiles (GEP) and their prognostic value, these algorithms rest on technical and biological assumptions, leaving room to improve classification performance. Methods: To overcome this limitation, a new algorithm was derived by analyzing a previously published dataset of 475 patients with an automatic classification tree method. Results: The resulting algorithm correctly classifies 91.6% of the cases when compared with GEP, with a receiver operating characteristic (ROC) area under the curve of 0.934. Noteworthy features of this algorithm include the ability to classify GEP-unclassifiable cases and significant prognostic value, in terms of both overall survival (60 months for non-GC vs. not reached for GCB, P = 0.007) and progression-free survival (61.9 months vs. not reached, P = 0.017). Conclusion: Built with a machine learning classification method that avoids most prior assumptions, the novel algorithm is accurate and retains features relevant for clinical implementation. |
topic |
Cell of origin immunohistochemistry lymphoma machine learning prognostic |
url |
http://www.jpathinformatics.org/article.asp?issn=2153-3539;year=2018;volume=9;issue=1;spage=21;epage=21;aulast=Costa |
work_keys_str_mv |
AT carlosbrunotavaresdacosta machinelearningprovidesanaccurateclassificationofdiffuselargebcelllymphomafromimmunohistochemicaldata |
_version_ |
1725833271625908224 |