Predictability of drug-induced liver injury by machine learning

Abstract. Background: Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life-threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessm...


Bibliographic Details
Main Authors:	Marco Chierici, Margherita Francescatto, Nicole Bussola, Giuseppe Jurman, Cesare Furlanello
Format:	Article
Language:	English
Published:	BMC 2020-02-01
Series:	Biology Direct
Subjects:	Deep learning, DILI, Classification, Microarray, CMap
Online Access:	https://doi.org/10.1186/s13062-020-0259-4

id	doaj-34625fc564874e1d81b2a2950d47755d
record_format	Article
spelling	doaj-34625fc564874e1d81b2a2950d47755d | 2021-02-14T12:23:11Z | eng | BMC | Biology Direct | 1745-6150 | 2020-02-01 | 15 | 1 | 1 | 10 | 10.1186/s13062-020-0259-4 | Predictability of drug-induced liver injury by machine learning | Marco Chierici (Fondazione Bruno Kessler) | Margherita Francescatto (Fondazione Bruno Kessler) | Nicole Bussola (Fondazione Bruno Kessler) | Giuseppe Jurman (Fondazione Bruno Kessler) | Cesare Furlanello (Fondazione Bruno Kessler) | https://doi.org/10.1186/s13062-020-0259-4 | Deep learning | DILI | Classification | Microarray | CMap
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Marco Chierici, Margherita Francescatto, Nicole Bussola, Giuseppe Jurman, Cesare Furlanello
author_sort	Marco Chierici
title	Predictability of drug-induced liver injury by machine learning
publisher	BMC
series	Biology Direct
issn	1745-6150
publishDate	2020-02-01
description	Abstract. Background: Drug-induced liver injury (DILI) is a major concern in drug development, as hepatotoxicity may not be apparent at early stages but can lead to life-threatening consequences. The ability to predict DILI from in vitro data would be a crucial advantage. In 2018, the Critical Assessment of Massive Data Analysis (CAMDA) group proposed the CMap Drug Safety challenge focusing on DILI prediction. Methods and results: The challenge data included Affymetrix GeneChip expression profiles for the two cancer cell lines MCF7 and PC3 treated with 276 drug compounds and empty vehicles. Binary DILI labeling and a recommended train/test split for the development of predictive classification approaches were also provided. We devised three deep learning architectures for DILI prediction on the challenge data and compared them to random forest and multi-layer perceptron classifiers. On a subset of the data and for some of the models, we additionally tested several strategies for balancing the two DILI classes and for identifying alternative informative train/test splits. All the models were trained with the MAQC data analysis protocol (DAP), i.e., 10x5 cross-validation over the training set. In all the experiments, the classification performance in both cross-validation and external validation gave Matthews correlation coefficient (MCC) values below 0.2. We observed minimal differences between the two cell lines. Notably, deep learning approaches did not improve classification performance. Discussion: We extensively tested multiple machine learning approaches for the DILI classification task, obtaining poor to mediocre performance. The results suggest that the CMap expression data on the two cell lines MCF7 and PC3 are not sufficient for accurate DILI label prediction. Reviewers: This article was reviewed by Maciej Kandula and Paweł P. Labaj.
topic	Deep learning, DILI, Classification, Microarray, CMap
url	https://doi.org/10.1186/s13062-020-0259-4
_version_	1724270503999307776
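
The abstract reports results as Matthews correlation coefficient (MCC) values obtained under 10x5 cross-validation. The following is a minimal sketch of both ideas, not the authors' code. It assumes that "10x5" means 10 repetitions of 5-fold stratified cross-validation and that DILI labels are encoded as 0/1; the function names `mcc` and `repeated_stratified_kfold` are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import defaultdict


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: MCC is 0 when any marginal of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def repeated_stratified_kfold(labels, n_repeats=10, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs: n_repeats * n_folds splits in total.

    Assumption: "10x5" is read as 10 repeats of 5-fold stratified CV.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Deal indices round-robin so every fold keeps roughly the class ratio.
            for pos, i in enumerate(shuffled):
                folds[pos % n_folds].append(i)
        for k in range(n_folds):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test
```

In use, a classifier would be fitted on each `train` index set and scored with `mcc` on the matching `test` set, and the 50 scores would then be summarized. MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why values below 0.2 are read in the abstract as poor to mediocre.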


Similar Items