Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction

Random forests consisting of an ensemble of regression trees with equal weights are frequently used for the design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of hetero...
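The full description in this record frames the forest prediction in two ways: as a mixture distribution over trees and as a weighted sum of correlated random variables. The following is a minimal sketch of how the predictive mean and variance could be computed under each view; it is not the authors' implementation. The function names, the per-tree leaf means and variances, the covariance matrix, and the Gaussian approximation used for the interval are all assumptions made for illustration.

```python
import math


def _normalize(weights):
    """Scale tree weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def mixture_moments(means, variances, weights):
    """Mean and variance when the forest output is a weighted mixture
    of the per-tree leaf distributions (hypothetical helper)."""
    w = _normalize(weights)
    mean = sum(wi * m for wi, m in zip(w, means))
    # Law of total variance: within-tree variance plus between-tree spread.
    var = sum(wi * (v + (m - mean) ** 2)
              for wi, m, v in zip(w, means, variances))
    return mean, var


def weighted_sum_moments(means, cov, weights):
    """Mean and variance of sum_i w_i * Y_i for correlated tree outputs Y_i,
    where cov[i][j] is the assumed covariance between trees i and j."""
    w = _normalize(weights)
    n = len(w)
    mean = sum(wi * m for wi, m in zip(w, means))
    var = sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))
    return mean, var


def normal_interval(mean, var, z=1.96):
    """Two-sided interval under a Gaussian approximation (an assumption;
    a mixture is generally not Gaussian)."""
    half = z * math.sqrt(var)
    return mean - half, mean + half


# Illustrative numbers only: three trees, the third one noisier.
means = [0.40, 0.50, 0.60]
variances = [0.01, 0.01, 0.04]
cov = [[0.010, 0.002, 0.002],
       [0.002, 0.010, 0.002],
       [0.002, 0.002, 0.040]]

equal = [1.0, 1.0, 1.0]
reweighted = [0.45, 0.45, 0.10]  # down-weights the noisy tree

mix_mean, mix_var = mixture_moments(means, variances, equal)
sum_mean_eq, sum_var_eq = weighted_sum_moments(means, cov, equal)
sum_mean_rw, sum_var_rw = weighted_sum_moments(means, cov, reweighted)

print("mixture interval:", normal_interval(mix_mean, mix_var))
print("weighted sum, equal weights:", normal_interval(sum_mean_eq, sum_var_eq))
print("weighted sum, reweighted:", normal_interval(sum_mean_rw, sum_var_rw))
```

In this toy setting, shifting weight away from the high-variance tree shortens the weighted-sum interval, which is the kind of effect the description attributes to tree weight optimization. Changing the weights also moves the predicted mean, so any real weight selection would need to check mean error as well.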


Bibliographic Details
Main Authors: Raziur Rahman, Saad Haider, Souparno Ghosh, Ranadip Pal
Format: Article
Language: English
Published: SAGE Publishing 2015-01-01
Series: Cancer Informatics
Online Access: https://doi.org/10.4137/CIN.S30794
id doaj-8ed6788ce9e74644813498f605b65740
record_format Article
spelling doaj-8ed6788ce9e74644813498f605b65740
2020-11-25T03:32:22Z
eng
SAGE Publishing
Cancer Informatics
1176-9351
2015-01-01
14s5
10.4137/CIN.S30794
Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction
Raziur Rahman (0), Saad Haider (1), Souparno Ghosh (2), Ranadip Pal (3)
(0) Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, USA.
(1) Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, USA.
(2) Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, USA.
(3) Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, USA.
https://doi.org/10.4137/CIN.S30794
collection DOAJ
language English
format Article
sources DOAJ
author Raziur Rahman
Saad Haider
Souparno Ghosh
Ranadip Pal
title Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction
publisher SAGE Publishing
series Cancer Informatics
issn 1176-9351
publishDate 2015-01-01
description Random forests consisting of an ensemble of regression trees with equal weights are frequently used for the design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees’ prediction from two perspectives: as a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic data and the Cancer Cell Line Encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without an increase in mean error.
url https://doi.org/10.4137/CIN.S30794