A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data

Abstract Background: Random survival forest (RSF) models have been identified as alternative methods to the Cox proportional hazards model in analysing time-to-event data. These methods, however, have been criticised for the bias that results from favouring covariates with many split-points and hence...

Bibliographic Details
Main Authors: Justine B. Nasejje, Henry Mwambi, Keertan Dheda, Maia Lesosky
Format: Article
Language: English
Published: BMC 2017-07-01
Series: BMC Medical Research Methodology
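Subjects: Survival analysis; Split-points; Survival trees; Random survival forests; Conditional inference forests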
Online Access: http://link.springer.com/article/10.1186/s12874-017-0383-8
id doaj-a629158c4fed48a9a6f86b878ef84013
record_format Article
spelling doaj-a629158c4fed48a9a6f86b878ef84013
2020-11-25T00:41:05Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2017-07-01
171117
10.1186/s12874-017-0383-8
A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
Justine B. Nasejje (School of Statistics, Mathematics and Computer Science, University of KwaZulu-Natal)
Henry Mwambi (School of Statistics, Mathematics and Computer Science, University of KwaZulu-Natal)
Keertan Dheda (Division of Pulmonology and UCT Lung Institute, Department of Medicine, University of Cape Town)
Maia Lesosky (Division of Epidemiology and Biostatistics, School of Public Health and Family Medicine, University of Cape Town)
Abstract Background: Random survival forest (RSF) models have been identified as alternative methods to the Cox proportional hazards model in analysing time-to-event data. These methods, however, have been criticised for the bias that results from favouring covariates with many split-points, and hence conditional inference forests for time-to-event data have been suggested. Conditional inference forests (CIF) are known to correct the bias in RSF models by separating the procedure for selecting the best covariate to split on from the search for the best split point of the selected covariate. Methods: In this study, we compare the random survival forest model to the conditional inference forest (CIF) model using twenty-two simulated time-to-event datasets. We also analysed two real time-to-event datasets. The first dataset is based on the survival of children under five years of age in Uganda and consists of categorical covariates, most of which have more than two levels (many split-points). The second dataset is based on the survival of patients with extensively drug-resistant tuberculosis (XDR-TB) and consists mainly of categorical covariates with two levels (few split-points). Results: The study findings indicate that, based on the bootstrap cross-validated estimates of the integrated Brier score, the conditional inference forest model is superior to random survival forest models in analysing time-to-event data that consist of covariates with many split-points. However, conditional inference forests perform comparably to random survival forest models in analysing time-to-event data consisting of covariates with fewer split-points. Conclusion: Although survival forests are promising methods for analysing time-to-event data, it is important to identify the best forest model for analysis based on the nature of the covariates in the dataset in question.
http://link.springer.com/article/10.1186/s12874-017-0383-8
Survival analysis
Split-points
Survival trees
Random survival forests
Conditional inference forests
collection DOAJ
language English
format Article
sources DOAJ
author Justine B. Nasejje
Henry Mwambi
Keertan Dheda
Maia Lesosky
spellingShingle Justine B. Nasejje
Henry Mwambi
Keertan Dheda
Maia Lesosky
A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
BMC Medical Research Methodology
Survival analysis
Split-points
Survival trees
Random survival forests
Conditional inference forests
author_facet Justine B. Nasejje
Henry Mwambi
Keertan Dheda
Maia Lesosky
author_sort Justine B. Nasejje
title A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
title_short A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
title_full A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
title_fullStr A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
title_full_unstemmed A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
title_sort comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2017-07-01
description Abstract Background: Random survival forest (RSF) models have been identified as alternative methods to the Cox proportional hazards model in analysing time-to-event data. These methods, however, have been criticised for the bias that results from favouring covariates with many split-points, and hence conditional inference forests for time-to-event data have been suggested. Conditional inference forests (CIF) are known to correct the bias in RSF models by separating the procedure for selecting the best covariate to split on from the search for the best split point of the selected covariate. Methods: In this study, we compare the random survival forest model to the conditional inference forest (CIF) model using twenty-two simulated time-to-event datasets. We also analysed two real time-to-event datasets. The first dataset is based on the survival of children under five years of age in Uganda and consists of categorical covariates, most of which have more than two levels (many split-points). The second dataset is based on the survival of patients with extensively drug-resistant tuberculosis (XDR-TB) and consists mainly of categorical covariates with two levels (few split-points). Results: The study findings indicate that, based on the bootstrap cross-validated estimates of the integrated Brier score, the conditional inference forest model is superior to random survival forest models in analysing time-to-event data that consist of covariates with many split-points. However, conditional inference forests perform comparably to random survival forest models in analysing time-to-event data consisting of covariates with fewer split-points. Conclusion: Although survival forests are promising methods for analysing time-to-event data, it is important to identify the best forest model for analysis based on the nature of the covariates in the dataset in question.
topic Survival analysis
Split-points
Survival trees
Random survival forests
Conditional inference forests
url http://link.springer.com/article/10.1186/s12874-017-0383-8
work_keys_str_mv AT justinebnasejje acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT henrymwambi acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT keertandheda acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT maialesosky acomparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT justinebnasejje comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT henrymwambi comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT keertandheda comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
AT maialesosky comparisonoftheconditionalinferencesurvivalforestmodeltorandomsurvivalforestsbasedonasimulationstudyaswellasontwoapplicationswithtimetoeventdata
_version_ 1725287239388233728