Improved computational methods for Bayesian tree models

Trees have long been used as a flexible way to build regression and classification models for complex problems. They can accommodate nonlinear response-predictor relationships and even interactions among predictors. Tree-based models handle data sets with predictors of mixed types, both ordered and categorical, in a natural way. The tree-based regression model can also be used as the base model for building additive models, among which the most prominent are gradient boosted trees and random forests. Classical training algorithms for tree-based models are deterministic greedy algorithms; they are fast to train, but they are usually not guaranteed to find an optimal tree. In this thesis, we discuss a Bayesian approach to building tree-based models. In Bayesian tree models, each tree is assigned a prior probability based on its structure, and standard Markov chain Monte Carlo (MCMC) algorithms can be used to search the posterior distribution. The thesis aims to improve the computational efficiency and performance of Bayesian tree-based models. We introduce new proposals, or "moves", in the MCMC algorithm to improve its efficiency, and we use temperature-based algorithms to help the sampler escape local optima and move toward the global optimum of the posterior distribution. Moreover, we develop semi-parametric Bayesian additive tree models in which some predictors enter the model parametrically. Technical details of using parallel computing to shorten the computation time are also discussed.

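As a rough illustration of the kind of sampler the abstract describes (not taken from the thesis itself), the sketch below shows a single tempered Metropolis-Hastings update over tree structures. The functions log_posterior and propose are hypothetical placeholders; Bayesian CART samplers commonly propose grow, prune, change, or swap moves, while the thesis's new moves and its specific temperature schedule are not detailed in this record.

    import math
    import random

    def mh_step(tree, log_posterior, propose, temperature=1.0):
        # propose(tree) returns (new_tree, log_q_forward, log_q_reverse),
        # e.g. from a grow, prune, change, or swap move on the tree structure.
        new_tree, log_q_fwd, log_q_rev = propose(tree)
        # Tempered acceptance: the log-posterior difference is scaled by 1/temperature,
        # so hotter chains accept unfavourable moves more readily and can escape
        # local modes of the posterior.
        log_alpha = (log_posterior(new_tree) - log_posterior(tree)) / temperature
        log_alpha += log_q_rev - log_q_fwd
        if math.log(random.random()) < log_alpha:
            return new_tree  # accept the proposed tree
        return tree          # reject and keep the current tree

In a parallel-tempering setup, several such chains run at different temperatures and occasionally swap states, which is one common way temperature-based methods are combined with parallel computing.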

Bibliographic Details
Main Author: Zhao, Yue
Language: English
Published: ScholarWorks@UMass Amherst, 2012
Subjects: Statistics
Online Access: https://scholarworks.umass.edu/dissertations/AAI3546000