Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations

abstract: Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals that were each measured repeatedly across time, and it splits them based on a set of covariates such that individuals with similar trajectories become grouped together i...

Full description

Bibliographic Details
Other Authors: Stegmann, Gabriela (Author)
Format: Doctoral Thesis
Language:English
Published: 2019
Subjects:
Online Access:http://hdl.handle.net/2286/R.I.54792
id ndltd-asu.edu-item-54792
record_format oai_dc
spelling ndltd-asu.edu-item-547922019-11-07T03:00:58Z Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations abstract: Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals that were each measured repeatedly across time, and it splits them based on a set of covariates such that individuals with similar trajectories become grouped together into nodes. LRP does this by fitting a mixed-effects model to each node every time that it becomes partitioned and extracting the deviance, which is the measure of node purity. LRP is implemented using the classification and regression tree algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting mixed-effects models to each potential split only to extract the deviance and discard the rest of the information is a computationally intensive procedure. Therefore, in this dissertation, I address the high computational demand, variable selection bias, and local optimum solution. I propose three approximation methods that reduce the computational demand of LRP, and at the same time, allow for a straightforward extension to recursive partitioning algorithms that do not have a variable selection bias and can reach the global optimum solution. In the three proposed approximations, a mixed-effects model is fit to the full data, and the growth curve coefficients for each individual are extracted. Then, (1) a principal component analysis is fit to the set of coefficients and the principal component score is extracted for each individual, (2) a one-factor model is fit to the coefficients and the factor score is extracted, or (3) the coefficients are summed. The three methods result in each individual having a single score that represents the growth curve trajectory. Therefore, now that the outcome is a single score for each individual, any tree-based method may be used for partitioning the data and group the individuals together. Once the individuals are assigned to their final nodes, a mixed-effects model is fit to each terminal node with the individuals belonging to it. I conduct a simulation study, where I show that the approximation methods achieve the goals proposed while maintaining a similar level of out-of-sample prediction accuracy as LRP. I then illustrate and compare the methods using an applied data. Dissertation/Thesis Stegmann, Gabriela (Author) Grimm, Kevin (Advisor) Edwards, Michael (Committee member) MacKinnon, David (Committee member) McNeish, Daniel (Committee member) Arizona State University (Publisher) Quantitative psychology Growth Curve Model Longitudinal Data Machine Learning Mixed-Effects Models Recursive Partitioning Regression Trees eng 102 pages Doctoral Dissertation Psychology 2019 Doctoral Dissertation http://hdl.handle.net/2286/R.I.54792 http://rightsstatements.org/vocab/InC/1.0/ 2019
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic Quantitative psychology
Growth Curve Model
Longitudinal Data
Machine Learning
Mixed-Effects Models
Recursive Partitioning
Regression Trees
spellingShingle Quantitative psychology
Growth Curve Model
Longitudinal Data
Machine Learning
Mixed-Effects Models
Recursive Partitioning
Regression Trees
Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations
description abstract: Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals that were each measured repeatedly across time, and it splits them based on a set of covariates such that individuals with similar trajectories become grouped together into nodes. LRP does this by fitting a mixed-effects model to each node every time that it becomes partitioned and extracting the deviance, which is the measure of node purity. LRP is implemented using the classification and regression tree algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting mixed-effects models to each potential split only to extract the deviance and discard the rest of the information is a computationally intensive procedure. Therefore, in this dissertation, I address the high computational demand, variable selection bias, and local optimum solution. I propose three approximation methods that reduce the computational demand of LRP, and at the same time, allow for a straightforward extension to recursive partitioning algorithms that do not have a variable selection bias and can reach the global optimum solution. In the three proposed approximations, a mixed-effects model is fit to the full data, and the growth curve coefficients for each individual are extracted. Then, (1) a principal component analysis is fit to the set of coefficients and the principal component score is extracted for each individual, (2) a one-factor model is fit to the coefficients and the factor score is extracted, or (3) the coefficients are summed. The three methods result in each individual having a single score that represents the growth curve trajectory. Therefore, now that the outcome is a single score for each individual, any tree-based method may be used for partitioning the data and group the individuals together. Once the individuals are assigned to their final nodes, a mixed-effects model is fit to each terminal node with the individuals belonging to it. I conduct a simulation study, where I show that the approximation methods achieve the goals proposed while maintaining a similar level of out-of-sample prediction accuracy as LRP. I then illustrate and compare the methods using an applied data. === Dissertation/Thesis === Doctoral Dissertation Psychology 2019
author2 Stegmann, Gabriela (Author)
author_facet Stegmann, Gabriela (Author)
title Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations
title_short Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations
title_full Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations
title_fullStr Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations
title_full_unstemmed Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations
title_sort addressing the variable selection bias and local optimum limitations of longitudinal recursive partitioning with time-efficient approximations
publishDate 2019
url http://hdl.handle.net/2286/R.I.54792
_version_ 1719287531264016384