Analytic for data-driven decision-making in complex high-dimensional time-to-event data

In the era of big data, the analysis of large, complex datasets consumes time and money and is prone to errors and misinterpretation. Inaccurate or erroneous reasoning can in turn lead to poor inference and decision-making, and sometimes to irreversible, catastrophic outcomes. Conversely, proper management and use of valuable data can significantly increase knowledge and, through preventive action, reduce cost. In many fields there is great interest in the timing and causes of events: time-to-event analysis is a kernel of risk assessment and plays an essential role in predicting the probability that events of interest occur.

Variable selection and classification procedures are an integral part of such analysis. The information revolution brings ever larger datasets with more variables, and streaming high-dimensional time-to-event data have become difficult to process with traditional approaches, particularly in the presence of censored observations. Methods that efficiently simplify such data, especially in the number of variables, are therefore needed. Most traditional variable selection methods rely on computational algorithms in the class of non-deterministic polynomial-time hard (NP-hard) problems, which makes them infeasible at scale. More recent methods may run faster and rest on different estimation techniques and assumptions, but their applications are limited, their assumptions are restrictive, their computational cost is high, or their robustness is inconsistent.

This research is motivated by the importance of variable reduction in complex high-dimensional time-to-event data, to avoid the difficulties above in decision-making and to facilitate time-to-event analysis. Quantitative statistical and computational methodologies using combinatorial heuristic algorithms for variable selection and classification are proposed. These methodologies aim to reduce the number of explanatory variables and to identify a set of the most influential variables in such datasets.
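The proposed methodologies are not detailed in this record, but as a rough illustration of the class of techniques the abstract describes, the sketch below performs greedy forward variable selection on censored time-to-event data, scoring each candidate subset by the concordance index of a fitted Cox model. This is a minimal sketch, not the thesis's algorithm: it assumes the Python lifelines package, and the synthetic data, column names, and stopping threshold are invented for the example.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Synthetic right-censored data: hazard depends on x0 and x1 only,
# the remaining covariates are noise.
rng = np.random.default_rng(0)
n, p = 300, 10
X = pd.DataFrame(rng.normal(size=(n, p)), columns=[f"x{j}" for j in range(p)])
risk = 1.0 * X["x0"] + 0.8 * X["x1"]
T = rng.exponential(scale=np.exp(-risk))              # latent event times
C = rng.exponential(scale=float(np.median(T)) * 2, size=n)  # censoring times
df = X.assign(time=np.minimum(T, C), event=(T <= C).astype(int))

def fit_cindex(cols):
    """Fit a Cox model on `cols` and return its in-sample concordance index."""
    cph = CoxPHFitter()
    cph.fit(df[cols + ["time", "event"]], duration_col="time", event_col="event")
    return cph.concordance_index_

selected, best_c = [], 0.5            # C = 0.5 is the no-information baseline
candidates = list(X.columns)
while candidates:
    # Try adding each remaining variable; keep the one that helps most.
    scores = {v: fit_cindex(selected + [v]) for v in candidates}
    v_star = max(scores, key=scores.get)
    if scores[v_star] <= best_c + 1e-3:  # stop when the gain is negligible
        break
    selected.append(v_star)
    candidates.remove(v_star)
    best_c = scores[v_star]

print("selected variables:", selected, "C-index:", round(best_c, 3))

Like the combinatorial heuristics the abstract alludes to, this trades the NP-hard exhaustive search over variable subsets for a polynomial number of model fits, at the cost of no optimality guarantee.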

Bibliographic Details
Online Access: http://hdl.handle.net/2047/D20195338