Statistical Modeling of Carbon Dioxide and Cluster Analysis of Time Dependent Information: Lag Target Time Series Clustering, Multi-Factor Time Series Clustering, and Multi-Level Time Series Clustering

The current study consists of three major parts: statistical modeling, the connection between statistical modeling and cluster analysis, and new methods for clustering time dependent information. First, we perform a statistical modeling of the Carbon Dioxide (CO2) emission in South Korea in or...


Bibliographic Details
Main Author: Kim, Doo Young
Format: Others
Published: Scholar Commons 2016
Subjects:
Online Access: http://scholarcommons.usf.edu/etd/6277
http://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=7473&context=etd
id ndltd-USF-oai-scholarcommons.usf.edu-etd-7473
record_format oai_dc
spelling ndltd-USF-oai-scholarcommons.usf.edu-etd-7473 2017-09-20T05:26:20Z 2016-06-02T07:00:00Z text application/pdf default Graduate Theses and Dissertations Scholar Commons
collection NDLTD
format Others
sources NDLTD
topic Global Warming
Transitional Modeling
Clustering
Time Dependent Information
Cancer Mortality Rates
Environmental Sciences
Medicine and Health Sciences
Statistics and Probability
description The current study consists of three major parts: statistical modeling, the connection between statistical modeling and cluster analysis, and new methods for clustering time dependent information. First, we perform statistical modeling of carbon dioxide (CO2) emissions in South Korea in order to identify the attributable variables, including interaction effects. One of the most pressing issues of the 21st century is global warming, which is driven by the interaction between atmospheric temperature and CO2 in the atmosphere. To confront this global problem, we first need to verify what causes it; only then can we work out how to solve it. We therefore find and rank the attributable variables and their interactions by their semipartial correlation and compare our findings with results from the United States and the European Union. The comparison shows that the top contributing variable in both South Korea and the United States is Liquid Fuels, whereas it ranks eighth in the EU. This provides evidence in support of regional rather than global policies for keeping atmospheric CO2 at an optimal level. Second, we study the regional behavior of atmospheric CO2 in the United States. Using a longitudinal transitional modeling scheme, we calculate transitional probabilities based on effects from the five end-use sectors that produce most of the CO2 in our atmosphere: the commercial, electric power, industrial, residential, and transportation sectors. Then, using those transitional probabilities, we perform hierarchical clustering to group regions with similar characteristics, based on the nine US climate regions. This study suggests that elected officials can legislate regional policies by end-use sector in order to maintain the optimal level of atmospheric CO2 required by global consensus.
Third, we propose new methods to cluster time dependent information. Almost all of the data in today's flood of information are time dependent, so the importance of mining such data needs no emphasis. The first method we propose is "Lag Target Time Series Clustering (LTTC)", which identifies the actual level of time dependency among the objects being clustered. The second is "Multi-Factor Time Series Clustering (MFTC)", which lets us consider distance in a multi-dimensional space by including several pieces of information at a time. The last is "Multi-Level Time Series Clustering (MLTC)", which is especially important when the time series responses to be clustered vary over the short term; that is, we extract only the pure lag effect from LTTC. The proposed methods give excellent results when applied to time dependent clustering. Finally, we develop an algorithm driven by the analytical structure of the proposed methods to cluster financial information from the ten business sectors of the New York Stock Exchange. Our clustering scheme uses 497 stocks from the S&P 500. We illustrate the usefulness of the study by structuring a diversified financial portfolio.
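The abstract does not specify how LTTC measures time dependency, so the following is only a minimal sketch of the general idea of lag-aware clustering, not the author's algorithm. It assumes a distance of 1 - |r|, where r is the strongest Pearson correlation found over a window of lags, followed by plain average-linkage agglomerative clustering. The function names (pearson, best_lag, lag_distance, cluster), the max_lag parameter, and the toy series are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_lag(x, y, max_lag):
    """Return (lag, r) with the largest |r| when x[t] is paired with y[t + lag]."""
    best = (0, pearson(x, y))
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[: len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: len(y) + lag]
        r = pearson(a, b)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def lag_distance(x, y, max_lag):
    """Assumed distance: near 0 when one series is a lagged copy of the other."""
    return 1.0 - abs(best_lag(x, y, max_lag)[1])


def cluster(series, k, max_lag=3):
    """Average-linkage agglomerative clustering of named series into k groups."""
    names = list(series)
    dist = {
        frozenset(pair): lag_distance(series[pair[0]], series[pair[1]], max_lag)
        for pair in combinations(names, 2)
    }

    def linkage(pair):
        a, b = pair
        total = sum(dist[frozenset((i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [[name] for name in names]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Toy data (made up for illustration): two base series and a lagged copy of each.
a = [0, 1, 0, 2, 5, 3, 1, 0, 4, 2, 1, 3]
b = [3, 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
series = {
    "a": a,
    "a_lag2": [7, 7] + a[:-2],  # a delayed by two time steps
    "b": b,
    "b_lag1": [0] + b[:-1],     # b delayed by one time step
}
print(best_lag(series["a"], series["a_lag2"], 3))
print(cluster(series, 2))
```

In this sketch a series and its delayed copy end up at a distance close to zero, which an ordinary same-time correlation would miss. The lag returned by best_lag could also be kept as a feature in its own right, but whether LTTC does so is not stated in the abstract.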
author Kim, Doo Young
title Statistical Modeling of Carbon Dioxide and Cluster Analysis of Time Dependent Information: Lag Target Time Series Clustering, Multi-Factor Time Series Clustering, and Multi-Level Time Series Clustering
publisher Scholar Commons
publishDate 2016
url http://scholarcommons.usf.edu/etd/6277
http://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=7473&context=etd