A study of Spaghetti PCA for time dependent interval data applied to stock prices in Taiwan

碩士 === 國立政治大學 === 統計研究所 === 97 === Interval data are generally defined by the upper and the lower value assumed by a unit for a continuous variable. In this study, we introduce a special type of interval description depending on time. The original idea (Irpino, 2006, Pattern Recognition Letters, 27,...

Full description

Bibliographic Details
Main Authors: Chiu, Ta Ching, 邱大倞
Other Authors: Liu, Hui Mei
Format: Others
Language:en_US
Published: 2009
Online Access:http://ndltd.ncl.edu.tw/handle/84185466678287805292
Description
Summary:碩士 === 國立政治大學 === 統計研究所 === 97 === Interval data are generally defined by the upper and the lower value assumed by a unit for a continuous variable. In this study, we introduce a special type of interval description depending on time. The original idea (Irpino, 2006, Pattern Recognition Letters, 27, 504-513) is that each observation is characterized by an oriented interval of values with a starting and a closing value for each period of observation: for example, the beginning and the closing price of a stock in a week. Several factorial methods have been discovered in order to treat interval data, but not yet for oriented intervals. Irpino presented an extension of principle component analysis to time dependent interval data, or, in general, to oriented intervals. From a geometrical point of view, the proposed approach can be considered as an analysis of oriented segments (nicely called “spaghetti”) defined in a multidimensional space identified by periods. In this paper, we make further extension not only provide the opening and the closing value but also the highest and the lowest value in a week to find out more information that cannot simply obtained from the original idea. Besides, we also use beta distribution to see if there is any improvement corresponding to the original ones. After we make these extensions, many complicated computations should be calculated such as the mean, variance, covariance in order to obtain correlation matrix for PCA. With regard to the value of information, the extended idea with the highest prices and the lowest price is the best choice.