HiSeqGAN: High-dimensional Sequence Synthesis and Prediction

Bibliographic Details
Main Authors: Tien, Yun-Chieh, 田韻杰
Other Authors: Yu, Fang
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/hmurs8
Description
Summary: Master's thesis === National Chengchi University === Department of Management Information Systems === 107 === High-dimensional data sequences appear constantly in practice. State-of-the-art models such as recurrent neural networks lose prediction accuracy when the values of attributes are related in complex ways. Unsupervised clustering that groups data by attribute-value similarity yields lower-dimensional data that can be organized in a hierarchy. Taking these data relations into account is essential to improving the performance of trained models. In this work, we propose a new approach to synthesizing and predicting sequences of data that are structured in a hierarchy. Specifically, we adopt a new hierarchical data encoding and modify the loss functions of SeqGAN, our training model, to synthesize data sequences. In practice, we first cluster the training data with the hierarchical clustering algorithm GHSOM. By relabelling each sample with the cluster it falls into, we can use the GHSOM map to identify the hierarchical relations among samples. We then convert the clusters to coordinate vectors with our hierarchical data encoding algorithm and replace the loss function in the SeqGAN model with one that maximizes cosine similarity, in order to synthesize cluster sequences. Using the synthesized sequences, we achieve better performance on high-dimensional data training and prediction than the state-of-the-art models.
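The summary does not spell out the encoding algorithm or the exact loss, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each cluster is identified by its path through the hierarchy (one child index per level), and it uses a hypothetical encoding in which each level contributes a one-hot block whose weight halves with depth. The names `encode`, `cosine_similarity`, and `cosine_loss`, the fixed branching factor, and the 0.5 decay are all illustrative choices, not taken from the thesis.

```python
import math


def encode(path, branching, depth):
    """Hypothetical encoding: map a non-empty cluster path to a coordinate vector.

    path      -- tuple of child indices, one per hierarchy level
    branching -- assumed fixed number of children per node
    depth     -- assumed maximum depth of the hierarchy
    """
    vec = [0.0] * (branching * depth)
    for level, child in enumerate(path):
        # Deeper levels get smaller weights, so upper levels dominate similarity.
        vec[level * branching + child] = 0.5 ** level
    return vec


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_loss(pred, target):
    """Loss that is minimized when cosine similarity is maximized."""
    return 1.0 - cosine_similarity(pred, target)


# Two clusters under the same parent, and one under a different parent.
a = encode((0, 1), branching=3, depth=2)
b = encode((0, 2), branching=3, depth=2)
c = encode((2, 0), branching=3, depth=2)

print(cosine_similarity(a, b))  # siblings: higher similarity
print(cosine_similarity(a, c))  # different top-level branch: lower similarity
```

With an encoding of this kind, clusters that share ancestors score higher than clusters in different branches, which is the property a cosine-similarity objective can exploit when the model predicts the next cluster in a sequence.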