Two Level Data Extraction Scheme for Geographical Ocean Data

Bibliographic Details
Main Authors: Hsuan Jen Lai, 賴宣任
Other Authors: Yue Shan Chang
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/82449432820973820042
Description
Summary: Master's === National Taipei University === Graduate Institute of Communication Engineering === 98 === Large scientific data archives manage and store huge quantities of data with the help of metadata, handle this data throughout its life cycle, and focus on particular scientific domains. An effective technology for searching for desired data is becoming increasingly important. We propose a two-level data extraction scheme. As is well known, metadata can assist information retrieval, and using metadata to represent the file system also reduces the processing required to handle operations. However, because the number of metadata files grows daily along with the number of scientific data files, using metadata files to access a daily growing data set becomes increasingly difficult. In this thesis, we first propose a Metadata Classification (MC) approach that classifies the records in the metadata files into month-level metadata and constructs a two-dimensional array to store the classified records, which helps a user program quickly locate the target files when searching for desired data. To further improve performance, we modify the MC approach and present a Modified Metadata Classification (MMC) approach. In MMC, the Metadata Classifier not only reclassifies the day-level metadata into year-level metadata but also adjusts the granularity of the GridMap. In addition, at the data level we propose a data distribution scheme, which is an important factor in making lookups efficient. We conduct experiments to evaluate the performance of MC and MMC and compare them with a traditional approach (named the Raw approach) and an existing system. The results show that MMC performs better than MC and the Raw approach as the granularity of the GridMap increases. We also conduct experiments comparing the performance of our data extraction scheme with the Raw approach and SQL. The results show that our approach performs better.
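
The month-level classification described in the summary can be illustrated with a minimal Python sketch. This is not the thesis implementation: the record layout (an observation date paired with a file name), the sample file names, and the functions `classify_by_month` and `find_files` are all assumptions made for illustration. The sketch groups day-level records into a two-dimensional array indexed by year and month, so that a date-range query inspects only the months it overlaps.

```python
from datetime import date

# Hypothetical day-level metadata records: (observation date, data file name).
RECORDS = [
    (date(2008, 1, 3), "sst_20080103.dat"),
    (date(2008, 1, 17), "sst_20080117.dat"),
    (date(2008, 2, 9), "sst_20080209.dat"),
    (date(2009, 1, 5), "sst_20090105.dat"),
]


def classify_by_month(records, first_year, last_year):
    """Group day-level records into a [year][month] two-dimensional array."""
    n_years = last_year - first_year + 1
    table = [[[] for _ in range(12)] for _ in range(n_years)]
    for day, filename in records:
        table[day.year - first_year][day.month - 1].append((day, filename))
    return table


def find_files(table, first_year, start, end):
    """Return files dated within [start, end], scanning only overlapping months."""
    hits = []
    for year in range(start.year, end.year + 1):
        row = year - first_year
        if not 0 <= row < len(table):
            continue  # year lies outside the classified range
        lo = start.month if year == start.year else 1
        hi = end.month if year == end.year else 12
        for month in range(lo, hi + 1):
            for day, filename in table[row][month - 1]:
                if start <= day <= end:
                    hits.append(filename)
    return hits


table = classify_by_month(RECORDS, 2008, 2009)
print(find_files(table, 2008, date(2008, 1, 10), date(2008, 2, 28)))
```

A coarser (year-level) or finer grouping would change only the indexing of `table`; which granularity performs best is the question the thesis evaluates experimentally.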