Data Mining and Its Applications to Image Processing

Doctoral === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 95 === In this modern world of computer technology, more and more data of various kinds can be accessed and transmitted over the Internet. However, without the help of a good data mining system, data requesters are increasingly likely to drown in a sea of information w...

Bibliographic Details
Main Authors: Chih-Yang Lin, 林智揚
Other Authors: Chin-Chen Chang
Format: Others
Language: en_US
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/98313033211486996989
id ndltd-TW-095CCU05392013
record_format oai_dc
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 95 === In this modern world of computer technology, more and more data of various kinds can be accessed and transmitted over the Internet. However, without the help of a good data mining system, data requesters are increasingly likely to drown in a sea of information without finding what they are really looking for. Data mining techniques are tools that help users extract the desired information from huge raw datasets. In real applications, different data mining techniques are developed specifically to gather information from different disciplines and in different forms. In this dissertation, we aim not only to design efficient mining algorithms but also to apply data mining techniques to image processing. The research begins with developing novel algorithms for data mining. Data mining techniques can be categorized into six types: statistics, concept description, association rule analysis, classification and prediction, clustering analysis, and time-series analysis. Here, we focus on just three types: association rule analysis, clustering analysis, and one case of time-series analysis, namely traversal pattern mining. Mining frequent patterns is the goal of both association rule mining and traversal pattern mining. However, the complexity of mining frequent patterns is exponential in the number of items. Therefore, the design of an efficient and scalable algorithm for very large databases is one of the keys to the growing power of many applications and, by extension, that of the Internet. Many existing methods, such as Apriori and DHP (direct hashing and pruning), use hash techniques extensively to count the frequency of each itemset. Unfortunately, their performance degrades as a result of collisions in the hash tables. To solve the collision problem, we propose perfect hashing schemes for mining both association rules and traversal patterns.
Because a perfect hash function is collision-free, it removes the collision bottleneck of the hashing techniques used in association rule mining and traversal pattern mining. To further improve conventional mining performance, parallel association rule mining is also studied with three requirements in mind: (1) less memory usage, (2) less communication among the computers involved over a network, and (3) load balancing among computers. Clustering is the process of grouping objects so that objects in the same cluster are similar to one another and dissimilar to objects in other clusters. Previous clustering methods typically suffered from limitations such as discovering only clusters of limited shapes, being sensitive to noise, or requiring certain parameters. To solve these problems, a density-based clustering method using a genetic algorithm is proposed to achieve arbitrary-shape clustering and noise detection. In addition, the proposed method provides an interaction mechanism to help users choose appropriate parameters for a given dataset. Beyond developing mining algorithms, we also explore the application of mining techniques to the field of image processing, because applying mined information to other domains is also an important task of data mining. First, association rule mining is used to predict the unknown index values of a VQ (vector quantization) image to improve the compression rate. Second, the density-based clustering method is applied to VQ codebook training, because conventional codebook training techniques are all based on the LBG (Linde-Buzo-Gray) method, which is sensitive to noise and cannot handle clusters of different shapes, sizes, and densities. Finally, the concept of clustering is also extended to reversible steganographic methods. Steganography is a technique for embedding secret data in an image, audio, or video file so that only the authorized receiver can detect the existence of the secret message.
Although steganographic methods have been studied for many years, only a small portion of the literature has focused on reversible steganography, which allows the original cover image to be completely recovered after the secret data is extracted. In this dissertation, we propose a reversible steganographic method for the VQ index table that uses clustering and relocation strategies to achieve high embedding capacity and image quality. Declustering is the opposite of clustering: its aim is to put dissimilar patterns together. In the final research study, declustering is applied to reversible steganography for the VQ index table. The main advantages of declustering are ease of implementation, low computational demands, and no requirement for auxiliary data. This strategy can be easily applied to other image formats for the same purpose. All the methods proposed in this dissertation have been extensively evaluated through theoretical analysis and experiments. The evaluation results show that our methods are practical and superior to other state-of-the-art methods.
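The collision problem mentioned in the description can be illustrated with a minimal Python sketch. It is not the dissertation's scheme, which this record does not specify; it only shows one well-known collision-free mapping, a triangular index that gives every 2-itemset its own counter. The names pair_index and count_pairs, the integer item ids 0 .. n-1, and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def pair_index(i: int, j: int, n: int) -> int:
    """Map an item pair (i, j) with 0 <= i < j < n to a unique slot.

    Slots run from 0 to n*(n-1)/2 - 1, so no two pairs ever collide.
    """
    return i * n - i * (i + 1) // 2 + (j - i - 1)


def count_pairs(transactions, n):
    """Count the support of every 2-itemset in a collision-free table."""
    counts = [0] * (n * (n - 1) // 2)
    for t in transactions:
        # sorted(set(t)) drops repeated items and guarantees i < j.
        for i, j in combinations(sorted(set(t)), 2):
            counts[pair_index(i, j, n)] += 1
    return counts


# Hypothetical transactions over four items with ids 0..3.
transactions = [[0, 1, 2], [0, 2], [1, 2, 3], [0, 1, 2, 3]]
counts = count_pairs(transactions, n=4)
support_02 = counts[pair_index(0, 2, 4)]  # support of the itemset {0, 2}
```

Each counter holds the exact support of one pair, so no second pass is needed to separate itemsets that share a bucket. The table grows quadratically with the number of items, which is why a sketch like this covers only 2-itemsets.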
author2 Chin-Chen Chang
author_facet Chin-Chen Chang
Chih-Yang Lin
林智揚
author Chih-Yang Lin
林智揚
spellingShingle Chih-Yang Lin
林智揚
Data Mining and Its Applications to Image Processing
author_sort Chih-Yang Lin
title Data Mining and Its Applications to Image Processing
title_short Data Mining and Its Applications to Image Processing
title_full Data Mining and Its Applications to Image Processing
title_fullStr Data Mining and Its Applications to Image Processing
title_full_unstemmed Data Mining and Its Applications to Image Processing
title_sort data mining and its applications to image processing
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/98313033211486996989
spelling ndltd-TW-095CCU05392013 2015-10-13T10:45:19Z http://ndltd.ncl.edu.tw/handle/98313033211486996989 Data Mining and Its Applications to Image Processing 資料挖掘技術及其在影像處理之應用 Chih-Yang Lin 林智揚 Doctoral National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 95 Chin-Chen Chang 張真誠 2006 Dissertation ; thesis 167 en_US