Summary: | Doctoral === National Tsing Hua University === Department of Electrical Engineering === 90 === Recent technological advances have made it
possible to process large amounts of image data. The techniques developed so far
generally take an engineering viewpoint: they focus on the quality of results and the development of algorithms. The human vision mechanism is another important viewpoint that recent research also commonly draws on, and it provides significant information. To further realize the potential of these techniques in image processing applications, a system based on the human-vision viewpoint needs to be developed.
Based on the properties of human vision, top-down
and bottom-up processes are commonly adopted techniques that
have been applied in diverse research. The proposed system uses these properties to proceed from low-level scene analysis of local image properties to high-level scene analysis for semantic description.
For the bottom-up process, an image segmentation algorithm based on
the Self-Organizing Map (SOM) methodology is proposed; it
takes into account the color similarity and spatial relationships of objects within an image. Based on color-similarity features, an image is first segmented into coarse cluster regions, named planes, using the SOM_1 algorithm with a labeling process. The final segmented regions are obtained by computing the spatial distance between any two planes and applying the SOM_2 algorithm with a labeling process. The selection of the SOM parameters, namely the number of iterations and the number of output nodes, is also discussed. The proposed approach yields segmented objects similar to those perceived by humans. Experiments show that the approach is reliable and feasible, and that it provides the primary information needed to further investigate image content.
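
The color-clustering stage described above can be illustrated with a generic one-dimensional SOM. This is only a minimal sketch under stated assumptions, not the thesis's SOM_1 algorithm: the function names `train_som` and `label_pixels`, the linear decay schedules, and the default parameter values are all hypothetical, and the spatial-distance stage (SOM_2) is not shown. The parameters `n_nodes` and `n_iters` correspond to the number of output nodes and the number of iterations mentioned in the text.

```python
import numpy as np

def train_som(pixels, n_nodes=4, n_iters=2000, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM on an (N, 3) array of colors with values in [0, 1].

    Assumption: linear decay of learning rate and neighborhood width;
    the thesis does not state which schedules it uses.
    """
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen pixels.
    weights = pixels[rng.choice(len(pixels), n_nodes, replace=False)].astype(float)
    positions = np.arange(n_nodes)
    for t in range(n_iters):
        frac = t / n_iters
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 1e-3)
        x = pixels[rng.integers(len(pixels))]
        # Best-matching unit: the node whose color is closest to the sample.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighborhood on the 1-D node grid.
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def label_pixels(image, weights):
    """Assign each pixel of an (H, W, 3) image the index of its nearest node."""
    flat = image.reshape(-1, image.shape[-1]).astype(float)
    dists = ((flat[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1).reshape(image.shape[:2])
```

Each distinct label in the returned map plays the role of one coarse color cluster, loosely analogous to a "plane" in the text; a second clustering pass over the spatial distances between such regions would be needed to approximate the final segmentation.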
For the top-down process, we have also investigated image
content descriptions that use image features and spatial
information. A forward-recall image processing system is
proposed. It consists of a forward process, which produces a semantic
description for each segmented object, and a recall process, which redraws each segmented object based on its semantic
description and spatial location. The forward process, which involves
both the bottom-up and top-down processes, derives the semantic
interpretation from color, texture, and
spatial-relationship features that are represented as indexes;
these indexes are then used to construct a linguistic inference rule
database based on human experience and knowledge. The
corresponding features of each object in an image can thus be
obtained, and each region can be assigned its
corresponding interpretation by applying the inference rules
in the linguistic database. In our experiments, the semantic meaning of each region could, in general, be interpreted almost exactly.
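
The idea of mapping feature indexes to a semantic description through a rule database can be sketched as a simple lookup. Everything below is a hypothetical illustration: the rule contents, the index vocabulary (`color`, `texture`, `position`), and the function name `interpret` are assumptions, not the rules or the inference mechanism used in the thesis.

```python
# Hypothetical rule database: each rule pairs a set of required feature
# indexes with the linguistic label it implies.
RULES = [
    ({"color": "blue", "texture": "smooth", "position": "top"}, "sky"),
    ({"color": "green", "texture": "rough", "position": "bottom"}, "grass"),
    ({"color": "white", "texture": "smooth", "position": "top"}, "cloud"),
]

def interpret(region):
    """Return the label of the first rule whose conditions all match.

    `region` is a dict of feature indexes for one segmented region.
    Regions matching no rule are reported as "unknown".
    """
    for conditions, label in RULES:
        if all(region.get(key) == value for key, value in conditions.items()):
            return label
    return "unknown"
```

In this sketch a call such as `interpret({"color": "blue", "texture": "smooth", "position": "top"})` selects the first rule; a recall step could then use the stored label together with the region's spatial location to redraw the object.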
Through the research described above, a human-vision-based
image understanding system comprising image segmentation,
linguistic meaning interpretation, and recall is proposed.
|