Summary: | Master's === National Taiwan University === Graduate Institute of Electronics Engineering === 98 === Owing to advances in semiconductor technology, a consumer electronics (CE) product with a large storage device may offer functions beyond basic communication, such as taking and storing photos. As a result, the amount of multimedia data stored on these products is very large. This data has to be accessed intelligently, so managing multimedia content becomes an urgent task. To enable efficient data management, the semantic information of the multimedia content has to be extracted for further manipulation, and machine learning algorithms play an important role in this area. In embedded systems for CE products, the architectures of traditional CPUs and ASICs cannot deliver both flexibility and performance, so new design methodologies and solutions need to be explored for next-generation applications.
In this thesis, hardware architectures for the Gaussian Mixture Model (GMM) and multi-class Support Vector Machine (SVM) machine learning algorithms are proposed to accelerate image semantic processing and concept feature extraction in multimedia content analysis. With the local-to-global concept feature extraction method, the low-level features of image patches are analyzed by a machine learning algorithm, such as GMM or SVM, and each patch is classified into one of the pre-defined concept classes. After the classification results of the patches from the whole image are gathered, the semantic concepts can be extracted to represent the image. This mapping process bridges the gap between low-level feature representation and human perception. Since the computations involved in this process are intensive and burden the resource-limited embedded system, the proposed hardware acceleration schemes are used to address this problem. The proposed GMM hardware architecture provides high speed-up and good flexibility by combining parallelism and the folding design technique at different levels. The design completes the computation for one Gaussian in a single cycle. Since each GMM classifier, which models the data of one class, may have a different number of Gaussian distributions, it is more efficient to fold the hardware at the class level so that it computes one Gaussian at a time. This gives the user more flexibility in setting the number of Gaussians per class and the number of classes. The proposed multi-class SVM hardware architecture is designed on the basis of thorough analyses to balance hardware cost against real-time processing demands. The design is further optimized with a reconfigurable structure that provides different operating modes to satisfy users' various demands and make good use of the memories. This flexibility covers three kernel functions, a wide range of parameter values, adjustable bit precision with run-length encoding, and two operating-speed modes. When the number of support vectors is too large to be stored, the proposed reload scheme can be adopted to handle this scenario.
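The local-to-global flow described above can be illustrated in software with a minimal sketch. It is not the thesis hardware design: the diagonal covariance form, the two-dimensional features, the class names, and all function names and model values below are assumptions made only for illustration. Each class has its own number of Gaussians, the per-class likelihood is accumulated one Gaussian at a time (loosely mirroring the class-level folding), and per-patch decisions are gathered into image-level concept fractions.

```python
import math
from collections import Counter


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (assumed form)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def class_log_likelihood(x, components):
    """Log-likelihood of x under one class's mixture.

    components is a list of (weight, mean, var) tuples; the list length may
    differ per class. Terms are evaluated one Gaussian at a time and combined
    with a numerically stable log-sum-exp.
    """
    terms = [math.log(w) + log_gaussian(x, mean, var) for w, mean, var in components]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_patch(x, models):
    """Local step: assign a patch to the class with the highest likelihood."""
    return max(models, key=lambda name: class_log_likelihood(x, models[name]))


def image_concepts(patches, models):
    """Global step: fraction of patches assigned to each concept class."""
    votes = Counter(classify_patch(p, models) for p in patches)
    total = sum(votes.values())
    return {name: votes[name] / total for name in models}


# Hypothetical toy models: two concept classes with different numbers of Gaussians.
MODELS = {
    "sky": [(1.0, [0.2, 0.8], [0.01, 0.01])],
    "grass": [
        (0.5, [0.3, 0.3], [0.01, 0.01]),
        (0.5, [0.5, 0.2], [0.01, 0.01]),
    ],
}

# Hypothetical low-level feature vectors, one per image patch.
PATCHES = [[0.21, 0.79], [0.19, 0.82], [0.31, 0.29], [0.5, 0.21]]

if __name__ == "__main__":
    print(image_concepts(PATCHES, MODELS))
```

In this toy setup the resulting dictionary of concept fractions plays the role of the image-level semantic representation; a real system would use trained models and richer patch features.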
In short, the contributions of this thesis are a flexible, high-throughput GMM hardware architecture for image semantic processing and a multi-class SVM hardware architecture design methodology with an optimized reconfigurable prototype for real-time multimedia content analysis. Thorough analyses of how the reconfigurable SVM hardware architecture handles different scenarios are also presented and discussed. The contents of this thesis can be regarded as a series of solutions for implementing hardware architectures of supervised machine learning algorithms, such as GMM and SVM, for multimedia content analysis in CE products.
|