Summary: | Doctoral === National Taiwan University === Graduate Institute of Electronics Engineering === 97 === As video services become popular on portable devices, power has become the primary design issue for video coders. H.264/AVC is an emerging video coding standard that provides outstanding coding performance, with 25-45% bit-rate savings over MPEG-4, and is thus suitable for portable multimedia applications. Low power consumption is a first-class design issue for portable devices, where the available power is limited. In addition, power scalability is important because it enables such devices to trade off compression performance against power consumption according to power levels and application requirements. In the first part of this thesis, efficient techniques that enable a low-power and power-scalable H.264 encoder are presented. First, motion estimation (ME) normally consumes about 85% of the encoder power. To reduce power consumption, new data reuse (DR) schemes are implemented in the parallel architectures for fast ME algorithms. Second, low-power techniques have to be integrated across different design levels. This is not easy, because fast ME algorithms are difficult to realize on parallel architectures due to their irregular and sequential nature. Furthermore, gated-clock techniques at the circuit level cannot be supported effectively without system-level considerations. Finally, to enable power scalability on an ASIC encoder, flexibility must be explored in the system and module architectures along with a computationally scalable algorithm. To overcome these problems, hardware-oriented algorithms are proposed that consider the data reuse issue of ME at the algorithm level. Content-aware strategies are then used to reduce computation while maintaining coding performance. Suitable parallel architectures are presented that achieve good data reuse and thereby reduce data access power. 
The proposed flexible system architecture improves hardware efficiency in area through macroblock (MB) pipeline retiming and in power through fine-grained clock gating. Finally, a 2.8 to 67.2 mW H.264 encoder is implemented on a 12.8 $mm^2$ die in 0.18 $\mu m$ CMOS technology. The proposed parallel architectures, together with the fast algorithms and data reuse schemes, enable 77.9% power savings. Power scalability is provided through a flexible system hierarchy that supports content-aware algorithms and module-wise clock gating.
Successful proof-of-concept laboratory experiments on cortically controlled motor prostheses, brain pacemakers and hippocampal prostheses motivate continued development of neural prosthetic systems. Advances in implantable electrode arrays and miniaturized multi-channel recording ICs make long-duration, wireless, closed-loop experiments on freely moving subjects feasible. To realize clinically viable neural prostheses, the bulk associated with the external systems has to be eliminated. A miniaturized processing and control system that interfaces the recording ICs with the actuators in real time is therefore required. Several design issues are critical. Low power and small area are the two primary requirements for implantable devices. Significant computational capability is needed to handle multi-channel neural data in real time. Programmability is essential because test subjects and application requirements vary. The interfaces that provide real-time actuation feedback should be integrated. A systematic hardware-software hierarchy is essential to facilitate integration and to provide flexible control over the functional blocks. In the second part of the thesis, a biomedical MPSoC is proposed that processes multi-channel neural signals in real time and translates them into stimulation currents, built on a software-programmable and hardware-accelerated platform for implantable closed-loop neuroprostheses. The on-chip platform comprises multiple heterogeneous processors with application-specific functionality that reflects the needs of neural prostheses. Dedicated processors (DPs) for spike sorting and seizure detection are designed to accelerate the computationally intensive processing tasks, using customized parallel architectures and memory hierarchies, for cortically controlled prosthetics and epileptic brain pacemakers. General-purpose processors (GPPs) are embedded to provide system programmability and flexibility. 
A programmable current stimulation interface is integrated to provide real-time application feedback. According to the implementation results, the 28.3 $mm^2$ chip in 0.35 $\mu m$ CMOS consumes 4.1, 3.5 and 2.9 mW for three different neuroprosthetic applications. Synthesis results in a 90 nm process further demonstrate a version of the chip with lower power and area cost.
|