Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies

The COordinate Rotate DIgital Computer (CORDIC) algorithm is a well-known, versatile approach that is widely applied in today's SoCs, especially but not exclusively for digital communications. Dedicated CORDIC blocks can be implemented in deep sub-micron CMOS technologies at very low area and en...

Bibliographic Details
Main Authors: U. Vishnoi, T. G. Noll
Format: Article
Language: deu
Published: Copernicus Publications 2012-09-01
Series: Advances in Radio Science
Online Access: http://www.adv-radio-sci.net/10/207/2012/ars-10-207-2012.pdf
id doaj-5fc0d23cb0e84496b63b69dda908b825
record_format Article
spelling doaj-5fc0d23cb0e84496b63b69dda908b825; 2020-11-24T23:11:28Z; deu; Copernicus Publications; Advances in Radio Science; ISSN 1684-9965, 1684-9973; 2012-09-01; vol. 10, pp. 207-213; doi:10.5194/ars-10-207-2012; Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies; U. Vishnoi, T. G. Noll (both: Chair of Electrical Engineering and Computer Systems, RWTH Aachen University, Aachen, Germany); http://www.adv-radio-sci.net/10/207/2012/ars-10-207-2012.pdf
collection DOAJ
language deu
format Article
sources DOAJ
author U. Vishnoi
T. G. Noll
title Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies
publisher Copernicus Publications
series Advances in Radio Science
issn 1684-9965
1684-9973
publishDate 2012-09-01
description The COordinate Rotate DIgital Computer (CORDIC) algorithm is a well-known, versatile approach that is widely applied in today's SoCs, especially but not exclusively for digital communications. Dedicated CORDIC blocks can be implemented in deep sub-micron CMOS technologies at very low area and energy costs and are attractive as hardware accelerators for Application Specific Instruction Processors (ASIPs), thereby overcoming the well-known energy vs. flexibility conflict. Optimizing Global Navigation Satellite System (GNSS) receivers to reduce hardware complexity is currently an important research topic. In such receivers, CORDIC accelerators can be used for digital baseband processing (fixed-point) and in Position-Velocity-Time estimation (floating-point). A micro-architecture well suited to such applications is presented. This architecture is parameterized by the wordlengths and the number of iterations and can easily be extended to a floating-point data format. Moreover, area can be traded for throughput by partially or even fully unrolling the iterations, with the pipelining organized as one CORDIC iteration per cycle. From the architectural description, the macro layout can be generated fully automatically using an in-house datapath generator tool. Because the adders and shifters play an important role in the CORDIC block, they must be carefully optimized for high area and energy efficiency in the underlying technology; carry-select adders and logarithmic shifters were chosen for this purpose. Device dimensioning was automatically optimized with respect to dynamic and static power, area, and performance using the in-house tool. The fully sequential CORDIC block for fixed-point digital baseband processing has a wordlength of 16 bits, requires 5232 transistors, is implemented in a 40-nm CMOS technology, and occupies a silicon area of only 1560 μm². The maximum clock frequency from circuit simulation of the extracted netlist is 768 MHz under typical and 463 MHz under worst-case technology and application corner conditions. Simulated dynamic power dissipation is 0.24 μW MHz⁻¹ at 0.9 V; static power is 38 μW in the slow corner, 65 μW in the typical corner, and 518 μW in the fast corner. The latter can be reduced by 43% in the 40-nm CMOS technology using a 0.5 V reverse back-bias. These results are compared with those from different design styles as well as with an implementation in 28-nm CMOS technology. Interestingly, in the latter case area scales as expected, but worst-case performance and energy no longer scale well.
url http://www.adv-radio-sci.net/10/207/2012/ars-10-207-2012.pdf
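
The shift-and-add iteration that the description refers to can be illustrated with a minimal software sketch. This is not the authors' micro-architecture: the rotation mode, the Q2.14 fixed-point format, the iteration count of 14, and all names (`cordic_rotate`, `wrap`, `ATAN_TABLE`, `INV_GAIN`) are assumptions chosen for illustration. Only the 16-bit wordlength is taken from the record.

```python
import math

WORDLENGTH = 16   # wordlength stated in the abstract
FRAC_BITS = 14    # assumption: Q2.14 format, not given in the abstract
ITERATIONS = 14   # assumption: iteration count, not given in the abstract

SCALE = 1 << FRAC_BITS

# Elementary rotation angles atan(2^-i), quantized to the fixed-point format.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * SCALE) for i in range(ITERATIONS)]

# CORDIC gain K = prod(sqrt(1 + 2^-2i)); the start vector is pre-scaled by 1/K.
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))
INV_GAIN = round(SCALE / GAIN)


def wrap(value):
    """Wrap an integer to a signed WORDLENGTH-bit two's-complement value."""
    value &= (1 << WORDLENGTH) - 1
    if value >> (WORDLENGTH - 1):
        value -= 1 << WORDLENGTH
    return value


def cordic_rotate(angle):
    """Approximate (cos, sin) of angle in radians, for |angle| <= pi/2."""
    x, y, z = INV_GAIN, 0, round(angle * SCALE)
    for i in range(ITERATIONS):
        # One loop pass models one iteration: two shifts, three add/subtracts.
        dx, dy = y >> i, x >> i   # arithmetic right shifts
        if z >= 0:
            x, y, z = wrap(x - dx), wrap(y + dy), wrap(z - ATAN_TABLE[i])
        else:
            x, y, z = wrap(x + dx), wrap(y - dy), wrap(z + ATAN_TABLE[i])
    return x / SCALE, y / SCALE


if __name__ == "__main__":
    c, s = cordic_rotate(math.pi / 6)
    print(f"cos = {c:.4f}, sin = {s:.4f}")
```

In this sketch each loop pass stands for one clock cycle of a fully sequential block, which reuses the same adders and shifter for every iteration. Unrolling would correspond to instantiating several of these passes as separate pipeline stages, trading area for throughput as the description notes.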