Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies
The COordinate Rotation DIgital Computer (CORDIC) algorithm is a well-known, versatile approach that is widely applied in today's SoCs, especially, but not exclusively, for digital communications. Dedicated CORDIC blocks can be implemented in deep sub-micron CMOS technologies at very low area and en...
Main Authors: | U. Vishnoi, T. G. Noll |
---|---|
Format: | Article |
Language: | deu |
Published: | Copernicus Publications, 2012-09-01 |
Series: | Advances in Radio Science |
Online Access: | http://www.adv-radio-sci.net/10/207/2012/ars-10-207-2012.pdf |
id |
doaj-5fc0d23cb0e84496b63b69dda908b825 |
---|---|
record_format |
Article |
spelling |
doaj-5fc0d23cb0e84496b63b69dda908b825; 2020-11-24T23:11:28Z; deu; Copernicus Publications; Advances in Radio Science; ISSN 1684-9965, 1684-9973; 2012-09-01; vol. 10, pp. 207-213; doi:10.5194/ars-10-207-2012; Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies; U. Vishnoi, T. G. Noll (both: Chair of Electrical Engineering and Computer Systems, RWTH Aachen University, Aachen, Germany); http://www.adv-radio-sci.net/10/207/2012/ars-10-207-2012.pdf |
collection |
DOAJ |
language |
deu |
format |
Article |
sources |
DOAJ |
author |
U. Vishnoi T. G. Noll |
spellingShingle |
U. Vishnoi T. G. Noll Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies Advances in Radio Science |
author_facet |
U. Vishnoi T. G. Noll |
author_sort |
U. Vishnoi |
title |
Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies |
title_short |
Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies |
title_full |
Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies |
title_fullStr |
Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies |
title_full_unstemmed |
Area- and energy-efficient CORDIC accelerators in deep sub-micron CMOS technologies |
title_sort |
area- and energy-efficient cordic accelerators in deep sub-micron cmos technologies |
publisher |
Copernicus Publications |
series |
Advances in Radio Science |
issn |
1684-9965 1684-9973 |
publishDate |
2012-09-01 |
description |
The COordinate Rotation DIgital Computer (CORDIC) algorithm is a well-known,
versatile approach that is widely applied in today's SoCs, especially, but
not exclusively, for digital communications. Dedicated CORDIC blocks can be
implemented in deep sub-micron CMOS technologies at very low area and energy
costs and are attractive as hardware accelerators for Application Specific
Instruction Processors (ASIPs), thereby overcoming the well-known energy
vs. flexibility conflict. Optimizing Global Navigation Satellite System
(GNSS) receivers to reduce hardware complexity is an important current
research topic. In such receivers, CORDIC accelerators can be used for
digital baseband processing (fixed-point) and for Position-Velocity-Time
estimation (floating-point). A micro-architecture well suited to such
applications is presented. The architecture is parameterized by the
wordlengths and the number of iterations and can easily be extended to
floating-point data formats. Moreover, area can be traded for throughput by
partially or even fully unrolling the iterations, with the pipelining
organized as one CORDIC iteration per cycle. From the architectural
description, the macro layout can be generated fully automatically using an
in-house datapath generator tool. Because the adders and shifters play an
important role in the CORDIC block, they must be carefully optimized for
high area and energy efficiency in the underlying technology. For this
purpose, carry-select adders and logarithmic shifters were chosen. Device
dimensioning was automatically optimized with respect to dynamic and static
power, area, and performance using the in-house tool. The fully sequential
CORDIC block for fixed-point digital baseband processing has a wordlength of
16 bits, requires 5232 transistors, is implemented in a 40-nm CMOS
technology, and occupies a silicon area of only 1560 μm². The maximum clock
frequency from circuit simulation of the extracted netlist is 768 MHz under
typical and 463 MHz under worst-case technology and application corner
conditions. Simulated dynamic power dissipation is 0.24 μW MHz⁻¹ at 0.9 V;
static power is 38 μW in the slow corner, 65 μW in the typical corner, and
518 μW in the fast corner. The latter can be reduced by 43% in a 40-nm CMOS
technology using 0.5 V reverse back-bias. These features are compared with
results from different design styles as well as with an implementation in
28-nm CMOS technology. Interestingly, in the latter case area scales as
expected, but worst-case performance and energy no longer scale well. |
url |
http://www.adv-radio-sci.net/10/207/2012/ars-10-207-2012.pdf |
work_keys_str_mv |
AT uvishnoi areaandenergyefficientcordicacceleratorsindeepsubmicroncmostechnologies AT tgnoll areaandenergyefficientcordicacceleratorsindeepsubmicroncmostechnologies |
_version_ |
1725604324246028288 |
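
The abstract describes a sequential CORDIC datapath built from adders and shifters and parameterized by wordlength and iteration count. The following is a minimal software sketch of that idea, not the authors' micro-architecture: it assumes rotation mode, a fixed-point format with `wordlength - 2` fractional bits, and gain compensation done once at the end. The function name `cordic_rotate`, its parameters, and the default of 14 iterations are hypothetical choices made for illustration; overflow and saturation of a real 16-bit datapath are not modelled.

```python
import math


def cordic_rotate(x, y, angle, wordlength=16, iterations=14):
    """Rotate the vector (x, y) by `angle` radians using shift-and-add steps.

    Hypothetical sketch: inputs are floats with magnitude at most 1,
    |angle| <= pi/2, and the datapath is modelled with Python integers
    carrying `wordlength - 2` fractional bits.
    """
    frac_bits = wordlength - 2
    scale = 1 << frac_bits

    # Arctangent lookup table, one entry per iteration, in fixed-point format.
    atan_table = [round(math.atan(2.0 ** -i) * scale) for i in range(iterations)]

    # CORDIC gain K = prod(sqrt(1 + 2^(-2i))); compensated once at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    xi = round(x * scale)
    yi = round(y * scale)
    zi = round(angle * scale)

    # One loop pass corresponds to one CORDIC iteration (one cycle in a
    # fully sequential design): two shifts, three add/subtract operations.
    for i in range(iterations):
        # Rotation direction drives the residual angle zi towards zero.
        d = 1 if zi >= 0 else -1
        # Python's >> on ints is an arithmetic shift, as on two's complement.
        x_next = xi - d * (yi >> i)
        y_next = yi + d * (xi >> i)
        zi -= d * atan_table[i]
        xi, yi = x_next, y_next

    # Convert back to floats and remove the CORDIC gain.
    return xi / (scale * gain), yi / (scale * gain)
```

For example, `cordic_rotate(1.0, 0.0, math.pi / 4)` returns a pair close to (cos 45°, sin 45°). Raising `iterations` or `wordlength` tightens the approximation, which mirrors the accuracy versus cost trade-off that the parameterization in the abstract exposes.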