High-Performance Computation in Residue Number System Using Floating-Point Arithmetic

Residue number system (RNS) is known for its parallel arithmetic and has been used in recent decades in various important applications, from digital signal processing and deep neural networks to cryptography and high-precision computation. However, comparison, sign identification, overflow detection...

Bibliographic Details
Main Author: Konstantin Isupov
Format: Article
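Language: English
Published: MDPI AG 2021-01-01
Series: Computation
Subjects: residue number system; digital arithmetic; high-performance computing; data-parallel primitives; graphics processing units
Online Access: https://www.mdpi.com/2079-3197/9/2/9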
id doaj-825ddfd1dc0a41a380bf9ad025e85818
issn 2079-3197
doi 10.3390/computation9020009
affiliation Department of Electronic Computing Machines, Vyatka State University, 610000 Kirov, Russia
collection DOAJ
description The residue number system (RNS) is known for its parallel arithmetic and has been used in recent decades in a range of important applications, from digital signal processing and deep neural networks to cryptography and high-precision computation. However, comparison, sign identification, overflow detection, and division are still hard to implement in RNS. For such operations, most methods proposed in the literature support only small dynamic ranges (up to several tens of bits), so they are suitable only for low-precision applications. We recently proposed a method that supports arbitrary moduli sets with cryptographically sized dynamic ranges of up to several thousand bits. The practical advantage of our method over existing ones is that it relies only on very fast standard floating-point operations, so it is suitable for multiple-precision applications and can be implemented efficiently on many general-purpose platforms that support IEEE 754 arithmetic. In this paper, we further improve this method and demonstrate that it can be applied to implement efficient data-parallel primitives operating in the RNS domain, namely finding the maximum element of an array of RNS numbers on graphics processing units. Our experimental results on an NVIDIA RTX 2080 GPU show that, for random residues and a 128-moduli set with a 2048-bit dynamic range, the proposed implementation reduces the running time by a factor of 39 and the memory consumption by a factor of 13 compared to an implementation based on mixed-radix conversion.
topic residue number system
digital arithmetic
high-performance computing
data-parallel primitives
graphics processing units