Summary: | 碩士 === 國立臺灣大學 === 電機工程學研究所 === 97 === Modular arithmetic is the core operation of many cryptographic applications such as RSA, elliptic curve cryptography, and the elliptic curve method of integer factorization (ECM). The speed of ECM, as well as many other applications, is directly determined by the capability of executing modular operations.
ECM is one of the fastest algorithms for factoring large general numbers. It is also an important subroutine in the number field sieves (NFS), the champion method for factoring
RSA numbers and solving discrete logarithm problems. ECM is of practical interest today because it is becoming more and more important in NFS as the problem size increases.
Graphics processing unit (GPU) represents a new class of massively parallel computer architectures that provide immense aggregated throughput of computation. These massively parallel computers are scalable and efficient at exploiting the enormous transistor budget afforded by Moore’s law. GPU in particular has a competitive edge in terms of price-performance ratio, thanks to video game industry.
In this thesis, we present MAGE, a GPU-based, high-throughput, massively parallel modular arithmetic unit. With the design and techniques presented in this thesis, we are able to deliver more than 480 million 210-bit modular multiplications per second on a single graphics card, setting a new record in terms of the highest single-chip performance, the highest performance per dollar, as well as the highest performance per watt among all known implementations up to date.
We also report a record-setting ECM implementation based on MAGE, delivering a throughput of 4928 curves per second for ECM stage 1 with B1 = 8192 for 210-bit integers on a single graphics card. With the help of MAGE, our ECM implementation outperforms the best previous results published in EUROCRYPT 2009 by a factor of more than four
times.
|