Design of a Scalable Cryptographic Processor and Supporting Platforms

PhD === National Tsing Hua University === Department of Electrical Engineering === 97 === With the dramatic growth of wired and wireless communication applications, information security becomes increasingly important, as these applications usually bring more security threats. To secure personal and private information on public, unprotected networks,...

Bibliographic Details
Main Authors: Wang, Chen-Hsing, 王振興
Other Authors: Wu, Cheng-Wen
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/48159574837490491132
id ndltd-TW-097NTHU5442062
record_format oai_dc
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Tsing Hua University === Department of Electrical Engineering === 97 === With the dramatic growth of wired and wireless communication applications, information security becomes increasingly important, as these applications usually bring more security threats. To secure personal and private information on public, unprotected networks, cryptography is one of the safest and most reliable methods, owing to its robust mathematical foundation. However, the underlying mathematical computation usually results in high hardware complexity when the cryptographic algorithms are implemented in hard-wired logic, or high time complexity when they are implemented in software. A good hardware or software implementation methodology is thus highly desirable. In this thesis, we propose a highly efficient multi-mode multiplier supporting prime field, polynomial field, and matrix-vector multiplications, based on an asymmetric word-based Montgomery multiplication algorithm. Since many asymmetric-key cryptographic algorithms are composed of modular exponentiation, modular inversion, or modular multiplication, they can be well addressed by the proposed multi-mode multiplier. In addition, because the multi-mode multiplier is based on a word-based architecture, it supports a scalable key length provided the data storage is large enough, and it offers a flexible trade-off between performance and area cost in the multiplier circuit design. We further extend the multi-mode multiplier to handle the major operation of AES (Advanced Encryption Standard), i.e., matrix-vector multiplication. We apply composite field arithmetic to the AES round function to reduce its most area-consuming step, i.e., SubBytes. With composite field arithmetic, the SubBytes step is partitioned into multiplicative inversion over GF((2^4)^2) and some matrix-vector multiplications. The order of the four AES steps is rearranged so that the matrices for different steps can be merged into a single matrix.
Finally, the AES round is unrolled and recombined so that more matrices can be merged. After the decomposition and regrouping, the original AES round is clearly divided into two parts: matrix-vector multiplications and non-matrix-vector multiplications, where the non-matrix-vector multiplication part accounts for only 11% of the total gates in the preliminary analysis. We choose a size of 128 x 32 bits for the multi-mode multiplier circuit, as it gives the greatest benefit for both AES and asymmetric-key cryptographic algorithms. The proposed 128 x 32-bit multi-mode multiplier provides a throughput of 441 Mbps and 511 Mbps for 256-bit operands over GF(p) and GF(2^n), respectively, at a clock rate of 100 MHz. With 21.93K additional gates for AES (to implement the non-matrix-vector multiplication part), it provides a throughput of 1.28 Gbps, 1.06 Gbps, and 0.91 Gbps for 128-, 192-, and 256-bit keys, respectively. Following the platform-based design methodology, we also propose a generic crypto-SOC and four supporting platforms, where the crypto-SOC is suitable for a wide range of security-related protocols in wired and wireless network applications. The four platforms, i.e., the architecture platform, EDA platform, DFT platform, and prototyping platform, help users develop SOC products more systematically and efficiently. The architecture platform integrates a general-purpose processor and an in-house crypto-processor through a commercial bus system, i.e., AMBA (Advanced Micro-controller Bus Architecture). The in-house crypto-processor integrates four crypto-engines (AES, RSA, HMAC-MD5/SHA-1, and a Random Number Generator (RNG)) and an intelligent crypto-DMA controller through an AHB (Advanced High-performance Bus). Here, the AES, HMAC-MD5/SHA-1, and RNG engines are contributed by our group members. The crypto-DMA not only manages bulk data movement between the internal crypto-engines and external RAMs, but also handles sophisticated flow control of the crypto-engines.
The EDA platform provides a complete CAD tool environment that integrates in-house tools and commercial EDA tools into a core-based design flow. The DFT platform provides an SOC test integration methodology, mainly based on two of our in-house tools: STEAC (SOC TEst Aid Console) and BRAINS (BIST for RAM in Seconds). The prototyping platform accelerates functional verification of the proposed crypto-SOC when new crypto-engines or components are integrated. Based on the proposed crypto-SOC and the four supporting platforms, several prototype chips for different applications have been designed and fabricated in different CMOS processes, demonstrating the feasibility and effectiveness of the proposed platforms.
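The word-based Montgomery multiplication mentioned in the description can be illustrated with a minimal software sketch. This is a generic word-serial Montgomery product over GF(p), not the thesis's asymmetric 128 x 32-bit hardware datapath; the function names `mont_mul` and `mod_mul`, the 32-bit default word size, and the use of Python integers as operand storage are all assumptions made for illustration.

```python
def mont_mul(a, b, n, word_bits=32, num_words=None):
    """Return a * b * R^-1 mod n, where R = 2^(word_bits * num_words).

    Hypothetical helper: assumes n is odd and 0 <= a, b < n.
    One operand (b) is consumed one word at a time, so the same loop
    handles longer keys simply by running more iterations.
    """
    if n % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    if num_words is None:
        num_words = -(-n.bit_length() // word_bits)  # ceiling division
    base = 1 << word_bits
    mask = base - 1
    # n_prime = -n^-1 mod 2^word_bits (modular inverse via pow needs Python 3.8+)
    n_prime = (-pow(n, -1, base)) % base
    t = 0
    for i in range(num_words):
        b_i = (b >> (i * word_bits)) & mask    # next word of b
        t += a * b_i                           # partial product
        m = ((t & mask) * n_prime) & mask      # makes the low word of t + m*n zero
        t = (t + m * n) >> word_bits           # exact division by 2^word_bits
    # t stays below 2n throughout, so one conditional subtraction is enough
    return t - n if t >= n else t


def mod_mul(a, b, n, word_bits=32):
    """Plain a * b mod n, computed through the Montgomery domain."""
    k = -(-n.bit_length() // word_bits)
    r2 = pow(1 << (word_bits * k), 2, n)           # R^2 mod n
    a_m = mont_mul(a % n, r2, n, word_bits, k)     # a * R mod n
    b_m = mont_mul(b % n, r2, n, word_bits, k)     # b * R mod n
    ab_m = mont_mul(a_m, b_m, n, word_bits, k)     # a * b * R mod n
    return mont_mul(ab_m, 1, n, word_bits, k)      # strip the factor R
```

Changing `word_bits` changes only the number of loop iterations, which loosely mirrors the performance/area trade-off the description attributes to a word-based architecture.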
author2 Wu, Cheng-Wen
author_facet Wu, Cheng-Wen
Wang, Chen-Hsing
王振興
author Wang, Chen-Hsing
王振興
spellingShingle Wang, Chen-Hsing
王振興
Design of a Scalable Cryptographic Processor and Supporting Platforms
author_sort Wang, Chen-Hsing
title Design of a Scalable Cryptographic Processor and Supporting Platforms
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/48159574837490491132
spelling ndltd-TW-097NTHU5442062 2015-11-13T04:08:49Z http://ndltd.ncl.edu.tw/handle/48159574837490491132 Design of a Scalable Cryptographic Processor and Supporting Platforms 可調式密碼處理器設計與支援平台 Wang, Chen-Hsing 王振興 PhD National Tsing Hua University Department of Electrical Engineering 97 Wu, Cheng-Wen 吳誠文 2009 thesis 101 en_US