Design and implementation of fast and hardware‐efficient parallel processing elements to set full and partial permutations in Beneš networks

Bibliographic Details
Main Authors: Labson Koloko, Takahiro Matsumoto, Hitoshi Obara
Format: Article
Language: English
Published: Wiley 2021-06-01
Series: The Journal of Engineering
Online Access: https://doi.org/10.1049/tje2.12037
Description
Summary: A new design for parallel and distributed processing elements (PEs) is proposed to configure Beneš networks, based on a novel parallel algorithm that realises full and partial permutations in a unified manner with very little overhead in time and hardware. Owing to its distributed architecture, the proposed design reduces the hardware complexity of the PEs from O(N^2) to O(N (log_2 N)^2). Asynchronous operation is introduced in parts of the design to reduce the time complexity per PE stage to O(1) for N up to a certain size, whereas conventional algorithms take O(log_2 N) time per PE stage. Prototype parallel and distributed PEs were constructed in a field-programmable gate array to evaluate performance for switch sizes of N = 4 to 32. The experimental results show that the proposed design outperforms a recent method by a factor of at least several in both hardware and processing-time complexity.
ISSN: 2051-3305