The Fast Fourier Transform (FFT) is an important algorithm in many digital signal processing applications, and it often requires a parallel implementation to achieve high throughput. In this paper, we first present SmartCell, a coarse-grained reconfigurable architecture targeted at stream processing. A SmartCell prototype integrates 64 processing elements, configurable interconnections, and dedicated instruction and data memories on a single chip, providing high-performance parallel processing while maintaining post-fabrication flexibility. Subsequently, we present a parallel FFT algorithm targeted at multi-core computing platforms. This algorithm provides an optimized data flow pattern that reduces both communication and configuration overheads. The proposed parallel FFT algorithm is then mapped onto the SmartCell prototype device. Results show that the parallel FFT implementation on SmartCell is about 14.9 and 2.7 times faster than network-on-chip (NoC) and MorphoSys implementations, respectively. SmartCell also achieves energy efficiency gains of 2.1 and 28.9 times compared with FPGA and DSP implementations, respectively.
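To illustrate why the FFT lends itself to parallel mapping, the following is a minimal Python sketch of a generic iterative radix-2 FFT. It is not the algorithm or data flow proposed in the paper. It only shows that the butterflies within one stage are mutually independent and can therefore be distributed across processing elements. The function names, the `num_pes` parameter, and the round-robin assignment are hypothetical and chosen purely for illustration.

```python
import cmath


def bit_reverse_permute(x):
    """Return a copy of x reordered by bit-reversed index (len(x) is a power of two)."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        # Reverse the 'bits'-wide binary representation of i.
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = x[i]
    return out


def fft_stagewise(x, num_pes=4):
    """Generic radix-2 decimation-in-time FFT, organised stage by stage.

    num_pes is a hypothetical number of processing elements; the inner
    loops run sequentially here and only mimic a round-robin work split.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = bit_reverse_permute([complex(v) for v in x])
    half = 1
    while half < n:
        # Enumerate the n/2 butterflies of this stage as (top, bottom, twiddle index).
        butterflies = [
            (start + k, start + k + half, k)
            for start in range(0, n, 2 * half)
            for k in range(half)
        ]
        # Butterflies in a stage touch disjoint index pairs, so each
        # (simulated) processing element can work on its share independently.
        for pe in range(num_pes):
            for top, bot, k in butterflies[pe::num_pes]:
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))
                t = w * data[bot]
                data[top], data[bot] = data[top] + t, data[top] - t
        half *= 2  # stage boundary: results are exchanged before the next stage
    return data
```

Data must be exchanged between processing elements only at stage boundaries. In general, this inter-stage exchange is where communication overhead arises in a parallel FFT, which is the kind of cost the abstract says the proposed data flow pattern reduces.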

- Publication: IEICE TRANSACTIONS on Electronics Vol.E93-C No.3 pp.407-415
- Publication Date: 2010/03/01
- Online ISSN: 1745-1353
- DOI: 10.1587/transele.E93.C.407
- Type of Manuscript: PAPER
- Category: Integrated Electronics

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.

Cao LIANG, Xinming HUANG, "Mapping Parallel FFT Algorithm onto SmartCell Coarse-Grained Reconfigurable Architecture" in IEICE TRANSACTIONS on Electronics,
vol. E93-C, no. 3, pp. 407-415, March 2010, doi: 10.1587/transele.E93.C.407.

URL: https://global.ieice.org/en_transactions/electronics/10.1587/transele.E93.C.407/_p

@ARTICLE{e93-c_3_407,
  author={Cao LIANG and Xinming HUANG},
  journal={IEICE TRANSACTIONS on Electronics},
  title={Mapping Parallel FFT Algorithm onto SmartCell Coarse-Grained Reconfigurable Architecture},
  year={2010},
  volume={E93-C},
  number={3},
  pages={407--415},
  abstract={The Fast Fourier Transform (FFT) is an important algorithm in many digital signal processing applications, and it often requires a parallel implementation to achieve high throughput. In this paper, we first present SmartCell, a coarse-grained reconfigurable architecture targeted at stream processing. A SmartCell prototype integrates 64 processing elements, configurable interconnections, and dedicated instruction and data memories on a single chip, providing high-performance parallel processing while maintaining post-fabrication flexibility. Subsequently, we present a parallel FFT algorithm targeted at multi-core computing platforms. This algorithm provides an optimized data flow pattern that reduces both communication and configuration overheads. The proposed parallel FFT algorithm is then mapped onto the SmartCell prototype device. Results show that the parallel FFT implementation on SmartCell is about 14.9 and 2.7 times faster than network-on-chip (NoC) and MorphoSys implementations, respectively. SmartCell also achieves energy efficiency gains of 2.1 and 28.9 times compared with FPGA and DSP implementations, respectively.},
  doi={10.1587/transele.E93.C.407},
  ISSN={1745-1353},
  month={March},
}

TY  - JOUR
TI  - Mapping Parallel FFT Algorithm onto SmartCell Coarse-Grained Reconfigurable Architecture
T2  - IEICE TRANSACTIONS on Electronics
SP  - 407
EP  - 415
AU  - Cao LIANG
AU  - Xinming HUANG
PY  - 2010
DO  - 10.1587/transele.E93.C.407
JO  - IEICE TRANSACTIONS on Electronics
SN  - 1745-1353
VL  - E93-C
IS  - 3
JA  - IEICE TRANSACTIONS on Electronics
Y1  - March 2010
AB  - The Fast Fourier Transform (FFT) is an important algorithm in many digital signal processing applications, and it often requires a parallel implementation to achieve high throughput. In this paper, we first present SmartCell, a coarse-grained reconfigurable architecture targeted at stream processing. A SmartCell prototype integrates 64 processing elements, configurable interconnections, and dedicated instruction and data memories on a single chip, providing high-performance parallel processing while maintaining post-fabrication flexibility. Subsequently, we present a parallel FFT algorithm targeted at multi-core computing platforms. This algorithm provides an optimized data flow pattern that reduces both communication and configuration overheads. The proposed parallel FFT algorithm is then mapped onto the SmartCell prototype device. Results show that the parallel FFT implementation on SmartCell is about 14.9 and 2.7 times faster than network-on-chip (NoC) and MorphoSys implementations, respectively. SmartCell also achieves energy efficiency gains of 2.1 and 28.9 times compared with FPGA and DSP implementations, respectively.
ER  - 