Multi-FPGA systems have gained attention because of their high performance and power efficiency. A multi-FPGA system called Flow-in-Cloud (FiC) is currently being developed as an accelerator of multi-access edge computing (MEC). FiC consists of multiple mid-range FPGAs tightly connected by high-speed serial links. Since time-critical jobs are assumed in MEC, a circuit-switched network with static time-division multiplexing (STDM) switches has been implemented on FiC. This paper investigates techniques of enhancing the interconnection performance of FiC. Unlike switching fabrics for Network on Chips or parallel machines, economical multi-FPGA systems, such as FiC, use Xilinx Aurora IP and FireFly cables with multiple lanes. We adopted the link aggregation and the slot distribution for using multiple lanes. To mitigate the bottleneck between an STDM switch and user logic, we also propose a multi-ejection STDM switch. We evaluated various combinations of our techniques by using three practical applications on an FiC prototype with 24 boards. When the number of slots is large and transferred data size is small, the slot distribution was sometimes more effective, while the link aggregation was superior for other most cases. Our multi-ejection STDM switch mitigated the bottleneck in ejection ports and successfully reduced the number of time slots. As a result, by combining the link aggregation and multi-ejection STDM switch, communication performance improved up to 7.50 times with few additional resources. Although the performance of the fast Fourier transform with the highest communication ratio could not be enhanced by using multiple boards when a lane was used, 1.99 times performance improvement was achieved by using 8 boards with four lanes and our multi-ejection switch compared with a board.
Kohei ITO
Keio University
Kensuke IIZUKA
Keio University
Kazuei HIRONAKA
Keio University
Yao HU
National Institute of Informatics
Michihiro KOIBUCHI
National Institute of Informatics
Hideharu AMANO
Keio University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Kohei ITO, Kensuke IIZUKA, Kazuei HIRONAKA, Yao HU, Michihiro KOIBUCHI, Hideharu AMANO, "Improving the Performance of Circuit-Switched Interconnection Network for a Multi-FPGA System" in IEICE TRANSACTIONS on Information,
vol. E104-D, no. 12, pp. 2029-2039, December 2021, doi: 10.1587/transinf.2021PAP0002.
Abstract: Multi-FPGA systems have gained attention because of their high performance and power efficiency. A multi-FPGA system called Flow-in-Cloud (FiC) is currently being developed as an accelerator of multi-access edge computing (MEC). FiC consists of multiple mid-range FPGAs tightly connected by high-speed serial links. Since time-critical jobs are assumed in MEC, a circuit-switched network with static time-division multiplexing (STDM) switches has been implemented on FiC. This paper investigates techniques of enhancing the interconnection performance of FiC. Unlike switching fabrics for Network on Chips or parallel machines, economical multi-FPGA systems, such as FiC, use Xilinx Aurora IP and FireFly cables with multiple lanes. We adopted the link aggregation and the slot distribution for using multiple lanes. To mitigate the bottleneck between an STDM switch and user logic, we also propose a multi-ejection STDM switch. We evaluated various combinations of our techniques by using three practical applications on an FiC prototype with 24 boards. When the number of slots is large and transferred data size is small, the slot distribution was sometimes more effective, while the link aggregation was superior for other most cases. Our multi-ejection STDM switch mitigated the bottleneck in ejection ports and successfully reduced the number of time slots. As a result, by combining the link aggregation and multi-ejection STDM switch, communication performance improved up to 7.50 times with few additional resources. Although the performance of the fast Fourier transform with the highest communication ratio could not be enhanced by using multiple boards when a lane was used, 1.99 times performance improvement was achieved by using 8 boards with four lanes and our multi-ejection switch compared with a board.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021PAP0002/_p
Copy
@ARTICLE{e104-d_12_2029,
author={Kohei ITO, Kensuke IIZUKA, Kazuei HIRONAKA, Yao HU, Michihiro KOIBUCHI, Hideharu AMANO, },
journal={IEICE TRANSACTIONS on Information},
title={Improving the Performance of Circuit-Switched Interconnection Network for a Multi-FPGA System},
year={2021},
volume={E104-D},
number={12},
pages={2029-2039},
abstract={Multi-FPGA systems have gained attention because of their high performance and power efficiency. A multi-FPGA system called Flow-in-Cloud (FiC) is currently being developed as an accelerator of multi-access edge computing (MEC). FiC consists of multiple mid-range FPGAs tightly connected by high-speed serial links. Since time-critical jobs are assumed in MEC, a circuit-switched network with static time-division multiplexing (STDM) switches has been implemented on FiC. This paper investigates techniques of enhancing the interconnection performance of FiC. Unlike switching fabrics for Network on Chips or parallel machines, economical multi-FPGA systems, such as FiC, use Xilinx Aurora IP and FireFly cables with multiple lanes. We adopted the link aggregation and the slot distribution for using multiple lanes. To mitigate the bottleneck between an STDM switch and user logic, we also propose a multi-ejection STDM switch. We evaluated various combinations of our techniques by using three practical applications on an FiC prototype with 24 boards. When the number of slots is large and transferred data size is small, the slot distribution was sometimes more effective, while the link aggregation was superior for other most cases. Our multi-ejection STDM switch mitigated the bottleneck in ejection ports and successfully reduced the number of time slots. As a result, by combining the link aggregation and multi-ejection STDM switch, communication performance improved up to 7.50 times with few additional resources. Although the performance of the fast Fourier transform with the highest communication ratio could not be enhanced by using multiple boards when a lane was used, 1.99 times performance improvement was achieved by using 8 boards with four lanes and our multi-ejection switch compared with a board.},
keywords={},
doi={10.1587/transinf.2021PAP0002},
ISSN={1745-1361},
month={December},}
Copy
TY - JOUR
TI - Improving the Performance of Circuit-Switched Interconnection Network for a Multi-FPGA System
T2 - IEICE TRANSACTIONS on Information
SP - 2029
EP - 2039
AU - Kohei ITO
AU - Kensuke IIZUKA
AU - Kazuei HIRONAKA
AU - Yao HU
AU - Michihiro KOIBUCHI
AU - Hideharu AMANO
PY - 2021
DO - 10.1587/transinf.2021PAP0002
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2021
AB - Multi-FPGA systems have gained attention because of their high performance and power efficiency. A multi-FPGA system called Flow-in-Cloud (FiC) is currently being developed as an accelerator of multi-access edge computing (MEC). FiC consists of multiple mid-range FPGAs tightly connected by high-speed serial links. Since time-critical jobs are assumed in MEC, a circuit-switched network with static time-division multiplexing (STDM) switches has been implemented on FiC. This paper investigates techniques of enhancing the interconnection performance of FiC. Unlike switching fabrics for Network on Chips or parallel machines, economical multi-FPGA systems, such as FiC, use Xilinx Aurora IP and FireFly cables with multiple lanes. We adopted the link aggregation and the slot distribution for using multiple lanes. To mitigate the bottleneck between an STDM switch and user logic, we also propose a multi-ejection STDM switch. We evaluated various combinations of our techniques by using three practical applications on an FiC prototype with 24 boards. When the number of slots is large and transferred data size is small, the slot distribution was sometimes more effective, while the link aggregation was superior for other most cases. Our multi-ejection STDM switch mitigated the bottleneck in ejection ports and successfully reduced the number of time slots. As a result, by combining the link aggregation and multi-ejection STDM switch, communication performance improved up to 7.50 times with few additional resources. Although the performance of the fast Fourier transform with the highest communication ratio could not be enhanced by using multiple boards when a lane was used, 1.99 times performance improvement was achieved by using 8 boards with four lanes and our multi-ejection switch compared with a board.
ER -