The search functionality is under construction.

Author Search Result

[Author] Akram BEN AHMED(4hit)

1-4hit
  • A Multi-Tenant Resource Management System for Multi-FPGA Systems

    Miho YAMAKURA  Ryousei TAKANO  Akram BEN AHMED  Midori SUGAYA  Hideharu AMANO  

     
    PAPER

      Pubricized:
    2021/10/08
      Vol:
    E104-D No:12
      Page(s):
    2078-2088

    FPGA (Field Programmable Gate Array) based accelerators are attracting significant interest in cloud computing systems. Combining multi-FPGA systems with cloud computing brings a new perspective to the reconfigurable computing research. However, the multi-tenancy of a multi-FPGA system has not been fully discussed in the previous researches. In this paper, we propose a multi-tenant resource management system, named FiC-RM, for a multi-FPGA cloud system. FiC-RM provides users with a set of FPGA resources according to their requirements and allows them to exclusively access FPGA boards and the interconnection network. To achieve this, we propose a placement algorithm which is a key to efficiently share the limited resources. We demonstrate FiC-RM controls a practical scale multi-FPGA system. Moreover, Our simulation study shows that our placement algorithm achieved 3 to 4% improvement in the average resource usage and a 20-second reduction in the response time, compared to other existing naive algorithms.

  • Remote Dynamic Reconfiguration of a Multi-FPGA System FiC (Flow-in-Cloud)

    Kazuei HIRONAKA  Kensuke IIZUKA  Miho YAMAKURA  Akram BEN AHMED  Hideharu AMANO  

     
    PAPER-Computer System

      Pubricized:
    2021/05/12
      Vol:
    E104-D No:8
      Page(s):
    1321-1331

    Multi-FPGA systems have been receiving a lot of attention as a low cost and energy efficient system for Multi-access Edge Computing (MEC). For such purpose, a bare-metal multi-FPGA system called FiC (Flow-in-Cloud) is under development. In this paper, we introduce the FiC multi FPGA cluster which is applied partial reconfiguration (PR) FPGA design flow to support online user defined accelerator replacement while executing FPGA interconnection network and its low-level multiple FPGA management software called remote PR manager. With the remote PR manager, the user can define the FiC FPGA cluster setup by JSON and control the cluster from user application with the cooperation of simple cluster management tool / library called ficmgr on the client host and REST API service provider called ficwww on Raspberry Pi 3 (RPi3) on each node. According to the evaluation results with a prototype FiC FPGA cluster system with 12 nodes, using with online application replacement by PR and on-the-fly FPGA bitstream compression, the time for FPGA bitstream distribution was reduced to 1/17 and the total cluster setup time was reduced by 21∼57% than compared to cluster setup with full configuration FPGA bitstream.

  • A Multi-FPGA Implementation of FM-Index Based Genomic Pattern Search

    Ullah IMDAD  Akram BEN AHMED  Kazuei HIRONAKA  Kensuke IIZUKA  Hideharu AMANO  

     
    PAPER-Computer System

      Pubricized:
    2023/08/09
      Vol:
    E106-D No:11
      Page(s):
    1783-1795

    FPGA clusters that consist of multiple FPGA boards have been gaining interest in recent times. Massively parallel processing with a stand-alone heterogeneous FPGA cluster with SoC- style FPGAs and mid-scale FPGAs is promising with cost-performance benefit. Here, we propose such a heterogeneous FPGA cluster with FiC and M-KUBOS cluster. FiC consists of multiple boards, mounting middle scale Xilinx's FPGAs and DRAMs, which are tightly coupled with high-speed serial links. In addition, M-KUBOS boards are connected to FiC for ensuring high IO data transfer bandwidth. As an example of massively parallel processing, here we implement genomic pattern search. Next-generation sequencing (NGS) technology has revolutionized biological system related research by its high-speed, scalable and massive throughput. To analyze the genomic data, short read mapping technique is used where short Deoxyribonucleic acid (DNA) sequences are mapped relative to a known reference sequence. Although several pattern matching techniques are available, FM-index based pattern search is perfectly suitable for this task due to the fastest mapping from known indices. Since matching can be done in parallel for different data, the massively parallel computing which distributes data, executes in parallel and gathers the results can be applied. We also implement a data compression method where about 10 times reduction in data size is achieved. We found that a M-KUBOS board matches four FiC boards, and a system with six M-KUBOS boards and 24 FiC boards achieved 30 times faster than the software based implementation.

  • Multi-Voltage Variable Pipeline Routers with the Same Clock Frequency for Low-Power Network-on-Chips Systems

    Akram BEN AHMED  Hiroki MATSUTANI  Michihiro KOIBUCHI  Kimiyoshi USAMI  Hideharu AMANO  

     
    PAPER

      Vol:
    E99-C No:8
      Page(s):
    909-917

    In this paper, the Multi-voltage (multi-Vdd) variable pipeline router is proposed to reduce the power consumption of Network-on-Chips (NoCs) designed for Chip Multi-processors (CMPs). The multi-Vdd variable pipeline router adjusts its pipeline depth (i.e., communication latency) and supply voltage level in response to the applied workload. Unlike Dynamic Voltage and Frequency Scaling (DVFS) routers, the operating frequency remains the same for all routers throughout the CMP; thus, omitting the need to synchronize neighboring routers working at different frequencies. Two types of router architectures are presented: a Coarse-Grained Variable Pipeline (CG-VP) router that changes the voltage supplied to the entire router, and a Fine-Grained Variable Pipeline (FG-VP) router that uses a finer power partition. The evaluation results showed that the CG-VP and FG-VP routers achieve a 22.9% and 35.3% power reduction on average with 14% and 23% area overhead in comparison with a baseline router without variable pipelines, respectively. Thanks to the adopted look-ahead mechanism to switch the supply voltage, the performance overhead is only 4.4%.