
Author Search Result

[Author] Hiroki HONDA (3 hits)

Showing 1-3 of 3 hits
  • A Fortran Parallelizing Compilation Scheme for OSCAR Using Dependence Graph Analysis

    Hironori KASAHARA  Hiroki HONDA  Seinosuke NARITA  

     
    INVITED PAPER

    Vol: E74-A No:10
    Page(s): 3105-3114

    This paper proposes a Fortran parallelizing compilation scheme for a multiprocessor system named OSCAR. The scheme hierarchically exploits parallelism among coarse grain tasks such as loops, subroutines, and basic blocks; among medium grain tasks such as loop iterations; and among near fine grain tasks such as statements. Parallelism among the coarse grain tasks, called macrotasks, is detected by analyzing a macro-flow graph, which explicitly represents control flow and data dependences. The detected parallelism among the macrotasks is represented by a directed acyclic graph called a macrotask graph. Macrotasks in a macrotask graph are dynamically assigned to processor clusters to cope with run-time uncertainties. A macrotask composed of a Do-all or Do-across loop assigned to a processor cluster is processed at the medium grain in parallel by the processors inside that cluster. A macrotask composed of a basic block is processed on a processor cluster at the near fine grain using static scheduling. A macrotask composed of a subroutine or a large sequential loop is processed by hierarchically applying macro-dataflow computation inside a processor cluster. Performance of the proposed scheme is evaluated on OSCAR. The evaluation shows that the hierarchical parallel processing scheme, using dynamic and static scheduling, effectively exploits parallelism from Fortran programs.
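
A minimal sketch of the dynamic assignment step described in the abstract: ready macrotasks of a dependence graph are handed to idle processor clusters as their predecessors finish. The six-node macrotask graph, the two clusters, and the "a macrotask completes as soon as it is assigned" simplification are all assumptions for illustration; this is not the OSCAR compiler's or run-time scheduler's actual algorithm.

```c
#include <stdio.h>

#define NTASKS    6
#define NCLUSTERS 2

/* Hypothetical macrotask graph: deps[i][j] = 1 means macrotask j must
 * finish before macrotask i can start. */
static const int deps[NTASKS][NTASKS] = {
    {0,0,0,0,0,0},   /* 0: no predecessors */
    {1,0,0,0,0,0},   /* 1: after 0         */
    {1,0,0,0,0,0},   /* 2: after 0         */
    {0,1,0,0,0,0},   /* 3: after 1         */
    {0,0,1,0,0,0},   /* 4: after 2         */
    {0,0,0,1,1,0},   /* 5: after 3 and 4   */
};

int main(void) {
    int done[NTASKS] = {0};
    int finished = 0;

    /* Dynamic scheduling loop: in each round, every macrotask whose
     * predecessors are done is assigned to the next idle cluster.
     * For illustration, a macrotask completes as soon as it is assigned. */
    while (finished < NTASKS) {
        int cluster = 0;
        for (int t = 0; t < NTASKS && cluster < NCLUSTERS; t++) {
            if (done[t]) continue;
            int ready = 1;
            for (int p = 0; p < NTASKS; p++) {
                if (deps[t][p] && !done[p]) { ready = 0; break; }
            }
            if (!ready) continue;
            printf("macrotask %d -> processor cluster %d\n", t, cluster);
            cluster++;
            done[t] = 1;
            finished++;
        }
    }
    return 0;
}
```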

  • RPC: An Approach for Reducing Compulsory Misses in Packet Processing Cache

    Hayato YAMAKI  Hiroaki NISHI  Shinobu MIWA  Hiroki HONDA  

     
    PAPER-Information Network

    Publicized: 2020/09/07
    Vol: E103-D No:12
    Page(s): 2590-2599

    We propose a technique to reduce compulsory misses of the packet processing cache (PPC), which largely affects both the throughput and energy consumption of core routers. Rather than prefetching data, our technique, called response prediction cache (RPC), speculatively stores predicted data in the PPC without additional accesses to the low-throughput, power-consuming memory (i.e., TCAM). RPC predicts the data related to a response flow at the arrival of the corresponding request flow, based on the request-response model of Internet communications. Our experimental results with 11 real-network traces show that RPC can reduce the PPC miss rate by 13.4% upstream and 47.6% downstream on average, assuming a three-layer PPC. Moreover, we extend RPC to adaptive RPC (A-RPC), which selects whether to use RPC in each direction within a core router to further reduce PPC misses. Finally, we show that A-RPC achieves 1.38x the table-lookup throughput at 74% of the energy consumption per packet compared to a conventional PPC.
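
A minimal sketch of the response-prediction idea, assuming a simplified software model: the PPC is a small hash-indexed table, the flow key fields and the `slow_table_lookup` stand-in for the TCAM are hypothetical, and reusing the request's action as the predicted data for the response flow is an illustrative simplification of the prediction the paper describes. The real design is hardware; this only shows why the first response packet no longer takes a compulsory miss.

```c
#include <stdio.h>

/* Hypothetical 5-tuple flow key (field names are assumptions). */
struct flow_key {
    unsigned src_ip, dst_ip;
    unsigned short src_port, dst_port;
    unsigned char proto;
};

#define PPC_SLOTS 256

struct ppc_entry { struct flow_key key; int valid; int action; };
static struct ppc_entry ppc[PPC_SLOTS];     /* simplified one-level PPC */

static unsigned hash_key(const struct flow_key *k) {
    return (k->src_ip ^ k->dst_ip ^ ((unsigned)k->src_port << 16) ^
            k->dst_port ^ k->proto) % PPC_SLOTS;
}

static int key_equal(const struct flow_key *a, const struct flow_key *b) {
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

static int slow_table_lookup(const struct flow_key *k) {
    (void)k;
    return 42;      /* stand-in for the TCAM lookup result (hypothetical) */
}

static void ppc_insert(const struct flow_key *k, int action) {
    unsigned h = hash_key(k);
    ppc[h].key = *k; ppc[h].action = action; ppc[h].valid = 1;
}

/* Lookup with the RPC idea: on a miss, fetch from the slow table and also
 * insert a predicted entry for the reverse (response) flow. */
static int lookup_with_rpc(const struct flow_key *k, int *hit) {
    unsigned h = hash_key(k);
    if (ppc[h].valid && key_equal(&ppc[h].key, k)) { *hit = 1; return ppc[h].action; }
    *hit = 0;
    int action = slow_table_lookup(k);
    ppc_insert(k, action);
    /* RPC step: predicted response-flow entry, no extra slow-table access;
     * reusing the request's action is an illustrative simplification. */
    struct flow_key rev = { k->dst_ip, k->src_ip, k->dst_port, k->src_port, k->proto };
    ppc_insert(&rev, action);
    return action;
}

int main(void) {
    struct flow_key req = { 0x0a000001, 0x0a000002, 12345, 80, 6 };  /* request  */
    struct flow_key rsp = { 0x0a000002, 0x0a000001, 80, 12345, 6 };  /* response */
    int hit;
    lookup_with_rpc(&req, &hit); printf("request:  %s\n", hit ? "hit" : "miss");
    lookup_with_rpc(&rsp, &hit); printf("response: %s\n", hit ? "hit" : "miss");
    return 0;
}
```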

  • Development and Implementation of an Interactive Parallelization Assistance Tool for OpenMP: iPat/OMP

    Makoto ISHIHARA  Hiroki HONDA  Mitsuhisa SATO  

     
    PAPER-Parallel/Distributed Programming Models, Paradigms and Tools

    Vol: E89-D No:2
    Page(s): 399-407

    iPat/OMP is an interactive parallelization assistance tool for OpenMP. In the present paper, we describe the design concept of iPat/OMP, the parallelization sequence achieved by the tool, and its current implementation status. In addition, we present an evaluation of the performance of the implemented functionalities. The experimental results show that iPat/OMP can detect parallelism and create appropriate OpenMP directives for several for-loops.
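
A minimal sketch of the kind of for-loop parallelization the abstract refers to, using a hypothetical dot-product loop in C: the directive shown is what an assistance tool of this sort would suggest once it has confirmed that the iterations are independent and that the accumulator needs a reduction clause. It is not actual iPat/OMP output, and the loop body is made up. Compile with OpenMP enabled (e.g., -fopenmp).

```c
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];
    double sum = 0.0;

    /* hypothetical data just so the loop has something to compute */
    for (int i = 0; i < N; i++) { a[i] = i * 0.5; b[i] = i * 0.25; }

    /* The kind of directive an assistance tool would suggest for this loop:
     * iterations are independent and 'sum' carries a reduction dependence. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i] * b[i];

    printf("dot product = %f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```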