IEICE global.ieice.org Site

Keyword Search Result

[Keyword] data decomposition(2hit)


Local Memory Mapping of Multicore Processors on an Automatic Parallelizing Compiler
Yoshitake OKI Yuto ABE Kazuki YAMAMOTO Kohei YAMAMOTO Tomoya SHIRAKAWA Akimasa YOSHIDA Keiji KIMURA Hironori KASAHARA

PAPER

Vol: E103-C No: 3 Page(s): 98-109
The use of local memory in multi-core processors, from real-time embedded systems to high-performance systems, has become an important factor in satisfying hard deadline constraints. However, efficiently managing the memory hierarchy remains a challenge, in particular decomposing large data into small blocks that fit in local memory and transferring blocks for reuse and replacement. To address this issue, this paper presents a compiler optimization method that automatically manages the local memory of multi-core processors. The method selects and maps multi-dimensional data onto software-specified memory blocks called Adjustable Blocks. These blocks are hierarchically divisible, with varying sizes defined by the features of the input application. Moreover, the method introduces mapping structures called Template Arrays to maintain the indices of the decomposed multi-dimensional data. The proposed method is implemented in the OSCAR automatic parallelizing compiler, and evaluations were performed on the Renesas RP2 8-core processor. Experimental results from the NAS Parallel Benchmark, the SPEC benchmark, and multimedia applications show the effectiveness of the method, with a maximum speed-up of 20.44 on 8 cores using local memory, relative to single-core sequential versions that use off-chip memory.
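As a rough illustration of the ideas named in the abstract above (decomposing data into blocks that fit in local memory, hierarchically halved block sizes, and an index mapping for the decomposed data), the following Python sketch may help. It is not the paper's algorithm or the OSCAR compiler's implementation: the function names `pick_block_size`, `decompose`, and `to_local`, the halving rule, and the 2-D tiling are all assumptions made for illustration only.

```python
def pick_block_size(local_mem_bytes, needed_bytes, max_levels=4):
    """Pick a block size by repeatedly halving the local memory size.

    Hypothetical rule: keep halving (at most max_levels times) while the
    halved block can still hold needed_bytes.
    """
    if needed_bytes > local_mem_bytes:
        raise ValueError("data does not fit in local memory")
    size = local_mem_bytes
    for _ in range(max_levels):
        half = size // 2
        if half < needed_bytes:
            break
        size = half
    return size


def decompose(rows, cols, tile_rows, tile_cols):
    """Split a rows x cols array into tiles.

    Each tile is a half-open range (row_start, row_end, col_start, col_end);
    tiles on the boundary may be smaller than tile_rows x tile_cols.
    """
    tiles = []
    for r in range(0, rows, tile_rows):
        for c in range(0, cols, tile_cols):
            tiles.append((r, min(r + tile_rows, rows),
                          c, min(c + tile_cols, cols)))
    return tiles


def to_local(i, j, tile_rows, tile_cols):
    """Map a global index (i, j) to (tile_row, tile_col, local_i, local_j)."""
    return (i // tile_rows, j // tile_cols, i % tile_rows, j % tile_cols)


if __name__ == "__main__":
    print(pick_block_size(32768, 5000))
    print(decompose(5, 4, 2, 2))
    print(to_local(4, 3, 2, 2))
```

The index mapping plays a role loosely analogous to what the abstract attributes to Template Arrays: code written against global indices can be redirected to a block-local buffer. How the actual method chooses block sizes and schedules transfers is not described in the abstract and is not modeled here.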
Data-Localization Scheduling inside Processor-Cluster for Multigrain Parallel Processing
Akimasa YOSHIDA Ken'ichi KOSHIZUKA Wataru OGATA Hironori KASAHARA

PAPER

Vol: E80-D No: 4 Page(s): 473-479
This paper proposes a data-localization scheduling scheme inside a processor-cluster for multigrain parallel processing, which hierarchically exploits parallelism among coarse-grain tasks such as loops, medium-grain tasks such as loop iterations, and near-fine-grain tasks such as statements. The proposed scheme assigns the near-fine-grain or medium-grain tasks inside coarse-grain tasks to processors inside a processor-cluster so that parallelism is exploited to the maximum and inter-processor data transfer is minimized, after data-localization has been applied to coarse-grain tasks across processor-clusters. Performance evaluation on the OSCAR multiprocessor system shows that multigrain parallel processing with the proposed data-localization scheduling can reduce the execution time of application programs by 10% compared with multigrain parallel processing without data-localization.
