Keyword Search Result

[Keyword] task mapping (4 hits)

  • Real-Time and Energy-Efficient Face Detection on CPU-GPU Heterogeneous Embedded Platforms

    Chanyoung OH  Saehanseul YI  Youngmin YI  

     
    PAPER-Real-time Systems

      Publicized:
    2018/09/18
      Vol:
    E101-D No:12
      Page(s):
    2878-2888

    As energy efficiency has become a major design constraint or objective, heterogeneous manycore architectures have emerged as mainstream target platforms not only in server systems but also in embedded systems. Manycore accelerators such as GPUs are also becoming popular in embedded domains, as are heterogeneous CPU cores. However, as the number of cores in an embedded GPU is far smaller than that of a server GPU, it is important to utilize both heterogeneous multi-core CPUs and GPUs to achieve the desired throughput with minimal energy consumption. In this paper, we present a case study of mapping LBP-based face detection onto a recent CPU-GPU heterogeneous embedded platform, which exploits both task parallelism and data parallelism to achieve maximal energy efficiency under a real-time constraint. We first present the parallelization technique of each task for GPU execution, then we propose performance and energy models for both task-parallel and data-parallel executions on heterogeneous processors, which are used in design space exploration for the optimal mapping. The design space is huge, since it covers not only processor heterogeneity such as CPU-GPU and big.LITTLE, but also various data partitioning ratios for data-parallel execution on these heterogeneous processors. In our case study of LBP face detection on Exynos 5422, the estimation errors of the proposed performance and energy models were on average -2.19% and -3.67%, respectively. By systematically finding the optimal mappings with the proposed models, we could achieve 28.6% less energy consumption compared to the manual mapping, while still meeting the real-time constraint.
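The idea of exploring data partitioning ratios under a real-time constraint can be illustrated with a minimal sketch. This is not the paper's performance or energy model: it assumes a hypothetical linear model in which each processor has a fixed processing rate and a fixed active power, the CPU and GPU run their shares in parallel, and idle power is ignored. The function name `explore_partition` and all parameters are illustrative assumptions.

```python
def explore_partition(work, cpu_rate, gpu_rate, cpu_power, gpu_power,
                      deadline, steps=100):
    """Sweep the CPU share of the work and return the feasible split with
    the lowest energy as (ratio, latency, energy), or None if no split
    meets the deadline.

    Assumed (hypothetical) model: time = share / rate, and
    energy = active power * busy time for each processor.
    """
    best = None
    for i in range(steps + 1):
        ratio = i / steps                      # fraction of work given to the CPU
        t_cpu = ratio * work / cpu_rate        # CPU busy time
        t_gpu = (1 - ratio) * work / gpu_rate  # GPU busy time
        latency = max(t_cpu, t_gpu)            # both processors run in parallel
        if latency > deadline:
            continue                           # violates the real-time constraint
        energy = cpu_power * t_cpu + gpu_power * t_gpu
        if best is None or energy < best[2]:
            best = (ratio, latency, energy)
    return best


if __name__ == "__main__":
    # Made-up numbers: the GPU is faster and cheaper per unit of work,
    # but cannot meet the deadline alone, so some work must go to the CPU.
    print(explore_partition(work=100, cpu_rate=10, gpu_rate=40,
                            cpu_power=2, gpu_power=6, deadline=2.21))
```

An exhaustive sweep is used here only because the sketched space has a single dimension; a real exploration over core types, task placements, and ratios would be much larger.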

  • Static Mapping of Parallelizable Tasks under Deadline Constraints

    Yining XU  Ittetsu TANIGUCHI  Hiroyuki TOMIYAMA  

     
    LETTER

      Vol:
    E100-A No:7
      Page(s):
    1500-1502

    Task mapping is one of the most important design processes in embedded manycore systems. This paper proposes a static task mapping technique for manycore real-time systems. The technique minimizes the number of cores while satisfying deadline constraints of individual tasks.
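To make the problem statement concrete, here is a minimal sketch of mapping tasks to as few cores as a simple heuristic can manage while meeting each task's deadline. It is not the technique proposed in the letter: it assumes sequential (non-parallelized) tasks that are all released at time 0, runs the tasks on each core in deadline order, and uses a first-fit heuristic that does not guarantee the minimum core count. The names `fits` and `first_fit_mapping` are hypothetical.

```python
def fits(core_tasks, task):
    """Return True if the core still meets every deadline after adding task.

    Tasks are (exec_time, deadline) pairs, all released at time 0 and
    executed back to back in order of increasing deadline.
    """
    elapsed = 0
    for exec_time, deadline in sorted(core_tasks + [task], key=lambda t: t[1]):
        elapsed += exec_time
        if elapsed > deadline:
            return False
    return True


def first_fit_mapping(tasks):
    """Assign each task to the first core that can still meet all deadlines,
    opening a new core only when no existing core fits (a heuristic)."""
    cores = []
    for task in sorted(tasks, key=lambda t: t[1]):
        if task[0] > task[1]:
            raise ValueError(f"task {task} cannot meet its deadline on any core")
        for core in cores:
            if fits(core, task):
                core.append(task)
                break
        else:
            cores.append([task])  # no existing core fits: open a new one
    return cores


if __name__ == "__main__":
    # Made-up task set: (execution time, deadline)
    for index, core in enumerate(first_fit_mapping([(2, 4), (3, 5), (2, 6), (1, 3)])):
        print(f"core {index}: {core}")
```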

  • ILP Based Multithreaded Code Generation for Simulink Model

    Kai HUANG  Min YU  Xiaomeng ZHANG  Dandan ZHENG  Siwen XIU  Rongjie YAN  Kai HUANG  Zhili LIU  Xiaolang YAN  

     
    PAPER-Architecture

      Vol:
    E97-D No:12
      Page(s):
    3072-3082

    The increasing complexity of embedded applications and the prevalence of multiprocessor system-on-chip (MPSoC) pose a great challenge for designers: how to achieve performance and programmability simultaneously in embedded systems. Automatic multithreaded code generation methods that take performance optimization techniques into account can be an effective solution. In this paper, we consider the issue of increasing processor utilization and reducing communication cost during multithreaded code generation from Simulink models to improve system performance. We propose a combination of three-layered multithreaded software with Integer Linear Programming (ILP) based design-time mapping and scheduling policies to get optimal performance. The hierarchical software with a thread layer increases processor usage, while the mapping and scheduling policies are expressed as a set of integer linear programming formulations that minimize communication cost and maximize performance. Experimental results demonstrate the advantages of the proposed techniques in improving performance.

  • Battery-Aware Task Mapping for Coarse-Grained Reconfigurable Architecture

    Shouyi YIN  Rui SHI  Leibo LIU  Shaojun WEI  

     
    PAPER

      Vol:
    E96-D No:12
      Page(s):
    2524-2535

    Coarse-grained Reconfigurable Architecture (CGRA) is a parallel computing platform that provides both the high performance of hardware and the high flexibility of software. It is becoming a promising platform for embedded and mobile applications. Since embedded and mobile devices are usually battery-powered, improving battery lifetime becomes one of the primary design issues in using CGRAs. In this paper, we propose a battery-aware task-mapping method to optimize energy consumption and improve battery lifetime. The proposed method mainly addresses two problems: task partitioning and task scheduling when mapping applications onto CGRA. The task partitioning and scheduling are formulated as a joint optimization problem of minimizing the energy consumption. The nonlinear effects of a real battery are taken into account in the problem formulation. Using the insights from the problem formulation, we design the task-mapping algorithm. We have used several real-world benchmarks to test the effectiveness of the proposed method. Experimental results show that our method can dramatically lower the energy consumption and prolong the battery life.