The search functionality is under construction.

Keyword Search Result

[Keyword] data locality(4hit)

1-4hit
  • A ReRAM-Based Row-Column-Oriented Memory Architecture for Convolutional Neural Networks

    Yan CHEN  Jing ZHANG  Yuebing XU  Yingjie ZHANG  Renyuan ZHANG  Yasuhiko NAKASHIMA  

     
    BRIEF PAPER

      Vol:
    E102-C No:7
      Page(s):
    580-584

    An efficient resistive random access memory (ReRAM) structure is developed for accelerating convolutional neural network (CNN) powered by the in-memory computation. A novel ReRAM cell circuit is designed with two-directional (2-D) accessibility. The entire memory system is organized as a 2-D array, in which specific memory cells can be identically accessed by both of column- and row-locality. For the in-memory computations of CNNs, only relevant cells in an identical sub-array are accessed by 2-D read-out operations, which is hardly implemented by conventional ReRAM cells. In this manner, the redundant access (column or row) of the conventional ReRAM structures is prevented to eliminated the unnecessary data movement when CNNs are processed in-memory. From the simulation results, the energy and bandwidth efficiency of the proposed memory structure are 1.4x and 5x of a state-of-the-art ReRAM architecture, respectively.

  • Naive Bayes Classifier Based Partitioner for MapReduce

    Lei CHEN  Wei LU  Ergude BAO  Liqiang WANG  Weiwei XING  Yuanyuan CAI  

     
    PAPER-Graphs and Networks

      Vol:
    E101-A No:5
      Page(s):
    778-786

    MapReduce is an effective framework for processing large datasets in parallel over a cluster. Data locality and data skew on the reduce side are two essential issues in MapReduce. Improving data locality can decrease network traffic by moving reduce tasks to the nodes where the reducer input data is located. Data skew will lead to load imbalance among reducer nodes. Partitioning is an important feature of MapReduce because it determines the reducer nodes to which map output results will be sent. Therefore, an effective partitioner can improve MapReduce performance by increasing data locality and decreasing data skew on the reduce side. Previous studies considering both essential issues can be divided into two categories: those that preferentially improve data locality, such as LEEN, and those that preferentially improve load balance, such as CLP. However, all these studies ignore the fact that for different types of jobs, the priority of data locality and data skew on the reduce side may produce different effects on the execution time. In this paper, we propose a naive Bayes classifier based partitioner, namely, BAPM, which achieves better performance because it can automatically choose the proper algorithm (LEEN or CLP) by leveraging the naive Bayes classifier, i.e., considering job type and bandwidth as classification attributes. Our experiments are performed in a Hadoop cluster, and the results show that BAPM boosts the computing performance of MapReduce. The selection accuracy reaches 95.15%. Further, compared with other popular algorithms, under specific bandwidths, the improvement BAPM achieved is up to 31.31%.

  • 2PTS: A Two-Phase Task Scheduling Algorithm for MapReduce

    Byungnam LIM  Yeeun SHIM  Yon Dohn CHUNG  

     
    LETTER-Fundamentals of Information Systems

      Pubricized:
    2016/06/06
      Vol:
    E99-D No:9
      Page(s):
    2377-2380

    For an efficient processing of large data in a distributed system, Hadoop MapReduce performs task scheduling such that tasks are distributed with consideration of the data locality. The data locality, however, is limitedly exploited, since it is pursued one node at a time basis without considering the global optimality. In this paper, we propose a novel task scheduling algorithm that globally considers the data locality. Through experiments, we show our algorithm improves the performance of MapReduce in various situations.

  • An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters

    Hui ZHAO  Shuqiang YANG  Hua FAN  Zhikun CHEN  Jinghu XU  

     
    PAPER

      Vol:
    E96-D No:12
      Page(s):
    2654-2662

    Scheduling plays a key role in MapReduce systems. In this paper, we explore the efficiency of an MapReduce cluster running lots of independent and continuously arriving MapReduce jobs. Data locality and load balancing are two important factors to improve computation efficiency in MapReduce systems for data-intensive computations. Traditional cluster scheduling technologies are not well suitable for MapReduce environment, there are some in-used schedulers for the popular open-source Hadoop MapReduce implementation, however, they can not well optimize both factors. Our main objective is to minimize total flowtime of all jobs, given it's a strong NP-hard problem, we adopt some effective heuristics to seek satisfied solution. In this paper, we formalize the scheduling problem as job selection problem, a load balance aware job selection algorithm is proposed, in task level we design a strict data locality tasks scheduling algorithm for map tasks on map machines and a load balance aware scheduling algorithm for reduce tasks on reduce machines. Comprehensive experiments have been conducted to compare our scheduling strategy with well-known Hadoop scheduling strategies. The experimental results validate the efficiency of our proposed scheduling strategy.