IEICE TRANSACTIONS on Information

A New Efficient Resource Management Framework for Iterative MapReduce Processing in Large-Scale Data Analysis

Seungtae HONG, Kyongseok PARK, Chae-Deok LIM, Jae-Woo CHANG

Errata [Uploaded on March 1, 2018]

Summary:

Hadoop, one of the most popular MapReduce frameworks, has been actively studied as a means of analyzing large-scale data efficiently. Most large-scale data analysis applications, such as data clustering, must apply the same map and reduce functions repeatedly. However, Hadoop cannot provide optimal performance for such iterative MapReduce jobs because it derives a result from a single phase of map and reduce functions. To solve this problem, we propose a new efficient resource management framework for iterative MapReduce processing in large-scale data analysis. First, we design an iterative job state machine for managing iterative MapReduce jobs. Second, we propose an invariant data caching mechanism that reduces the I/O cost of data accesses. Third, we propose an iterative resource management technique for efficiently managing the resources of a Hadoop cluster. Fourth, we devise a stop condition check mechanism that prevents unnecessary computation. Finally, we show the performance superiority of the proposed framework by comparing it with existing frameworks.
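
As a rough illustration of two of the ideas named in the summary (invariant data caching and a stop condition check), the following Python sketch runs a toy one-dimensional clustering loop in a single process. It is not the authors' framework and does not use Hadoop; the function name run_iterative_job, the parameters tol and max_iters, and the convergence test based on centroid shift are all assumptions made for this example.

```python
from typing import List, Tuple


def run_iterative_job(
    points: List[float],
    centroids: List[float],
    tol: float = 1e-6,        # hypothetical convergence threshold
    max_iters: int = 100,     # hypothetical safety cap on iterations
) -> Tuple[List[float], int]:
    """Toy iterative map/reduce loop (1-D clustering), single process."""
    # Invariant data: the input points never change between iterations,
    # so they are materialized once and reused instead of being re-read.
    cached_points = list(points)

    iterations = 0
    for _ in range(max_iters):
        iterations += 1

        # "Map" step: assign each cached point to its nearest centroid.
        groups = {i: [] for i in range(len(centroids))}
        for p in cached_points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            groups[nearest].append(p)

        # "Reduce" step: each new centroid is the mean of its group;
        # an empty group keeps its previous centroid.
        new_centroids = [
            sum(g) / len(g) if g else centroids[i]
            for i, g in groups.items()
        ]

        # Stop condition check: end the job once no centroid moves by
        # more than tol, which avoids running further iterations.
        shift = max(abs(a - b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break

    return centroids, iterations


if __name__ == "__main__":
    result, n = run_iterative_job([1.0, 2.0, 9.0, 10.0], [0.0, 5.0])
    print(result, n)  # prints: [1.5, 9.5] 2
```

In this sketch the loop stops after the second iteration because the centroids no longer move, rather than running to the iteration cap. In a real cluster the cached data would live on worker nodes across iterations; here a local list stands in for that.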

Publication
IEICE TRANSACTIONS on Information Vol.E100-D No.4 pp.704-717
Publication Date
2017/04/01
Publicized
2017/01/17
Online ISSN
1745-1361
DOI
10.1587/transinf.2016DAP0013
Type of Manuscript
Special Section PAPER (Special Section on Data Engineering and Information Management)

Authors

Seungtae HONG
  Electronics and Telecommunications Research Institute
Kyongseok PARK
  Korea Institute of Science and Technology Information
Chae-Deok LIM
  Electronics and Telecommunications Research Institute
Jae-Woo CHANG
  Chonbuk National University

Keyword