Cloud computing is a widely used computing platform in business and academic communities. Performance is an important issue when a user runs an application in the cloud. The user may want to estimate the application-execution time beforehand to guarantee the application performance or to choose the most suitable cloud. Moreover, the cloud system architect and the designer need to understand the application performance characteristics, such as the scalability or the utilization of cloud platforms, to improve performance. However, because the application performance in clouds sometimes fluctuates, estimation of the application performance is difficult. In this paper, we discuss the performance fluctuation of Hadoop jobs in both a public cloud and a community cloud for one to three months. The experimental results indicate phenomena that we cannot see without long-term experiments and phenomena inherent in Hadoop. The results suggest better ways to estimate Hadoop application performance in clouds. For example, we should be aware of application characteristics (CPU intensive or communication intensive), datacenter characteristics (busy or not), and time frame (time of day and day of the week) to estimate the performance fluctuation due to workload congestion in cloud platforms. Furthermore, we should be aware of performance degradation due to task re-execution in Hadoop applications.
Kento AIDA
National Institute of Informatics
Omar ABDUL-RAHMAN
National Institute of Informatics
Eisaku SAKANE
National Institute of Informatics
Kazutaka MOTOYAMA
National Institute of Informatics
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kento AIDA, Omar ABDUL-RAHMAN, Eisaku SAKANE, Kazutaka MOTOYAMA, "Long-Term Performance Evaluation of Hadoop Jobs in Public and Community Clouds" in IEICE TRANSACTIONS on Information,
vol. E98-D, no. 6, pp. 1176-1184, June 2015, doi: 10.1587/transinf.2014EDP7274.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2014EDP7274/_p
@ARTICLE{e98-d_6_1176,
author={Kento AIDA and Omar ABDUL-RAHMAN and Eisaku SAKANE and Kazutaka MOTOYAMA},
journal={IEICE TRANSACTIONS on Information},
title={Long-Term Performance Evaluation of Hadoop Jobs in Public and Community Clouds},
year={2015},
volume={E98-D},
number={6},
pages={1176--1184},
abstract={Cloud computing is a widely used computing platform in business and academic communities. Performance is an important issue when a user runs an application in the cloud. The user may want to estimate the application-execution time beforehand to guarantee the application performance or to choose the most suitable cloud. Moreover, the cloud system architect and the designer need to understand the application performance characteristics, such as the scalability or the utilization of cloud platforms, to improve performance. However, because the application performance in clouds sometimes fluctuates, estimation of the application performance is difficult. In this paper, we discuss the performance fluctuation of Hadoop jobs in both a public cloud and a community cloud for one to three months. The experimental results indicate phenomena that we cannot see without long-term experiments and phenomena inherent in Hadoop. The results suggest better ways to estimate Hadoop application performance in clouds. For example, we should be aware of application characteristics (CPU intensive or communication intensive), datacenter characteristics (busy or not), and time frame (time of day and day of the week) to estimate the performance fluctuation due to workload congestion in cloud platforms. Furthermore, we should be aware of performance degradation due to task re-execution in Hadoop applications.},
keywords={},
doi={10.1587/transinf.2014EDP7274},
ISSN={1745-1361},
month={June}
}
TY - JOUR
TI - Long-Term Performance Evaluation of Hadoop Jobs in Public and Community Clouds
T2 - IEICE TRANSACTIONS on Information
SP - 1176
EP - 1184
AU - Kento AIDA
AU - Omar ABDUL-RAHMAN
AU - Eisaku SAKANE
AU - Kazutaka MOTOYAMA
PY - 2015
DO - 10.1587/transinf.2014EDP7274
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2015
AB - Cloud computing is a widely used computing platform in business and academic communities. Performance is an important issue when a user runs an application in the cloud. The user may want to estimate the application-execution time beforehand to guarantee the application performance or to choose the most suitable cloud. Moreover, the cloud system architect and the designer need to understand the application performance characteristics, such as the scalability or the utilization of cloud platforms, to improve performance. However, because the application performance in clouds sometimes fluctuates, estimation of the application performance is difficult. In this paper, we discuss the performance fluctuation of Hadoop jobs in both a public cloud and a community cloud for one to three months. The experimental results indicate phenomena that we cannot see without long-term experiments and phenomena inherent in Hadoop. The results suggest better ways to estimate Hadoop application performance in clouds. For example, we should be aware of application characteristics (CPU intensive or communication intensive), datacenter characteristics (busy or not), and time frame (time of day and day of the week) to estimate the performance fluctuation due to workload congestion in cloud platforms. Furthermore, we should be aware of performance degradation due to task re-execution in Hadoop applications.
ER -