This paper introduces e-spill, an eager spill mechanism that dynamically finds the optimal spill threshold by monitoring garbage collection (GC) time at runtime, thereby preventing expensive GC overhead. e-spill adopts a slow-start model that gradually increases the spill threshold until it reaches the optimal point without incurring substantial GC. We prototype e-spill as an extension to Spark and evaluate it using six workloads on three different parallel platforms. Our evaluations show that e-spill improves performance by up to 3.80× and reduces the cost of cluster operation on the Amazon EC2 cloud by up to 51% over a baseline system that follows the Spark Tuning Guidelines.
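The slow-start idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no growth rule or GC criterion, so the multiplicative growth factor, the GC-time limit, the treatment of the threshold as a fraction between 0 and 1, and every name below (`slow_start_threshold`, `gc_fraction_at`, `gc_limit`) are assumptions made for illustration only.

```python
from typing import Callable


def slow_start_threshold(
    gc_fraction_at: Callable[[float], float],
    start: float = 0.05,
    growth: float = 2.0,
    gc_limit: float = 0.10,
    ceiling: float = 1.0,
) -> float:
    """Grow a spill threshold step by step while observed GC time stays low.

    gc_fraction_at is a hypothetical probe that returns the fraction of
    runtime spent in GC when running with the given threshold. All default
    values are illustrative assumptions, not figures from the paper.
    """
    threshold = start
    while threshold < ceiling:
        candidate = min(threshold * growth, ceiling)
        if gc_fraction_at(candidate) > gc_limit:
            # The next step would cause substantial GC: keep the last good value.
            break
        threshold = candidate
    return threshold


if __name__ == "__main__":
    # Toy GC model (assumed): GC stays cheap up to a threshold of 0.4.
    toy_gc = lambda t: 0.02 if t <= 0.4 else 0.5
    print(slow_start_threshold(toy_gc))
```

In a real system the probe would be a runtime measurement taken while the job executes rather than a pure function, and the threshold would be adjusted between measurement intervals.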
Hakbeom JANG
Sungkyunkwan University
Jonghyun BAE
Seoul National University
Tae Jun HAM
Seoul National University
Jae W. LEE
Seoul National University
Hakbeom JANG, Jonghyun BAE, Tae Jun HAM, Jae W. LEE, "Eager Memory Management for In-Memory Data Analytics" in IEICE TRANSACTIONS on Information,
vol. E102-D, no. 3, pp. 632-636, March 2019, doi: 10.1587/transinf.2018EDL8199.
Abstract: This paper introduces e-spill, an eager spill mechanism that dynamically finds the optimal spill threshold by monitoring garbage collection (GC) time at runtime, thereby preventing expensive GC overhead. e-spill adopts a slow-start model that gradually increases the spill threshold until it reaches the optimal point without incurring substantial GC. We prototype e-spill as an extension to Spark and evaluate it using six workloads on three different parallel platforms. Our evaluations show that e-spill improves performance by up to 3.80× and reduces the cost of cluster operation on the Amazon EC2 cloud by up to 51% over a baseline system that follows the Spark Tuning Guidelines.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDL8199/_p
@ARTICLE{e102-d_3_632,
author={Hakbeom JANG and Jonghyun BAE and Tae Jun HAM and Jae W. LEE},
journal={IEICE TRANSACTIONS on Information},
title={Eager Memory Management for In-Memory Data Analytics},
year={2019},
volume={E102-D},
number={3},
pages={632-636},
abstract={This paper introduces e-spill, an eager spill mechanism that dynamically finds the optimal spill threshold by monitoring garbage collection (GC) time at runtime, thereby preventing expensive GC overhead. e-spill adopts a slow-start model that gradually increases the spill threshold until it reaches the optimal point without incurring substantial GC. We prototype e-spill as an extension to Spark and evaluate it using six workloads on three different parallel platforms. Our evaluations show that e-spill improves performance by up to 3.80× and reduces the cost of cluster operation on the Amazon EC2 cloud by up to 51% over a baseline system that follows the Spark Tuning Guidelines.},
keywords={},
doi={10.1587/transinf.2018EDL8199},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Eager Memory Management for In-Memory Data Analytics
T2 - IEICE TRANSACTIONS on Information
SP - 632
EP - 636
AU - Hakbeom JANG
AU - Jonghyun BAE
AU - Tae Jun HAM
AU - Jae W. LEE
PY - 2019
DO - 10.1587/transinf.2018EDL8199
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E102-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2019
AB - This paper introduces e-spill, an eager spill mechanism that dynamically finds the optimal spill threshold by monitoring garbage collection (GC) time at runtime, thereby preventing expensive GC overhead. e-spill adopts a slow-start model that gradually increases the spill threshold until it reaches the optimal point without incurring substantial GC. We prototype e-spill as an extension to Spark and evaluate it using six workloads on three different parallel platforms. Our evaluations show that e-spill improves performance by up to 3.80× and reduces the cost of cluster operation on the Amazon EC2 cloud by up to 51% over a baseline system that follows the Spark Tuning Guidelines.
ER -