1-2hit |
Erasure codes have been considered as one of the most promising techniques for data reliability enhancement and storage efficiency in modern distributed storage systems. However, erasure codes often suffer from a time-consuming coding process which makes them nearly impractical. The opportunity to solve this problem probably rely on the parallelization of erasure-code-based application on the modern multi-/many-core processors to fully take advantage of the adequate hardware resources on those platforms. However, the complicated data allocation and limited I/O throughput pose a great challenge on the parallelization. To address this challenge, we propose a general multi-threaded parallel coding approach in this work. The approach consists of a general multi-threaded parallel coding model named as MTPerasure, and two detailed parallel coding algorithms, named as sdaParallel and ddaParallel, respectively, adapting to different I/O circumstances. MTPerasure is a general parallel coding model focusing on the high level data allocation, and it is applicable for all erasure codes and can be implemented without any modifications of the low level coding algorithms. The sdaParallel divides the data into several parts and the data parts are allocated to different threads statically in order to eliminate synchronization latency among multiple threads, which improves the parallel coding performance under the dummy I/O mode. The ddaParallel employs two threads to execute the I/O reading and writing on the basis of small pieces independently, which increases the I/O throughput. Furthermore, the data pieces are assigned to the coding thread dynamically. A special thread scheduling algorithm is also proposed to reduce thread migration latency. To evaluate our proposal, we parallelize the popular open source library jerasure based on our approach. And a detailed performance comparison with the original sequential coding program indicates that the proposed parallel approach outperforms the original sequential program by an extraordinary speedups from 1.4x up to 7x, and achieves better utilization of the computation and I/O resources.
Cong LIU Jianpeng ZHANG Guangming LI Shangce GAO Qingtian ZENG
During the execution of software, tremendous amounts of data can be recorded. By exploiting the execution data, one can discover behavioral models to describe the actual software execution. As a well-known open-source process mining toolkit, ProM integrates quantities of process mining techniques and enjoys a variety of applications in a broad range of areas. How to develop a better ProM software, both from user experience and software performance perspective, are of vital importance. To achieve this goal, we need to investigate the real execution behavior of ProM which can provide useful insights on its usage and how it responds to user operations. This paper aims to propose an effective approach to solve this problem. To this end, we first instrument existing ProM framework to capture execution logs without changing its architecture. Then a two-layered framework is introduced to support accurate ProM behavior discovery by characterizing both user interaction behavior and plug-in calling behavior separately. Next, detailed discovery techniques to obtain user interaction behavior model and plug-in calling behavior model are proposed. All proposed approaches have been implemented.