Quad-core cpus have been a common desktop configuration for today's office. The increasing number of processors on a single chip opens new opportunity for parallel computing. Our goal is to make use of the multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. Particularly, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Lei LI, Bin FU, Christos FALOUTSOS, "Efficient Parallel Learning of Hidden Markov Chain Models on SMPs" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 6, pp. 1330-1342, June 2010, doi: 10.1587/transinf.E93.D.1330.
Abstract: Quad-core cpus have been a common desktop configuration for today's office. The increasing number of processors on a single chip opens new opportunity for parallel computing. Our goal is to make use of the multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. Particularly, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.1330/_p
Copy
@ARTICLE{e93-d_6_1330,
author={Lei LI, Bin FU, Christos FALOUTSOS, },
journal={IEICE TRANSACTIONS on Information},
title={Efficient Parallel Learning of Hidden Markov Chain Models on SMPs},
year={2010},
volume={E93-D},
number={6},
pages={1330-1342},
abstract={Quad-core cpus have been a common desktop configuration for today's office. The increasing number of processors on a single chip opens new opportunity for parallel computing. Our goal is to make use of the multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. Particularly, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.},
keywords={},
doi={10.1587/transinf.E93.D.1330},
ISSN={1745-1361},
month={June},}
Copy
TY - JOUR
TI - Efficient Parallel Learning of Hidden Markov Chain Models on SMPs
T2 - IEICE TRANSACTIONS on Information
SP - 1330
EP - 1342
AU - Lei LI
AU - Bin FU
AU - Christos FALOUTSOS
PY - 2010
DO - 10.1587/transinf.E93.D.1330
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2010
AB - Quad-core cpus have been a common desktop configuration for today's office. The increasing number of processors on a single chip opens new opportunity for parallel computing. Our goal is to make use of the multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. Particularly, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.
ER -