Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues

Konstantin MARKOV; Satoshi NAKAMURA

doi:10.1093/ietisy/e89-d.3.981

IEICE TRANSACTIONS on Information

Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues

Konstantin MARKOV, Satoshi NAKAMURA

Full Text Views

0

Cite this

Summary :

In recent years, the number of studies investigating new directions in speech modeling that goes beyond the conventional HMM has increased considerably. One promising approach is to use Bayesian Networks (BN) as speech models. Full recognition systems based on Dynamic BN as well as acoustic models using BN have been proposed lately. Our group at ATR has been developing a hybrid HMM/BN model, which is an HMM where the state probability distribution is modeled by a BN, instead of commonly used mixtures of Gaussian functions. In this paper, we describe how to use the hybrid HMM/BN acoustic models, especially emphasizing some design and implementation issues. The most essential part of HMM/BN model building is the choice of the state BN topology. As it is manually chosen, there are some factors that should be considered in this process. They include, but are not limited to, the type of data, the task and the available additional information. When context-dependent models are used, the state-level structure can be obtained by traditional methods. The HMM/BN parameter learning is based on the Viterbi training paradigm and consists of two alternating steps - BN training and HMM transition updates. For recognition, in some cases, BN inference is computationally equivalent to a mixture of Gaussians, which allows HMM/BN model to be used in existing decoders without any modification. We present two examples of HMM/BN model applications in speech recognition systems. Evaluations under various conditions and for different tasks showed that the HMM/BN model gives consistently better performance than the conventional HMM.

Publication: IEICE TRANSACTIONS on Information Vol.E89-D No.3 pp.981-988

Publication Date: 2006/03/01

Publicized

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e89-d.3.981

Type of Manuscript: Special Section PAPER (Special Section on Statistical Modeling for Speech Processing)

Category: Speech Recognition

Cite this

Copy

Konstantin MARKOV, Satoshi NAKAMURA, "Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 3, pp. 981-988, March 2006, doi: 10.1093/ietisy/e89-d.3.981.
Abstract: In recent years, the number of studies investigating new directions in speech modeling that goes beyond the conventional HMM has increased considerably. One promising approach is to use Bayesian Networks (BN) as speech models. Full recognition systems based on Dynamic BN as well as acoustic models using BN have been proposed lately. Our group at ATR has been developing a hybrid HMM/BN model, which is an HMM where the state probability distribution is modeled by a BN, instead of commonly used mixtures of Gaussian functions. In this paper, we describe how to use the hybrid HMM/BN acoustic models, especially emphasizing some design and implementation issues. The most essential part of HMM/BN model building is the choice of the state BN topology. As it is manually chosen, there are some factors that should be considered in this process. They include, but are not limited to, the type of data, the task and the available additional information. When context-dependent models are used, the state-level structure can be obtained by traditional methods. The HMM/BN parameter learning is based on the Viterbi training paradigm and consists of two alternating steps - BN training and HMM transition updates. For recognition, in some cases, BN inference is computationally equivalent to a mixture of Gaussians, which allows HMM/BN model to be used in existing decoders without any modification. We present two examples of HMM/BN model applications in speech recognition systems. Evaluations under various conditions and for different tasks showed that the HMM/BN model gives consistently better performance than the conventional HMM.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.3.981/_p

Copy

@ARTICLE{e89-d_3_981,
author={Konstantin MARKOV, Satoshi NAKAMURA, },
journal={IEICE TRANSACTIONS on Information},
title={Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues},
year={2006},
volume={E89-D},
number={3},
pages={981-988},
abstract={In recent years, the number of studies investigating new directions in speech modeling that goes beyond the conventional HMM has increased considerably. One promising approach is to use Bayesian Networks (BN) as speech models. Full recognition systems based on Dynamic BN as well as acoustic models using BN have been proposed lately. Our group at ATR has been developing a hybrid HMM/BN model, which is an HMM where the state probability distribution is modeled by a BN, instead of commonly used mixtures of Gaussian functions. In this paper, we describe how to use the hybrid HMM/BN acoustic models, especially emphasizing some design and implementation issues. The most essential part of HMM/BN model building is the choice of the state BN topology. As it is manually chosen, there are some factors that should be considered in this process. They include, but are not limited to, the type of data, the task and the available additional information. When context-dependent models are used, the state-level structure can be obtained by traditional methods. The HMM/BN parameter learning is based on the Viterbi training paradigm and consists of two alternating steps - BN training and HMM transition updates. For recognition, in some cases, BN inference is computationally equivalent to a mixture of Gaussians, which allows HMM/BN model to be used in existing decoders without any modification. We present two examples of HMM/BN model applications in speech recognition systems. Evaluations under various conditions and for different tasks showed that the HMM/BN model gives consistently better performance than the conventional HMM.},
keywords={},
doi={10.1093/ietisy/e89-d.3.981},
ISSN={1745-1361},
month={March},}

Copy

TY - JOUR
TI - Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues
T2 - IEICE TRANSACTIONS on Information
SP - 981
EP - 988
AU - Konstantin MARKOV
AU - Satoshi NAKAMURA
PY - 2006
DO - 10.1093/ietisy/e89-d.3.981
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2006
AB - In recent years, the number of studies investigating new directions in speech modeling that goes beyond the conventional HMM has increased considerably. One promising approach is to use Bayesian Networks (BN) as speech models. Full recognition systems based on Dynamic BN as well as acoustic models using BN have been proposed lately. Our group at ATR has been developing a hybrid HMM/BN model, which is an HMM where the state probability distribution is modeled by a BN, instead of commonly used mixtures of Gaussian functions. In this paper, we describe how to use the hybrid HMM/BN acoustic models, especially emphasizing some design and implementation issues. The most essential part of HMM/BN model building is the choice of the state BN topology. As it is manually chosen, there are some factors that should be considered in this process. They include, but are not limited to, the type of data, the task and the available additional information. When context-dependent models are used, the state-level structure can be obtained by traditional methods. The HMM/BN parameter learning is based on the Viterbi training paradigm and consists of two alternating steps - BN training and HMM transition updates. For recognition, in some cases, BN inference is computationally equivalent to a mixture of Gaussians, which allows HMM/BN model to be used in existing decoders without any modification. We present two examples of HMM/BN model applications in speech recognition systems. Evaluations under various conditions and for different tasks showed that the HMM/BN model gives consistently better performance than the conventional HMM.
ER -

IEICE TRANSACTIONS on Information

Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues

Summary :

Authors

Keyword

Latest Issue

Contents

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles

IEICE TRANSACTIONS on Information

Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues

Summary :

Authors

Keyword

Latest Issue

Contents

Copyrights notice of machine-translated contents

Cite this

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles