IEICE TRANSACTIONS on Information

Open Access
Hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition

Yasuhisa FUJII, Kazumasa YAMAMOTO, Seiichi NAKAGAWA

Summary :

In this paper, we propose Hidden Conditional Neural Fields (HCNF) for continuous phoneme speech recognition. HCNF combines Hidden Conditional Random Fields (HCRF) with a Multi-Layer Perceptron (MLP) and inherits the merits of both: the discriminative property for sequences from HCRF and the ability to extract non-linear features from the MLP. HCNF can incorporate many types of features, from which non-linear features can be extracted, and is trained by sequential criteria. We first present the formulation of HCNF and then examine three methods to further improve automatic speech recognition using HCNF: an objective function that explicitly considers training errors, a hierarchical tandem-style feature, and a deep non-linear feature extractor for the observation function. Using experimental results for continuous English phoneme recognition on the TIMIT core test set and Japanese phoneme recognition on the IPA 100 test set, we show that HCNF can be trained realistically without any initial model and outperforms both HCRF and the triphone hidden Markov model trained with the minimum phone error (MPE) criterion.
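
To make the combination described in the summary concrete, the following is a minimal sketch rather than the formulation from the paper. It assumes a linear-chain model in which each label owns two hidden sub-states, the observation score of a state is a weighted sum of sigmoid "gate" units applied to the frame (standing in for the MLP-style non-linear features), and the posterior of a frame-level label sequence is obtained by summing over hidden state sequences with the forward algorithm. All names, dimensions and weights (`gate_weights`, `state_weights`, `trans`, `state_label`) are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def logsumexp(values):
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def observation_scores(x_t, gate_weights, state_weights):
    # Non-linear features: one sigmoid gate per row of gate_weights (assumed form).
    gates = [sigmoid(sum(w * x for w, x in zip(theta, x_t))) for theta in gate_weights]
    # One score per hidden state: a linear combination of the gate outputs.
    return [sum(w * g for w, g in zip(ws, gates)) for ws in state_weights]


def forward_log_sum(obs, trans, allowed):
    # Log of the summed exponentiated path scores over all state sequences
    # whose state at frame t lies in allowed[t].
    n_states = len(trans)
    neg_inf = float("-inf")
    alpha = [obs[0][s] if s in allowed[0] else neg_inf for s in range(n_states)]
    for t in range(1, len(obs)):
        alpha = [
            obs[t][s] + logsumexp([alpha[p] + trans[p][s] for p in range(n_states)])
            if s in allowed[t] else neg_inf
            for s in range(n_states)
        ]
    return logsumexp(alpha)


def label_log_posterior(frames, labels, state_label, gate_weights, state_weights, trans):
    obs = [observation_scores(x, gate_weights, state_weights) for x in frames]
    all_states = set(range(len(state_label)))
    # Numerator: hidden states must agree with the given label at every frame.
    constrained = [{s for s in all_states if state_label[s] == y} for y in labels]
    # Denominator: every state sequence is allowed.
    free = [all_states] * len(frames)
    return forward_log_sum(obs, trans, constrained) - forward_log_sum(obs, trans, free)


# Toy setup: 2 labels x 2 hidden sub-states, 3 gates, 2-dimensional frames.
random.seed(0)
state_label = [0, 0, 1, 1]
n_states, n_gates, dim = 4, 3, 2
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_gates)]
state_weights = [[random.uniform(-1, 1) for _ in range(n_gates)] for _ in range(n_states)]
trans = [[random.uniform(-1, 1) for _ in range(n_states)] for _ in range(n_states)]
frames = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]

posteriors = {
    labels: math.exp(
        label_log_posterior(frames, labels, state_label, gate_weights, state_weights, trans)
    )
    for labels in itertools.product([0, 1], repeat=len(frames))
}
total = sum(posteriors.values())
print(round(total, 6))
```

Because every hidden state sequence maps to exactly one label sequence, the posteriors over all eight label sequences should sum to one, which gives a quick sanity check of the forward recursion. Training with a sequential criterion would additionally require gradients of this log posterior with respect to the weights, which the sketch leaves out.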

Publication
IEICE TRANSACTIONS on Information Vol.E95-D No.8 pp.2094-2104
Publication Date
2012/08/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E95.D.2094
Type of Manuscript
PAPER
Category
Speech and Hearing
