IEICE TRANSACTIONS on Information

Re-Ranking Approach of Spoken Term Detection Using Conditional Random Fields-Based Triphone Detection

Naoki SAWADA, Hiromitsu NISHIZAKI

Summary:

This study proposes a two-pass spoken term detection (STD) method. The first pass performs STD using phoneme-based dynamic time warping (DTW), and the second pass recomputes the detection scores produced by the first pass using conditional random field (CRF)-based triphone detectors. In the second pass, we treat STD as a sequence labeling problem. The CRF-based triphone detection models use features generated from multiple types of phoneme-based transcriptions, and they learn recognition error patterns, such as phoneme-to-phoneme confusions, within the CRF framework. Consequently, the models can detect each triphone that makes up a query term and assign it a detection probability. In an experimental evaluation on two test collections, the CRF-based approach worked well for re-ranking the DTW-based detections, yielding absolute F-measure improvements of 2.1% and 2.0% on the two collections, respectively.
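
The two-pass flow described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the first pass uses a 0/1 local distance between phoneme symbols (a simplification of phoneme-based DTW), the CRF detectors are replaced by a plain dictionary of per-triphone detection probabilities, and the second pass combines the two scores by linear interpolation with a hypothetical `weight` parameter. The summary does not state how the scores are actually recomputed. The function names `subsequence_dtw` and `rerank` are likewise hypothetical.

```python
from typing import Dict, List, Tuple


def subsequence_dtw(query: List[str], utterance: List[str]) -> Tuple[float, int]:
    """First pass (simplified): find the best match of `query` inside `utterance`.

    Uses a 0/1 local distance between phoneme symbols (an assumption).
    Returns (cost normalized by query length, end position in the utterance).
    """
    n, m = len(query), len(utterance)
    # Row 0 is all zeros, so a match may start anywhere in the utterance.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        # Column 0: all i query phonemes are unmatched.
        curr = [float(i)] + [0.0] * m
        for j in range(1, m + 1):
            sub = 0.0 if query[i - 1] == utterance[j - 1] else 1.0
            curr[j] = min(
                prev[j - 1] + sub,  # match or substitution
                prev[j] + 1.0,      # query phoneme missing from the utterance
                curr[j - 1] + 1.0,  # extra phoneme in the utterance
            )
        prev = curr
    # The match may end anywhere in the utterance.
    end = min(range(m + 1), key=lambda j: prev[j])
    return prev[end] / n, end


def rerank(
    candidates: List[Tuple[str, float]],
    triphone_probs: Dict[str, List[float]],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Second pass (assumed form): interpolate DTW similarity with the mean
    triphone detection probability, then sort candidates by the new score.

    `triphone_probs` stands in for the output of CRF-based triphone detectors.
    """
    rescored = []
    for utt_id, cost in candidates:
        probs = triphone_probs.get(utt_id, [])
        mean_prob = sum(probs) / len(probs) if probs else 0.0
        score = weight * (1.0 - cost) + (1.0 - weight) * mean_prob
        rescored.append((utt_id, score))
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Usage with made-up data: the first-pass costs and detector probabilities
# below are illustrative values, not results from the paper.
cost, end = subsequence_dtw(["k", "a", "t"], ["s", "k", "a", "t", "o"])
ranked = rerank(
    [("u1", 0.2), ("u2", 0.3)],
    {"u1": [0.1, 0.2], "u2": [0.9, 0.8]},
)
```

In this toy example, candidate `u2` has the worse first-pass cost but is ranked first after the second pass because its triphone detection probabilities are higher, which is the kind of reordering a re-ranking stage is meant to produce.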

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.10 pp.2518-2527
Publication Date
2016/10/01
Publicized
2016/07/19
Online ISSN
1745-1361
DOI
10.1587/transinf.2016SLP0012
Type of Manuscript
Special Section PAPER (Special Section on Recent Advances in Machine Learning for Spoken Language Processing)
Category
Spoken term detection

Authors

Naoki SAWADA
  University of Yamanashi
Hiromitsu NISHIZAKI
  University of Yamanashi

Keyword