IEICE TRANSACTIONS on Information

Sounds of Speech Based Spoken Document Categorization: A Subword Representation Method

Weidong QU, Katsuhiko SHIRAI

Summary:

In this paper, we explore a method for spoken document categorization, the task of automatically assigning spoken documents to a set of predetermined categories. To categorize spoken documents, we use subword unit representations as an alternative to word units generated by either keyword spotting or large vocabulary continuous speech recognition (LVCSR). An advantage of subword acoustic unit representations for spoken document categorization is that they require no prior knowledge of the contents of the spoken documents and address the out-of-vocabulary (OOV) problem. Moreover, the method relies on the sounds of speech rather than on exact orthography. Using subword units instead of words allows approximate matching on inaccurate transcriptions, which makes "sounds-like" spoken document categorization possible. We also examine the performance of our method when the training set contains both perfect and errorful phonetic transcriptions, in the hope that the classifiers can learn from the confusion characteristics of the recognizer and the pronunciation variants of words, improving the robustness of the whole system. Our experiments on both artificially corrupted and real corrupted data sets show that the proposed method is more effective and robust than the word-based method.
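
To illustrate why subword units tolerate transcription errors, the following minimal sketch compares phone sequences through bags of overlapping phone trigrams and cosine similarity. It is an illustration only: the summary does not specify the subword unit, the similarity measure, or the classifier used in the paper, so the trigram choice, the function names `phone_ngrams` and `cosine`, and the example phone strings are all assumptions made here.

```python
from collections import Counter
from math import sqrt


def phone_ngrams(phones, n=3):
    """Return a bag (Counter) of overlapping phone n-grams.

    Assumption: n-grams over phones stand in for the paper's subword units.
    """
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical phone strings in an ARPAbet-like notation (not taken from the paper).
reference = "s p iy ch r eh k ax g n ih sh ax n".split()
errorful = "s p iy ch r eh k ax g n ih s ax n".split()  # one substitution: sh -> s
unrelated = "w eh dh er f ao r k ae s t".split()

sim_err = cosine(phone_ngrams(reference), phone_ngrams(errorful))
sim_unrel = cosine(phone_ngrams(reference), phone_ngrams(unrelated))

print(f"reference vs errorful:  {sim_err:.2f}")
print(f"reference vs unrelated: {sim_unrel:.2f}")
```

A single phone substitution disturbs only the three trigrams that span it, so the errorful sequence still shares most of its features with the reference. A word-level representation would treat the misrecognized word as a different token, or as out of vocabulary, and the match would be lost entirely.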

Publication
IEICE TRANSACTIONS on Information Vol.E87-D No.5 pp.1175-1184
Publication Date
2004/05/01
Type of Manuscript
Special Section PAPER (Special Section on Speech Dynamics by Ear, Eye, Mouth and Machine)