
Author Search Result

[Author] Kazunori KOMATANI (3 hits)

Results 1-3 of 3
  • Automatic Allocation of Training Data for Speech Understanding Based on Multiple Model Combinations

    Kazunori KOMATANI  Mikio NAKANO  Masaki KATSUMARU  Kotaro FUNAKOSHI  Tetsuya OGATA  Hiroshi G. OKUNO  

     
    PAPER-Speech and Hearing

    Vol: E95-D No:9    Page(s): 2298-2307

    The optimal way to build speech understanding modules depends on the amount of training data available. When only a small amount of training data is available, allocating it effectively is crucial to preventing statistical methods from overfitting. We have developed a method for allocating a limited amount of training data in accordance with the amount available. Our method exploits rule-based methods when the amount of data is small; these are included in our speech understanding framework based on multiple model combinations, i.e., multiple automatic speech recognition (ASR) modules and multiple language understanding (LU) modules. The method then allocates training data preferentially to the modules that dominate the overall performance of speech understanding. Experimental evaluation showed that our allocation method consistently outperforms baseline methods that use a single ASR module and a single LU module as the amount of training data increases.
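
    A minimal sketch of the preferential-allocation idea described in the abstract, assuming a hypothetical evaluate() function that retrains one module on a candidate allocation and scores the whole pipeline; this is an illustration, not the authors' implementation.

        # Hypothetical sketch: greedily give each batch of labeled utterances to
        # whichever module (ASR or LU) currently improves end-to-end speech
        # understanding the most. Module handles and evaluate() are assumptions.
        def allocate_training_data(modules, pool, batch_size, evaluate):
            """Return a dict mapping each module to the data allocated to it."""
            allocation = {m: [] for m in modules}
            while pool:
                batch, pool = pool[:batch_size], pool[batch_size:]
                best_module, best_score = None, float("-inf")
                for m in modules:
                    # Tentatively add the batch to module m and score the pipeline.
                    score = evaluate(m, allocation[m] + batch)
                    if score > best_score:
                        best_module, best_score = m, score
                allocation[best_module].extend(batch)
            return allocation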

  • Posteriori Restoration of Turn-Taking and ASR Results for Incorrectly Segmented Utterances

    Kazunori KOMATANI  Naoki HOTTA  Satoshi SATO  Mikio NAKANO  

     
    PAPER-Speech and Hearing

    Publicized: 2015/07/24    Vol: E98-D No:11    Page(s): 1923-1931

    Appropriate turn-taking is important in spoken dialogue systems as well as generating correct responses. Especially if the dialogue features quick responses, a user utterance is often incorrectly segmented due to short pauses within it by voice activity detection (VAD). Incorrectly segmented utterances cause problems both in the automatic speech recognition (ASR) results and turn-taking: i.e., an incorrect VAD result leads to ASR errors and causes the system to start responding though the user is still speaking. We develop a method that performs a posteriori restoration for incorrectly segmented utterances and implement it as a plug-in for the MMDAgent open-source software. A crucial part of the method is to classify whether the restoration is required or not. We cast it as a binary classification problem of detecting originally single utterances from pairs of utterance fragments. Various features are used representing timing, prosody, and ASR result information. Experiments show that the proposed method outperformed a baseline with manually-selected features by 4.8% and 3.9% in cross-domain evaluations with two domains. More detailed analysis revealed that the dominant and domain-independent features were utterance intervals and results from the Gaussian mixture model (GMM).
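
    An illustrative sketch of the restoration decision as binary classification over pairs of utterance fragments; the feature names, the logistic-regression model, and the use of scikit-learn are assumptions for illustration, not the plug-in's actual code.

        # Hypothetical feature set for deciding whether two VAD fragments were
        # originally one utterance; the paper reports interval and GMM-based
        # features as the most informative.
        from dataclasses import dataclass
        from sklearn.linear_model import LogisticRegression

        @dataclass
        class FragmentPair:
            interval_sec: float      # pause length between the two VAD segments
            f0_slope_first: float    # prosodic feature of the first fragment
            gmm_single_score: float  # GMM likelihood that the pair is one utterance
            asr_confidence: float    # ASR confidence of the concatenated hypothesis

        def to_vector(p: FragmentPair):
            return [p.interval_sec, p.f0_slope_first,
                    p.gmm_single_score, p.asr_confidence]

        def train_restoration_classifier(pairs, labels):
            """labels[i] is 1 if the i-th pair was originally a single utterance."""
            clf = LogisticRegression(max_iter=1000)
            clf.fit([to_vector(p) for p in pairs], labels)
            return clf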

  • Selecting Help Messages by Using Robust Grammar Verification for Handling Out-of-Grammar Utterances in Spoken Dialogue Systems

    Kazunori KOMATANI  Yuichiro FUKUBAYASHI  Satoshi IKEDA  Tetsuya OGATA  Hiroshi G. OKUNO  

     
    PAPER-Speech and Hearing

    Vol: E93-D No:12    Page(s): 3359-3367

    We address the issue of out-of-grammar (OOG) utterances in spoken dialogue systems by generating help messages. Help message generation for OOG utterances is challenging because language understanding based on automatic speech recognition (ASR) of OOG utterances is usually erroneous; important words are often misrecognized or missing from such utterances. Our grammar verification method uses a weighted finite-state transducer to accurately identify the grammar rule that the user intended to use, even if important words are missing from the ASR results. We then use a ranking algorithm, RankBoost, to rank help message candidates in order of likely usefulness. Its features include the grammar verification results and the utterance history representing the user's experience.
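
    A hedged sketch of ranking help-message candidates by features of the kind the abstract mentions; the candidate structure, feature names, and fixed weights are assumptions standing in for the learned RankBoost model, not the paper's setup.

        # Illustration only: score each help-message candidate from a grammar
        # verification confidence and a simple utterance-history feature, then
        # sort. In the paper this ranking function is learned with RankBoost.
        from dataclasses import dataclass

        @dataclass
        class HelpCandidate:
            message: str
            grammar_verification_score: float  # confidence the rule was intended
            times_rule_used_before: int        # utterance-history feature

        def rank_help_messages(candidates, weights=(1.0, -0.1)):
            """Return candidates ordered from most to least likely useful."""
            def score(c):
                return (weights[0] * c.grammar_verification_score
                        + weights[1] * c.times_rule_used_before)
            return sorted(candidates, key=score, reverse=True)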