Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptation for Spoken Term Detection

Haiyang LI; Tieran ZHENG; Guibin ZHENG; Jiqing HAN

doi:10.1587/transinf.E97.D.554

Summary:

In this paper, we propose a novel confidence measure to improve the performance of spoken term detection (STD). The proposed confidence measure is based on the context consistency between a hypothesized word and its context in a word lattice. The main contribution of this paper is to compute the context consistency while accounting for the uncertainty in the speech recognition results and the effect of topic. To measure the uncertainty of the context, we employ the word occurrence probability, which is obtained by combining the overlapping hypotheses in a word posterior lattice. To handle the effect of topic, we propose a topic adaptation method. The method first classifies the spoken document by topic and then computes the context consistency of the hypothesized word with a topic-specific measure of semantic similarity. We apply the topic-specific measure in two ways: with the information of the top-1 topic only, and with a mixture of all topics according to the topic classification. Experiments on the Hub-4NE Mandarin database show that both the occurrence probability of context words and topic adaptation are effective for the confidence measure of STD. The proposed confidence measure outperforms one that ignores the uncertainty of the context and one that uses a non-topic method.
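
The summary names three ingredients: an occurrence probability for each context word, a context consistency score, and topic adaptation with either the top-1 topic or a mixture of all topics. The following Python sketch shows one plausible way these pieces could fit together. It is not the authors' formulation: the summing of overlapping posteriors, the weighted-average consistency score, and every name in it (`occurrence_probability`, `context_consistency`, `topic_adapted_consistency`, the dictionary layouts) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical lattice entry: (word, start_time, end_time, posterior).
Hypothesis = Tuple[str, float, float, float]
# Hypothetical similarity table: (target_word, context_word) -> similarity.
Similarity = Dict[Tuple[str, str], float]


def occurrence_probability(word: str, start: float, end: float,
                           lattice: List[Hypothesis]) -> float:
    """Combine the hypotheses of `word` that overlap the span [start, end].

    Assumption: overlapping hypotheses are combined by summing their
    posteriors, capped at 1.0. The paper may combine them differently.
    """
    total = sum(p for w, s, e, p in lattice
                if w == word and s < end and e > start)
    return min(total, 1.0)


def context_consistency(target: str, context: Dict[str, float],
                        similarity: Similarity) -> float:
    """Average similarity between `target` and its context words,
    weighted by each context word's occurrence probability (assumed form)."""
    weight = sum(context.values())
    if weight == 0.0:
        return 0.0
    score = sum(prob * similarity.get((target, word), 0.0)
                for word, prob in context.items())
    return score / weight


def topic_adapted_consistency(target: str, context: Dict[str, float],
                              topic_similarity: Dict[str, Similarity],
                              topic_posterior: Dict[str, float],
                              use_top1: bool = False) -> float:
    """Score `target` with topic-specific similarity tables.

    use_top1=True  -> use only the most probable topic.
    use_top1=False -> mix all topics, weighted by the classifier's posterior.
    """
    if use_top1:
        best = max(topic_posterior, key=topic_posterior.get)
        return context_consistency(target, context, topic_similarity[best])
    return sum(p * context_consistency(target, context, topic_similarity[t])
               for t, p in topic_posterior.items())


# Toy data, invented for this example.
lattice = [("bank", 0.0, 0.5, 0.4), ("bank", 0.1, 0.6, 0.3),
           ("tank", 0.0, 0.5, 0.3)]
context = {"money": 0.8, "river": 0.2}
topic_similarity = {
    "finance": {("bank", "money"): 0.9, ("bank", "river"): 0.1},
    "nature": {("bank", "money"): 0.2, ("bank", "river"): 0.8},
}
topic_posterior = {"finance": 0.75, "nature": 0.25}

bank_prob = occurrence_probability("bank", 0.0, 0.6, lattice)
top1_score = topic_adapted_consistency(
    "bank", context, topic_similarity, topic_posterior, use_top1=True)
mixture_score = topic_adapted_consistency(
    "bank", context, topic_similarity, topic_posterior)
```

In this toy setup the top-1 variant relies entirely on the "finance" table, while the mixture variant also lets the less likely "nature" topic contribute in proportion to its posterior, which is the distinction the summary draws between the two ways of applying the topic-specific measure.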

Publication: IEICE TRANSACTIONS on Information Vol.E97-D No.3 pp.554-561

Publication Date: 2014/03/01

Online ISSN: 1745-1361

DOI: 10.1587/transinf.E97.D.554

Type of Manuscript: PAPER

Category: Speech and Hearing

Authors

Haiyang LI
  Harbin Institute of Technology
Tieran ZHENG
  Harbin Institute of Technology
Guibin ZHENG
  Harbin Institute of Technology
Jiqing HAN
  Harbin Institute of Technology

Keyword

spoken term detection, confidence measure, context consistency, semantic similarity, topic adaptation

Haiyang LI, Tieran ZHENG, Guibin ZHENG, Jiqing HAN, "Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptation for Spoken Term Detection," in IEICE TRANSACTIONS on Information, vol. E97-D, no. 3, pp. 554-561, March 2014, doi: 10.1587/transinf.E97.D.554.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E97.D.554/_p

@ARTICLE{e97-d_3_554,
author={Haiyang LI and Tieran ZHENG and Guibin ZHENG and Jiqing HAN},
journal={IEICE TRANSACTIONS on Information},
title={Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptation for Spoken Term Detection},
year={2014},
volume={E97-D},
number={3},
pages={554--561},
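keywords={spoken term detection, confidence measure, context consistency, semantic similarity, topic adaptation},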
doi={10.1587/transinf.E97.D.554},
ISSN={1745-1361},
month={March},}

TY - JOUR
TI - Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptation for Spoken Term Detection
T2 - IEICE TRANSACTIONS on Information
SP - 554
EP - 561
AU - Haiyang LI
AU - Tieran ZHENG
AU - Guibin ZHENG
AU - Jiqing HAN
PY - 2014
DO - 10.1587/transinf.E97.D.554
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2014
ER -