We propose a method to fuse auditory information and visual information for accurate speech recognition. This method fuses the two kinds of information by linear combination after calculating the two kinds of probabilities by HMM for each word. In addition, we use full-frame color images as visual information in order to improve the accuracy of the proposed speech recognition system. We have performed experiments comparing the proposed method with methods using either auditory information or visual information alone, and confirmed the validity of the proposed method.
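The abstract describes a decision-level fusion scheme: for each candidate word, an audio HMM and a visual HMM each produce a score, and the two scores are combined linearly before the best-scoring word is chosen. The sketch below illustrates that idea only; it assumes the scores are per-word log-likelihoods and the fusion weight is a free parameter, and all names and values are illustrative, not taken from the paper.

```python
# Minimal sketch of audio-visual fusion by linear combination of per-word HMM
# scores (late fusion). Assumes precomputed log-likelihoods; all identifiers
# and numbers are hypothetical examples, not from the paper.

def fuse_and_recognize(audio_loglik, visual_loglik, alpha=0.7):
    """Return the word whose weighted sum of audio and visual scores is largest.

    audio_loglik, visual_loglik: dicts mapping each candidate word to the
        log-likelihood assigned by its audio HMM / visual (full-frame image) HMM.
    alpha: fusion weight in [0, 1]; alpha=1 uses audio only, alpha=0 visual only.
    """
    scores = {
        word: alpha * audio_loglik[word] + (1.0 - alpha) * visual_loglik[word]
        for word in audio_loglik
    }
    return max(scores, key=scores.get)

# Example usage with made-up scores for three candidate words.
audio = {"word_a": -120.5, "word_b": -131.2, "word_c": -128.9}
visual = {"word_a": -88.3, "word_b": -80.1, "word_c": -95.6}
print(fuse_and_recognize(audio, visual, alpha=0.6))  # -> "word_a"
```

In practice the fusion weight would be tuned on held-out data, since the best balance between the acoustic and visual streams depends on conditions such as noise level.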
Satoru IGAWA, Akio OGIHARA, Akira SHINTANI, Shinobu TAKAMATSU, "Speech Recognition Based on Fusion of Visual and Auditory Information Using Full-Frame Color Image," in IEICE TRANSACTIONS on Fundamentals, vol. E79-A, no. 11, pp. 1836-1840, November 1996.
Abstract: We propose a method to fuse auditory information and visual information for accurate speech recognition. This method fuses the two kinds of information by linear combination after calculating the two kinds of probabilities by HMM for each word. In addition, we use full-frame color images as visual information in order to improve the accuracy of the proposed speech recognition system. We have performed experiments comparing the proposed method with methods using either auditory information or visual information alone, and confirmed the validity of the proposed method.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e79-a_11_1836/_p
@ARTICLE{e79-a_11_1836,
author={Satoru IGAWA and Akio OGIHARA and Akira SHINTANI and Shinobu TAKAMATSU},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Speech Recognition Based on Fusion of Visual and Auditory Information Using Full-Frame Color Image},
year={1996},
volume={E79-A},
number={11},
pages={1836-1840},
abstract={We propose a method to fuse auditory information and visual information for accurate speech recognition. This method fuses the two kinds of information by linear combination after calculating the two kinds of probabilities by HMM for each word. In addition, we use full-frame color images as visual information in order to improve the accuracy of the proposed speech recognition system. We have performed experiments comparing the proposed method with methods using either auditory information or visual information alone, and confirmed the validity of the proposed method.},
keywords={},
doi={},
ISSN={},
month={November},}
TY - JOUR
TI - Speech Recognition Based on Fusion of Visual and Auditory Information Using Full-Frame Color Image
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1836
EP - 1840
AU - Satoru IGAWA
AU - Akio OGIHARA
AU - Akira SHINTANI
AU - Shinobu TAKAMATSU
PY - 1996
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E79-A
IS - 11
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - November 1996
AB - We propose a method to fuse auditory information and visual information for accurate speech recognition. This method fuses the two kinds of information by linear combination after calculating the two kinds of probabilities by HMM for each word. In addition, we use full-frame color images as visual information in order to improve the accuracy of the proposed speech recognition system. We have performed experiments comparing the proposed method with methods using either auditory information or visual information alone, and confirmed the validity of the proposed method.
ER -