Neural Networks and the Time-Sliced Paradigm for Speech Recognition

Ingrid KIRSCHNING; Jun-Ichi AOE

Summary:

The time-slicing paradigm is a newly developed method for training neural networks for speech recognition. The neural net is trained to spot the syllables in a continuous stream of speech and generates a transcription of the utterance, be it a word, a phrase, etc. When the transcription is combined with a simple error recovery method, the desired units (words or phrases) can be retrieved. The paradigm uses a recurrent neural network trained in a modular fashion with natural connectionist glue. It processes the input signal sequentially, regardless of the input's length, and immediately extracts the syllables spotted in the speech stream. As an example, the resulting character string is then compared to a set of possible words, picking out the five closest candidates. In this paper we describe the time-slicing paradigm and the training of the recurrent neural network, together with details about the training samples. We also introduce the concept of natural connectionist glue and the recurrent neural network's architecture used for this purpose. Additionally, we explain the errors found in the output and the process used to reduce them and recover the correct words. The recognition rates of the network and the recovery rates for the words are also shown. The presented examples and recognition rates demonstrate the potential of the time-slicing method for continuous speech recognition.
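
The candidate-selection step mentioned in the summary (comparing the transcribed character string to a set of possible words and keeping the five closest) could be sketched roughly as follows. The summary does not say which similarity measure the authors use, so this sketch assumes plain Levenshtein edit distance; the function names `edit_distance` and `closest_candidates` and the example vocabulary are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings.

    Assumption: the paper's actual comparison measure is not given in
    the summary; edit distance is used here only as a stand-in.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def closest_candidates(transcription: str,
                       vocabulary: List[str],
                       k: int = 5) -> List[Tuple[str, int]]:
    """Return up to k vocabulary words closest to the transcription."""
    scored = [(word, edit_distance(transcription, word)) for word in vocabulary]
    # Sort by distance first, then alphabetically so ties are deterministic.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical vocabulary and a hypothetical noisy transcription.
    words = ["tokyo", "kyoto", "osaka", "nagoya", "sapporo", "kobe", "nara"]
    for word, dist in closest_candidates("tokyu", words):
        print(word, dist)
```

With `k=5` this mirrors the "five closest candidates" in the summary; a real system would likely compare syllable sequences rather than raw characters, which this sketch does not attempt.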

Publication: IEICE TRANSACTIONS on Information Vol.E79-D No.12 pp.1690-1699

Publication Date: 1996/12/25

Type of Manuscript: PAPER

Category: Speech Processing and Acoustics

Ingrid KIRSCHNING, Jun-Ichi AOE, "Neural Networks and the Time-Sliced Paradigm for Speech Recognition," in IEICE TRANSACTIONS on Information, vol. E79-D, no. 12, pp. 1690-1699, December 1996.
Abstract: The Time-Slicing paradigm is a newly developed method for the training of neural networks for speech recognition. The neural net is trained to spot the syllables in a continuous stream of speech. It generates a transcription of the utterance, be it a word, a phrase, etc. Combined with a simple error recovery method the desired units (words or phrases) can be retrieved. This paradigm uses a recurrent neural network trained in a modular fashion with natural connectionist glue. It processes the input signal sequentially regardless of the input's length and immediately extracts the syllables spotted in the speech stream. As an example, this character string is then compared to a set of possible words, picking out the five closest candidates. In this paper we describe the time-slicing paradigm and the training of the recurrent neural network together with details about the training samples. It also introduces the concept of natural connectionist glue and the recurrent neural network's architecture used for this purpose. Additionally we explain the errors found in the output and the process to reduce them and recover the correct words. The recognition rates of the network and the recovery rates for the words are also shown. The presented examples and recognition rates demonstrate the potential of the time-slicing method for continuous speech recognition.
URL: https://global.ieice.org/en_transactions/information/10.1587/e79-d_12_1690/_p

@ARTICLE{e79-d_12_1690,
author={Ingrid KIRSCHNING and Jun-Ichi AOE},
journal={IEICE TRANSACTIONS on Information},
title={Neural Networks and the Time-Sliced Paradigm for Speech Recognition},
year={1996},
volume={E79-D},
number={12},
pages={1690-1699},
abstract={The Time-Slicing paradigm is a newly developed method for the training of neural networks for speech recognition. The neural net is trained to spot the syllables in a continuous stream of speech. It generates a transcription of the utterance, be it a word, a phrase, etc. Combined with a simple error recovery method the desired units (words or phrases) can be retrieved. This paradigm uses a recurrent neural network trained in a modular fashion with natural connectionist glue. It processes the input signal sequentially regardless of the input's length and immediately extracts the syllables spotted in the speech stream. As an example, this character string is then compared to a set of possible words, picking out the five closest candidates. In this paper we describe the time-slicing paradigm and the training of the recurrent neural network together with details about the training samples. It also introduces the concept of natural connectionist glue and the recurrent neural network's architecture used for this purpose. Additionally we explain the errors found in the output and the process to reduce them and recover the correct words. The recognition rates of the network and the recovery rates for the words are also shown. The presented examples and recognition rates demonstrate the potential of the time-slicing method for continuous speech recognition.},
month={December},
}

TY - JOUR
TI - Neural Networks and the Time-Sliced Paradigm for Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 1690
EP - 1699
AU - Ingrid KIRSCHNING
AU - Jun-Ichi AOE
PY - 1996
JO - IEICE TRANSACTIONS on Information
VL - E79-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 1996
AB - The Time-Slicing paradigm is a newly developed method for the training of neural networks for speech recognition. The neural net is trained to spot the syllables in a continuous stream of speech. It generates a transcription of the utterance, be it a word, a phrase, etc. Combined with a simple error recovery method the desired units (words or phrases) can be retrieved. This paradigm uses a recurrent neural network trained in a modular fashion with natural connectionist glue. It processes the input signal sequentially regardless of the input's length and immediately extracts the syllables spotted in the speech stream. As an example, this character string is then compared to a set of possible words, picking out the five closest candidates. In this paper we describe the time-slicing paradigm and the training of the recurrent neural network together with details about the training samples. It also introduces the concept of natural connectionist glue and the recurrent neural network's architecture used for this purpose. Additionally we explain the errors found in the output and the process to reduce them and recover the correct words. The recognition rates of the network and the recovery rates for the words are also shown. The presented examples and recognition rates demonstrate the potential of the time-slicing method for continuous speech recognition.
ER -