This paper presents an overview of a Japanese text dictation system composed of an acoustic processor and a linguistic processor. The system handles 843 conceptual words and 431 functional words. Phoneme recognition is carried out using a modified LVQ2 method that we propose. The phoneme recognition score was 86.1% for 226 sentences uttered by two male speakers. The linguistic processor consists of a processor for spotting Bunsetsu-units and a syntactic processor. The structure of the Bunsetsu-unit is described by a finite-state automaton whose test-set perplexity is 230. The spotting processor uses a syntax-driven continuous-DP matching algorithm to spot Bunsetsu-units in the recognized phoneme sequence and then generates a Bunsetsu-unit lattice. The syntactic processor parses this lattice based on a dependency grammar, which is expressed as the correspondence between a FEATURE marker in a modifier-Bunsetsu and a SLOT-FILLER marker in a head-Bunsetsu. The recognition scores for Bunsetsu-units and conceptual words were 73.2% and 85.7%, respectively, for 226 sentences uttered by the two male speakers.
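The FEATURE / SLOT-FILLER correspondence can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Bunsetsu` class, the `can_modify` and `parse` functions, and the marker names `"subject"` and `"object"` are hypothetical. It also assumes a single Bunsetsu sequence rather than a lattice, that every non-final Bunsetsu modifies exactly one later Bunsetsu, and that dependency arcs do not cross.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bunsetsu:
    """Hypothetical Bunsetsu-unit record; marker names are illustrative only."""
    text: str
    features: frozenset = frozenset()      # FEATURE markers offered as a modifier
    slot_fillers: frozenset = frozenset()  # SLOT-FILLER markers accepted as a head


def can_modify(modifier: Bunsetsu, head: Bunsetsu) -> bool:
    # A dependency is allowed when some FEATURE marker of the modifier
    # corresponds to a SLOT-FILLER marker of the head.
    return bool(modifier.features & head.slot_fillers)


def parse(units):
    """Return the head index of each unit (None for the last unit),
    or None when no consistent assignment exists.

    Assumed constraints (not taken from the paper): every non-final unit
    modifies one later unit, and arcs do not cross.
    """
    n = len(units)
    heads = [None] * n

    def crosses(i, h):
        # Arc (i, h) crosses an already assigned arc (j, heads[j]) with j > i
        # when j lies inside (i, h) but its head lies beyond h.
        return any(i < j < h < heads[j] for j in range(i + 1, n - 1))

    def assign(i):
        # Assign heads from right to left, backtracking on failure.
        if i < 0:
            return True
        for h in range(i + 1, n):
            if can_modify(units[i], units[h]) and not crosses(i, h):
                heads[i] = h
                if assign(i - 1):
                    return True
        heads[i] = None
        return False

    return heads if assign(n - 2) else None


sentence = [
    Bunsetsu("watashi-wa", features=frozenset({"subject"})),
    Bunsetsu("hon-o", features=frozenset({"object"})),
    Bunsetsu("yonda", slot_fillers=frozenset({"subject", "object"})),
]
print(parse(sentence))
```

In this toy sentence both nominal Bunsetsu attach to the final verb. A lattice parser would additionally have to choose among competing Bunsetsu hypotheses and combine their matching scores, which this sketch leaves out.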
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Shozo MAKINO, Akinori ITO, Mitsuru ENDO, Ken'iti KIDO, "A Japanese Text Dictation System Based on Phoneme Recognition and a Dependency Grammar," in IEICE TRANSACTIONS on Fundamentals, vol. E74-A, no. 7, pp. 1773-1782, July 1991.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e74-a_7_1773/_p
@ARTICLE{e74-a_7_1773,
author={Shozo MAKINO and Akinori ITO and Mitsuru ENDO and Ken'iti KIDO},
journal={IEICE TRANSACTIONS on Fundamentals},
title={A Japanese Text Dictation System Based on Phoneme Recognition and a Dependency Grammar},
year={1991},
volume={E74-A},
number={7},
pages={1773-1782},
month={July},}
TY - JOUR
TI - A Japanese Text Dictation System Based on Phoneme Recognition and a Dependency Grammar
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1773
EP - 1782
AU - Shozo MAKINO
AU - Akinori ITO
AU - Mitsuru ENDO
AU - Ken'iti KIDO
PY - 1991
JO - IEICE TRANSACTIONS on Fundamentals
VL - E74-A
IS - 7
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - July 1991
ER -