In this paper, we investigate language models based on a context-free grammar (CFG), a bigram, and a quasi/simplified trigram. To estimate the bigram and quasi/simplified-trigram statistics, we used a set of semantically legal sentences generated randomly from the CFG. We compared the models in terms of perplexity and sentence recognition accuracy. The sentence recognition experiments were carried out on the "UNIX-QA" task with a vocabulary of 521 words. The perplexities of the bigram and the quasi-trigram were about 1.5-1.7 times and 1.2-1.3 times larger, respectively, than the perplexity of the CFG corresponding to the most restricted grammar (perplexity = 10.0). We conclude that the quasi-trigram has almost the same modeling ability as the restricted CFG when the set of plausible sentences in the task is given.
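As an illustration of the perplexity comparison described above, the following is a minimal sketch of how bigram statistics can be estimated from a set of sentences and how the resulting perplexity can be computed. The function names, the `<s>`/`</s>` boundary markers, the add-alpha smoothing, and the toy sentences are assumptions made for this example; the abstract does not specify the estimation or smoothing procedure that was actually used.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts, adding <s> and </s> markers."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        vocab.update(tokens[1:])  # every symbol that can be predicted
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def bigram_prob(prev, cur, context_counts, bigram_counts, vocab, alpha=1.0):
    """P(cur | prev) with add-alpha smoothing (an assumption of this sketch)."""
    numerator = bigram_counts[(prev, cur)] + alpha
    denominator = context_counts[prev] + alpha * len(vocab)
    return numerator / denominator


def perplexity(sentences, context_counts, bigram_counts, vocab, alpha=1.0):
    """Perplexity = 2 ** (average negative log2 probability per predicted token)."""
    log_sum = 0.0
    n_tokens = 0
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = bigram_prob(prev, cur, context_counts, bigram_counts, vocab, alpha)
            log_sum += math.log2(p)
            n_tokens += 1
    return 2 ** (-log_sum / n_tokens)


# Toy stand-in for sentences generated from a grammar (hypothetical data).
toy_sentences = [["list", "files"], ["list", "directories"], ["show", "files"]]
counts = train_bigram(toy_sentences)
print(perplexity(toy_sentences, *counts))
```

A lower perplexity means the model leaves fewer equally likely choices for the next word on average, which is the sense in which the models above are compared.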
Seiichi NAKAGAWA Yoshimitsu HIRATA Isao MURASE Tomohiro TANOUE
This paper describes and compares syntax- and semantics-oriented spoken Japanese understanding systems, named "SPOJUS-SYNO" and "SPOJUS-SEMO", respectively. First, these systems automatically build word-unit hidden Markov models (HMMs) by concatenating syllables. A word lattice is then hypothesized for an input utterance by using a word spotting algorithm and the word-based HMMs. In SPOJUS-SYNO, a time-synchronous left-to-right parsing algorithm finds the best word sequence in the word lattice according to syntactic and semantic knowledge represented by a context-free semantic grammar. In SPOJUS-SEMO, syntactic and semantic knowledge is represented by a dependency and case grammar. These systems were implemented for the "UNIX-QA" task with a vocabulary of 521 words. Experimental results show that the sentence recognition and understanding rates were about 80% and 87%, respectively, for six male speakers with SPOJUS-SYNO, whereas performance with SPOJUS-SEMO was very low.
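The idea of finding the best word sequence in a word lattice under grammatical constraints can be sketched as a left-to-right dynamic-programming pass. This is a simplified illustration, not the parsing algorithm of SPOJUS-SYNO: a set of allowed word pairs stands in for the context-free semantic grammar, word hypotheses are assumed to abut exactly (no gaps or overlaps), and the function name, the lattice format, and the scores are hypothetical.

```python
def best_word_sequence(lattice, allowed_pairs, start_words, end_words, num_frames):
    """Return (score, words) for the best grammatical path, or None.

    lattice: list of (word, start_frame, end_frame, log_score) hypotheses,
             with end_frame exclusive and start_frame < end_frame.
    allowed_pairs: set of (previous_word, word) pairs (stand-in for a grammar).
    """
    # best[(end_frame, word)] = (total_log_score, word_list)
    best = {}
    # Processing hypotheses in order of end frame guarantees that every
    # possible predecessor has been finalized before it is extended.
    for word, start, end, score in sorted(lattice, key=lambda h: h[2]):
        candidates = []
        if start == 0 and word in start_words:
            candidates.append((score, [word]))
        for (prev_end, prev_word), (prev_score, prev_seq) in best.items():
            if prev_end == start and (prev_word, word) in allowed_pairs:
                candidates.append((prev_score + score, prev_seq + [word]))
        if candidates:
            top = max(candidates, key=lambda c: c[0])
            key = (end, word)
            if key not in best or top[0] > best[key][0]:
                best[key] = top
    finals = [
        value
        for (end, word), value in best.items()
        if end == num_frames and word in end_words
    ]
    return max(finals, key=lambda c: c[0]) if finals else None


# Hypothetical lattice over 6 frames with log-likelihood-style scores.
toy_lattice = [
    ("list", 0, 3, -1.0),
    ("show", 0, 3, -0.5),
    ("files", 3, 6, -1.0),
    ("list", 3, 6, -0.2),
]
result = best_word_sequence(
    toy_lattice,
    allowed_pairs={("list", "files"), ("show", "files")},
    start_words={"list", "show"},
    end_words={"files"},
    num_frames=6,
)
print(result)
```

In this toy lattice, the acoustically strongest second word ("list" in frames 3 to 6) is rejected because no allowed pair leads to it and it is not a legal final word, which shows how linguistic constraints prune the lattice.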