IEICE global.ieice.org Site

Keyword Search Result

[Keyword] EE(4073hit)

4001-4020hit(4073hit)

Effects of Link Communication Time on Optimal Load Balancing in Tree Hierarchy Network Configurations
Jie LI Hisao KAMEDA Kentaro SHIMIZU

PAPER-Computer Networks

Vol: E76-D No:2 Page(s): 199-209
In this paper, optimal static load balancing in a tree hierarchy network that consists of a set of heterogeneous host computers is considered. It is formulated as a nonlinear optimization problem. We study, by parametric analysis, the effects of the link communication time on the optimal link flow rate (i.e., the rate at which a node forwards jobs to other nodes for remote processing), the optimal node load (i.e., the rate at which jobs are processed at a node), and the optimal mean response time. We show that the entire network can be divided into several independent sub-tree networks with respect to the link flow rates and node loads. We find that the communication time of a link affects only the link flow rates and the loads of nodes that are in the same sub-tree network. An increase in the communication time of a link decreases the link flow rates of its descendant nodes, its ancestor nodes and itself, but increases the link flow rates of other nodes in the same sub-tree network. It also increases the loads of its descendant nodes and itself, but decreases the loads of other nodes in the same sub-tree network. In general, it increases the mean response time.
Some Properties of Kleene-Stone Logic Functions and Their Canonical Disjunctive Form
Noboru TAKAGI Masao MUKAIDONO

PAPER-Computer Hardware and Design

Vol: E76-D No:2 Page(s): 163-170
In this paper, we define Kleene-Stone logic functions, which are functions F: [0, 1]^n → [0, 1] obtained by introducing the intuitionistic negation into fuzzy logic functions; they can easily represent the concepts of necessity and possibility, which are important concepts in many-valued logic systems. A set of Kleene-Stone logic functions is one of the models of Kleene-Stone algebra, which is both a Kleene algebra and a Stone algebra, just as a set of fuzzy logic functions is one of the models of Kleene algebra. In particular, this paper describes some algebraic properties and a representation of Kleene-Stone logic functions.
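As a rough illustration of the operations mentioned in this abstract, the following Python sketch evaluates fuzzy-logic connectives on [0, 1] together with an intuitionistic negation. The definitions are assumptions chosen to match the usual textbook reading (min/max for AND/OR, 1 - x for the Kleene negation, and a negation that is 1 only at 0); the function names and the example function `F` are hypothetical and are not taken from the paper.

```python
def kleene_not(x: float) -> float:
    """Fuzzy (Kleene) negation: 1 - x."""
    return 1.0 - x


def stone_not(x: float) -> float:
    """Intuitionistic negation (assumed form): 1 if x is exactly 0, else 0."""
    return 1.0 if x == 0.0 else 0.0


def possibility(x: float) -> float:
    """Double intuitionistic negation: 1 whenever x is not completely false."""
    return stone_not(stone_not(x))


def necessity(x: float) -> float:
    """Intuitionistic negation of the Kleene negation: 1 only when x is exactly 1."""
    return stone_not(kleene_not(x))


def F(x: float, y: float) -> float:
    """Hypothetical two-variable function built from the operations above."""
    return max(min(x, kleene_not(y)), necessity(y))


if __name__ == "__main__":
    for v in (0.0, 0.25, 0.5, 1.0):
        print(v, possibility(v), necessity(v))
    print(F(0.25, 0.5), F(0.25, 1.0))
```

Under these assumed definitions, necessity and possibility are two-valued even though the inputs range over the whole unit interval, which is one way to read the abstract's remark that the functions "can easily represent" those concepts.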
Hybrid Photonic-Microwave Systems and Devices
Peter R. HERCZFELD

INVITED PAPER

Vol: E76-C No:2 Page(s): 191-197
Research in optical microwave interaction, at its earlier stages, was spurred by the desire to make an optically fed and controlled phased array antenna with monolithic microwave integrated circuit (MMIC) transmit/receive (T/R) modules. In the first part of this paper experimental results are presented demonstrating an optically fed phased array antenna operating at C-band in the 5.5 to 5.8 GHz frequency range. The present system consists of two optically fed 14 subarrays with MMIC based active T/R modules. Custom designed fiber optic links have been employed to provide distribution of data and frequency reference signals to the phased array antenna. One of the challenges of the future is the development of better interfaces between electronic (microwave) and optical components, including the chip level merging of photonic and electronic components on III-V compounds. This aspect of the research is covered in the second half of the paper.
Speaker Adaptation Based on Vector Field Smoothing
Hiroaki HATTORI Shigeki SAGAYAMA

PAPER-Speech Processing

Vol: E76-D No:2 Page(s): 227-234
This paper describes a new supervised speaker adaptation method based on vector field smoothing, for small size adaptation data. This method assumes that the correspondence of feature vectors between speakers can be viewed as a kind of smooth vector field, and interpolation and smoothing of the correspondence are introduced into the adaptation process for higher adaptation performance with small size data. The proposed adaptation method was applied to discrete HMM based speech recognition and evaluated in Japanese phoneme and phrase recognition experiments. Using 10 words as the adaptation data, the proposed method produced almost the same results as the conventional codebook mapping method with 25 words. These experiments clearly confirmed the effectiveness of the proposed method.
A Characterization of Kleene-Stone Logic Functions
Noboru TAKAGI Masao MUKAIDONO

PAPER-Computer Hardware and Design

Vol: E76-D No:2 Page(s): 171-178
Kleene-Stone algebra is both a Kleene algebra and a Stone algebra. The set of Kleene-Stone logic functions discussed in this paper is one of the models of Kleene-Stone algebra, and these functions can easily represent the concepts of necessity and possibility, which are important concepts for many-valued logic systems. The main results of this paper are a necessary and sufficient condition for a function to be a Kleene-Stone logic function and a formula for the number of n-variable Kleene-Stone logic functions.
Three Different LR Parsing Algorithms for Phoneme-Context-Dependent HMM-Based Continuous Speech Recognition
Akito NAGAI Shigeki SAGAYAMA Kenji KITA Hideaki KIKUCHI

PAPER

Vol: E76-D No:1 Page(s): 29-37
This paper discusses three approaches for combining an efficient LR parser and phoneme-context-dependent HMMs and compares them through continuous speech recognition experiments. In continuous speech recognition, phoneme-context-dependent allophonic models are considered very helpful for enhancing the recognition accuracy. They precisely represent allophonic variations caused by the difference in phoneme-contexts. With grammatical constraints based on a context free grammar (CFG), a generalized LR parser is one of the most efficient parsing algorithms for speech recognition. Therefore, the combination of allophonic models and a generalized LR parser is a powerful scheme enabling accurate and efficient speech recognition. In this paper, three phoneme-context-dependent LR parsing algorithms are proposed, which make it possible to drive allophonic HMMs. The algorithms are outlined as follows: (1) Algorithm for predicting the phonemic context dynamically in the LR parser using a phoneme-context-independent LR table. (2) Algorithm for converting an LR table into a phoneme-context-dependent LR table. (3) Algorithm for converting a CFG into a phoneme-context-dependent CFG. This paper also includes discussion of the results of recognition experiments, and a comparison of performance and efficiency of these three algorithms.
A Linguistic Procedure for an Extension Number Guidance System
Naomi INOUE Izuru NOGAITO Masahiko TAKAHASHI

PAPER

Vol: E76-D No:1 Page(s): 106-111
This paper describes the linguistic procedure of our speech dialogue system. The procedure is composed of two processes, syntactic analysis using a finite state network, and discourse analysis using a plan recognition model. The finite state network is compiled from regular grammar. The regular grammar is described in order to accept sentences with various styles, for example ellipsis and inversion. The regular grammar is automatically generated from the skeleton of the grammar. The discourse analysis module understands the utterance, generates the next question for users and also predicts words which will be in the next utterance. For an extension number guidance task, we obtained correct recognition results for 93% of input sentences without word prediction and for 98% if prediction results include proper words.
LR Parsing with a Category Reachability Test Applied to Speech Recognition
Kenji KITA Tsuyoshi MORIMOTO Shigeki SAGAYAMA

PAPER

Vol: E76-D No:1 Page(s): 23-28
In this paper, we propose an extended LR parsing algorithm, called LR parsing with a category reachability test (the LR-CRT algorithm). The LR-CRT algorithm enables a parser to efficiently recognize those sentences that belong to a specified grammatical category. The key point of the algorithm is to use an augmented LR parsing table in which each action entry contains a set of reachable categories. When executing a shift or reduce action, the parser checks whether the action can reach a given category using the augmented table. We apply the LR-CRT algorithm to improve a speech recognition system based on two-level LR parsing. This system uses two kinds of grammars, inter- and intra-phrase grammars, to recognize Japanese sentential speech. Two-level LR parsing guides the search of speech recognition through two-level symbol prediction, phrase category prediction and phone prediction, based on these grammars. The LR-CRT algorithm makes efficient phone prediction possible on the basis of the phrase category prediction. The system was evaluated using sentential speech data uttered phrase by phrase, and attained a word accuracy of 97.5% and a sentence accuracy of 91.2%.
Methods to Securely Realize Caller-Authenticated and Callee-Specified Telephone Calls
Tomoyuki ASANO Tsutomu MATSUMOTO Hideki IMAI

PAPER

Vol: E76-A No:1 Page(s): 88-95
This paper presents two methods for securely realizing caller-authenticated and callee-specified calls over telecommunication networks with terminals that accept IC cards having KPS-based cryptographic functions. In the proposed protocols, users can verify that the partner is the proper owner of a certain ID or a certain pen name. Users' privacy is protected even when they make caller-authenticated and callee-specified calls and do not pay their telephone charges in advance.
Predicting the Next Utterance Linguistic Expressions Using Contextual Information
Hitoshi IIDA Takayuhi YAMAOKA Hidekazu ARITA

PAPER

Vol: E76-D No:1 Page(s): 62-73
A context-sensitive method to predict linguistic expressions in the next utterance in inquiry dialogues is proposed. First, information of the next utterance, the utterance type, the main action and the discourse entities, can be grasped using a dialogue interpretation model. Secondly, focusing in particular on dialogue situations in context, a domain-dependent knowledge-base for literal usage of both noun phrases and verb phrases is developed. Finally, a strategy to make a set of linguistic expressions which are derived from semantic concepts consisting of appropriate expressions can be used to select the correct candidate from the speech recognition output. In this paper, some of the processes are particularly examined in which sets of polite expressions, vocatives, compound nominal phrases, verbal phrases, and intention expressions, which are common in telephone inquiry dialogue, are created.
A Real-Time Speech Dialogue System Using Spontaneous Speech Understanding
Yoichi TAKEBAYASHI Hiroyuki TSUBOI Hiroshi KANAZAWA Yoichi SADAMOTO Hideki HASHIMOTO Hideaki SHINCHI

PAPER

Vol: E76-D No:1 Page(s): 112-120
This paper describes a task-oriented speech dialogue system based on spontaneous speech understanding and response generation (TOSBURG). The system has been developed for a fast food ordering task using speaker-independent keyword-based spontaneous speech understanding. Its purpose being to understand the user's intention from spontaneous speech, the system consists of a noise-robust keyword-spotter, a semantic keyword lattice parser, a user-initiated dialogue manager and a multimodal response generator. After noise immunity keyword-spotting is performed, the spotted keyword candidates are analyzed by a keyword lattice parser to extract the semantic content of the input speech. Then, referring to the dialogue history and context, the dialogue manager interprets the semantic content of the input speech. In cases where the interpretation is ambiguous or uncertain, the dialogue manager invites the user to confirm verbally the system's understanding of the speech input. The system's response to the user throughout the dialogue is multimodal; that is, several modes of communication (synthesized speech, text, animated facial expressions and ordered food items) are used to convey the system's state to the user. The object here is to emulate the multimodal interaction that occurs between humans, and so achieve more natural and efficient human-computer interaction. The real-time dialogue system has been constructed using two general purpose workstations and four DSP accelerators (520MFLOPS). Experimental results have shown the effectiveness of the newly developed speech dialogue system.
Design and Creation of Speech and Text Corpora of Dialogue
Satoru HAYAMIZU Shuichi ITAHASHI Tetsunori KOBAYASHI Toshiyuki TAKEZAWA

INVITED PAPER

Vol: E76-D No:1 Page(s): 17-22
This paper describes issues concerning dialogue corpora for speech and natural language research. Speech and text corpora of dialogue have recently become more important for the development and the evaluation of speech and text-based dialogue systems. However, the design and the construction of dialogue corpora themselves still remain research issues and many problems have not yet been clarified. Many kinds of corpus are necessary to study various aspects of dialogues. On the other hand, each corpus should contain a certain quantity for each purpose in order to make it statistically meaningful. This paper presents the issues related to the design and creation of dialogue corpora: the selection of a task domain, transcription conventions, situations for the collection, syntactic and semantic ill-formedness, and politeness. Future directions for dialogue corpora creation are also discussed.
A Unification-Based Japanese Parser for Speech-to-Speech Translation
Masaaki NAGATA Tsuyoshi MORIMOTO

PAPER

Vol: E76-D No:1 Page(s): 51-61
A unification-based Japanese parser has been implemented for an experimental Japanese-to-English spoken language translation system (SL-TRANS). The parser consists of a unification-based spoken-style Japanese grammar and an active chart parser. The grammar handles the syntactic, semantic, and pragmatic constraints in an integrated fashion using HPSG-based framework in order to cope with speech recognition errors. The parser takes multiple sentential candidates from the HMM-LR speech recognizer, and produces a semantic representation associated with the best scoring parse based on acoustic and linguistic plausibility. The unification-based parser has been tested using 12 dialogues in the conference registration domain, which include 261 sentences uttered by one male speaker. The sentence recognition accuracy of the underlying speech recognizer is 73.6% for the top candidate, and 83.5% for the top three candidates, where the test-set perplexity of the CFG grammar is 65. By ruling out erroneous speech recognition results using various linguistic constraints, the parser improves the sentence recognition accuracy up to 81.6% for the top candidate, and 85.8% for the top three candidates. From the experiment result, we found that the combination of syntactic restriction, selectional restriction and coordinate structure restriction can provide a sufficient restriction to rule out the recognition errors between case-marking particles with the same vowel, which are the type of errors most likely to occur. However, we also found that it is necessary to use pragmatic information, such as topic, presupposition, and discourse structure, to rule out the recognition errors involved with topicalizing particles and sentence final particles.
Prospects for Advanced Spoken Dialogue Processing
Hitoshi IIDA

INVITED PAPER

Vol: E76-D No:1 Page(s): 2-8
This paper discusses the problems facing spoken dialogue processing and the prospects for future improvements. Research on elemental topics like speech recognition, speech synthesis and language understanding has led to improvements in the accuracy and sophistication of each area of study. First, in order to handle a spoken dialogue, we show the necessity for information exchanges between each area of processing as seen through the analysis of spoken dialogue characteristics. Second, we discuss how to integrate those processes and show that the memory-based approach to spontaneous speech interpretation offers a solution to the problem of process integration. The key to this is setting up a mental state affected by both speech and linguistic information. Finally, we discuss how those mental states are structured and a method for constructing them.
Task Adaptation in Syllable Trigram Models for Continuous Speech Recognition
Sho-ichi MATSUNAGA Tomokazu YAMADA Kiyohiro SHIKANO

PAPER

Vol: E76-D No:1 Page(s): 38-43
In speech recognition systems dealing with unlimited vocabulary and based on stochastic language models, when the target recognition task is changed, recognition performance decreases because the language model is no longer appropriate. This paper describes two approaches for adapting a specific/general syllable trigram model to a new task. One uses a small amount of text data similar to the target task, and the other uses supervised learning using the most recent input phrases and similar text. In this paper, these adaptation methods are called "preliminary learning" and "successive learning", respectively. These adaptations are evaluated using syllable perplexity and phrase recognition rates. The perplexity was reduced from 24.5 to 14.3 for the adaptation using 1000 phrases of similar text by preliminary learning, and was reduced to 12.1 using 1000 phrases including the 100 most recent phrases by successive learning. The recognition rates were also improved from 42.3% to 51.3% and 52.9%, respectively. Text similarity for the approaches is also studied in this paper.
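The syllable perplexity used as an evaluation measure in this abstract can be sketched in a few lines of Python. This is only an illustration of the standard definition (the exponential of the negative mean log-probability of each syllable given its two predecessors); the add-alpha smoothing, the function names and the toy syllable sequence are assumptions made here and do not reproduce the paper's models or data.

```python
import math
from collections import Counter


def train_trigram(syllables, vocab_size, alpha=1.0):
    """Build p(c | a, b) from a syllable list, with add-alpha smoothing (an assumption)."""
    trigrams = list(zip(syllables, syllables[1:], syllables[2:]))
    tri_counts = Counter(trigrams)
    # Count each two-syllable context only where a third syllable follows it.
    ctx_counts = Counter((a, b) for a, b, _ in trigrams)

    def prob(a, b, c):
        return (tri_counts[(a, b, c)] + alpha) / (ctx_counts[(a, b)] + alpha * vocab_size)

    return prob


def perplexity(prob, syllables):
    """exp of the negative average log-probability over all trigrams in the text."""
    log_sum = 0.0
    n = 0
    for a, b, c in zip(syllables, syllables[1:], syllables[2:]):
        log_sum += math.log(prob(a, b, c))
        n += 1
    return math.exp(-log_sum / n)


if __name__ == "__main__":
    vocab_size = 5                       # hypothetical syllable inventory size
    task_text = ["ka", "na"] * 10        # toy stand-in for text similar to the task
    test_text = ["ka", "na", "ka", "na"]

    untrained = train_trigram([], vocab_size)       # uniform: every syllable gets 1/5
    adapted = train_trigram(task_text, vocab_size)  # trained on task-like text

    print(perplexity(untrained, test_text))
    print(perplexity(adapted, test_text))
```

A model trained on text resembling the target task assigns higher probability to the test syllables and therefore yields a lower perplexity than the untrained one, which is the direction of change the abstract reports.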
A Spoken Dialog System with Verification and Clarification Queries
Mikio YAMAMOTO Satoshi KOBAYASHI Yuji MORIYA Seiichi NAKAGAWA

PAPER

Vol: E76-D No:1 Page(s): 84-94
We studied the manner of clarification and verification in real dialogs and developed a spoken dialog system that can cope with the disambiguation of meanings of user input utterances. We analyzed content, query types and responses of human clarification queries. In human-human communications, ten percent of all sentences are concerned with meaning clarification. Therefore, in human-machine communications, we believe it is important that the machine verifies ambiguities occurring in dialog processing. We propose an architecture for a dialog system with this capability. Also, we have investigated the source of ambiguities in dialog processing and methods of dialog clarification for each part of the dialog system.
MASCOTS II: A Dialog Manager in General Interface for Speech Input and Output
Yoichi YAMASHITA Hideaki YOSHIDA Takashi HIRAMATSU Yasuo NOMURA Riichiro MIZOGUCHI

PAPER

Vol: E76-D No:1 Page(s): 74-83
This paper describes a general interface system for speech input and output and a dialog management system, MASCOTS, which is a component of the interface system. The authors designed this interface system, paying attention to its generality; that is, it is not dependent on the problem-solving system it is connected to. The previous version of MASCOTS dealt with the dialog processing only for the speech input based on the SR-plans. We extend MASCOTS to cover the speech output to the user. The revised version of MASCOTS, named MASCOTS II, makes use of topic information given by the topic packet network (TPN) which models the topic transitions in dialogs. Input and output messages are described with the concept representation based on the case structure. For the speech input, prediction of user's utterance is focused and enhanced by using the TPN. The TPN compensates for the shortages of the SR-plan and improves the accuracy of prediction as to stimulus utterances of the user. As the dialog processing in the speech output, MASCOTS II extracts emphatic words and restores missing words to the output message if necessary, e.g., in order to notify the results of speech recognition. The basic mechanisms of the SR-plan and the TPN are shared between the speech input and output processes in MASCOTS II.
How Might One Comfortably Converse with a Machine ?
Yasuhisa NIIMI

INVITED PAPER

Vol: E76-D No:1 Page(s): 9-16
Progress in speech recognition based on the hidden Markov model has made it possible to realize man-machine dialogue systems capable of operating in real time. In spite of considerable effort, however, few systems have been successfully developed because of the lack of appropriate dialogue models. This paper reports on some of the technology necessary to develop a dialogue system with which one can converse comfortably. The emphasis is placed on the following three points: how a human converses with a machine; how errors of speech recognition can be recovered through conversation; and what it means for a machine to be cooperative. We examine the first problem by investigating dialogues between human speakers, and dialogues between a human speaker and a simulated machine. As a consideration in the design of dialogue control, we discuss the relation between efficiency and cooperativeness of dialogue, the method for confirming what the machine has recognized, and dynamic adaptation of the machine. Thirdly, we review the research on the friendliness of a natural language interface, mainly concerning the exchange of initiative, corrective and suggestive answers, and indirect questions. Lastly, we describe briefly the current state of the art in speech recognition and synthesis, and suggest what should be done for acceptance of spontaneous speech and production of a voice suitable to the output of a dialogue system.
A Dialogue Processing System for Speech Response with High Adaptability to Dialogue Topics
Yasuharu ASANO Keikichi HIROSE

PAPER

Vol: E76-D No:1 Page(s): 95-105
A system is constructed for the processing of question-answer dialogue as a subsystem of the speech response device. In order to increase the adaptability to dialogue topics, rules for dialogue processing are classified into three groups: universal rules, topic-dependent rules and task-dependent rules, and example-based description is adopted for the second group. The system is designed to operate only with information on the content words of the user input. As for speech synthesis, a function is included in the system to control the focal position. Introduction and guidance of ski areas are adopted as the dialogue domain, and a prototype system is realized on a computer. The dialogue example performed with the prototype indicates the propriety of our method for dialogue processing.
System Design, Data Collection and Evaluation of a Speech Dialogue System
Katunobu ITOU Satoru HAYAMIZU Kazuyo TANAKA Hozumi TANAKA

PAPER

Vol: E76-D No:1 Page(s): 121-127
This paper describes design issues of a speech dialogue system, the evaluation of the system, and the data collection of spontaneous speech in a transportation guidance domain. As it is difficult to collect spontaneous speech and to use a real system for the collection and evaluation, the phenomena related to dialogues have not yet been quantitatively clarified. The authors constructed a speech dialogue system which operates in almost real time, with acceptable recognition accuracy and flexible dialogue control. The system was used for spontaneous speech collection in a transportation guidance domain. Evaluated in this domain, the system achieved an understanding rate of 84.2% for utterances within the predefined grammar and lexicon. Some statistics of the collected spontaneous speech are also given.
