Keyword Search Result

[Keyword] language (282 hits)

221-240 hits (of 282)

  • Content-Based Video Indexing and Retrieval -- A Natural Language Approach --

    Yeun-Bae KIM  Masahiro SHIBATA  

     
    PAPER

    Vol: E79-D No:6    Page(s): 695-705

    This paper describes methods in which natural language is used to describe video contents, knowledge of which is needed for intelligent video manipulation. The content encoded in natural language is extracted by a language analyzer in the form of subject-centered dependency structures, a language-oriented representation, and is combined incrementally into a single structure called a multi-path index tree. Content descriptors and their inter-relations are extracted from the index tree to provide high-speed retrieval and flexibility. The content-based video index is represented in a two-dimensional structure wherein the descriptors are mapped onto a component axis and temporal references (i.e., video segments aligned to the descriptors) are mapped onto a time axis. We implemented an experimental image retrieval system to illustrate that the proposed index structure 1) has superior retrieval capabilities compared to those of conventional methods, 2) can be generated by an automated procedure, and 3) has a compact and flexible structure that is easily expandable, making integration with vision processing possible.
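
    As a rough illustration of the two-dimensional index described above (a minimal sketch, not the authors' implementation), the Python fragment below maps content descriptors on a component axis to the video segments aligned with them on a time axis; all class and method names are hypothetical.

      from collections import defaultdict

      class VideoContentIndex:
          """Toy content-based video index: descriptor -> list of (start, end) segments."""
          def __init__(self):
              self._segments = defaultdict(list)   # component axis -> entries on the time axis

          def add(self, descriptor, start_sec, end_sec):
              self._segments[descriptor].append((start_sec, end_sec))

          def retrieve(self, *descriptors):
              """Return segments annotated with every given descriptor (simple intersection)."""
              if not descriptors:
                  return []
              hits = [set(self._segments[d]) for d in descriptors]
              return sorted(set.intersection(*hits))

      index = VideoContentIndex()
      index.add("goal scene", 120.0, 135.0)
      index.add("player A", 120.0, 135.0)
      print(index.retrieve("goal scene", "player A"))   # [(120.0, 135.0)]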

  • Note on Domain/Surface Tree Languages of t-PDTT's

    Katsunori YAMASAKI  

     
    PAPER-Automata, Languages and Theory of Computing

    Vol: E79-D No:6    Page(s): 829-839

    String grammars (languages) have been studied extensively since the 1960s. The transformational grammar proposed by N. Chomsky, on the other hand, contains a transformation from the set of derivation trees of a context-free language to the surface set; that grammar treats a tree as an input sentence to a transducer. Accordingly, from the second half of the 1960s, acceptors, transducers, and related devices whose inputs are trees have been studied extensively. More recently, pushdown tree automata were introduced, and their fundamental and various other properties were investigated [12],[13],[22]-[26]. Furthermore, a top-down pushdown tree transducer (t-PDTT for short), which is an extension of a top-down pushdown tree automaton (t-PDTA for short), was introduced and its fundamental properties were investigated [27]. In this paper we focus on t-PDTT's, linear t-PDTT's, t-FST's (top-down finite state transducers), and t-PDTA's. The main results discussed here are as follows: (1) the class of domain/surface tree languages of t-PDTT's properly contains the class of tree languages accepted by t-PDTA's, (2) the class of domain/surface tree languages of linear t-PDTT's coincides with the class of tree languages accepted by t-PDTA's, and (3) the class of tree languages accepted by t-PDTA's properly contains the class of surface tree languages of t-FST's.

  • Performance Evaluation of Neural Network Hardware Using Time-Shared Bus and Integer Representation Architecture

    Moritoshi YASUNAGA  Tatsuo OCHIAI  

     
    PAPER-Bio-Cybernetics and Neurocomputing

    Vol: E79-D No:6    Page(s): 888-896

    Neural network hardware using a time-shared bus and integer representation architecture has already been fabricated and reported from the design viewpoint; however, no performance evaluation of the hardware has yet been presented. Computation speed, scalability, and learning accuracy of the hardware are evaluated theoretically and experimentally using the Back Propagation (BP) algorithm. In addition, a mirror-weight assignment technique is proposed for high-speed BP computation. NETTalk, an English-pronunciation-reasoning task, was chosen as the target application for the BP. In the experiments, recently developed neuro-hardware based on the above architecture and its parallel programming language are used, and an outline of the language is described along with the BP programming. Mirror-weight assignment allows a maximum speed of 55.0 MCUPS (Million Connections Updated Per Second) using 256 neurons in the hidden layer (the numbers of neurons in the input and output layers are fixed at 203 and 26, respectively, in NETTalk). In addition, if scalability is defined as a function of the number of hidden-layer neurons, the machine retains a high scalability of 0.5 at this maximum speed. No degradation in learning accuracy occurs when the experimental results computed on the neuro-hardware are compared with those obtained with a floating-point representation architecture (a workstation). The experiment indicates that the present integer representation design of the neuro-hardware is sufficient for NETTalk. Performance has also been evaluated theoretically: assuming that most of the total execution time is taken up by bus cycles, an analytical model of computation speed and scalability is proposed. Analytical predictions agree well with the experimental results.
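
    For reference, a small sketch of how a figure such as 55.0 MCUPS relates to the NETTalk topology quoted above (203 input, 256 hidden, 26 output neurons). Whether bias connections are counted is an assumption here, so the throughput figure is illustrative only.

      def mcups(n_in, n_hidden, n_out, patterns_per_sec, count_biases=False):
          """Million Connections Updated Per Second for a three-layer BP network."""
          connections = n_in * n_hidden + n_hidden * n_out
          if count_biases:                 # assumption: biases may or may not be counted
              connections += n_hidden + n_out
          return connections * patterns_per_sec / 1e6

      # 55.0 MCUPS with a 203-256-26 network corresponds to roughly 940 training patterns per second.
      print(mcups(203, 256, 26, patterns_per_sec=938))   # ~55.0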

  • Visualization of Temporal and Spatial Information in Natural Language Descriptions

    Hiromi BABA  Tsukasa NOMA  Naoyuki OKADA  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

    Vol: E79-D No:5    Page(s): 591-599

    This paper discusses visualization of temporal and spatial information in natural language descriptions (NLDs), focusing on the translation of intermediate representations of NLDs into proper "scenarios" and "environments" for animations. First, the intermediate representations are defined according to the idea of actors: actors and non-actors are represented as object primitives, whereas actions are represented as event primitives, and the temporal and spatial constraints given by an NLD text are imposed upon these primitives. The representations containing unknown temporal or spatial parameters (time and coordinates) are then translated into evaluation functions that estimate the unlikelihood of deviations from the predicted temporal or spatial relations; in particular, the functions concerning an actor's movements contain both temporal and spatial parameters. Next, the sum of all the evaluation functions is minimized by a nonlinear optimization method. The most proper actors' time-table, or scenario, and non-actors' location-table, or environment, for visualization are thus obtained. Implementation and experiments show that both temporal and spatial information in NLDs are well connected through actors' movements for visualization.
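
    A hedged sketch of the optimization step described above: unknown times and coordinates are collected into a parameter vector, each constraint contributes a penalty for deviating from its predicted relation, and the summed penalty is minimized with an off-the-shelf nonlinear optimizer. The penalty forms and variable layout are assumptions made for illustration.

      import numpy as np
      from scipy.optimize import minimize

      # x = [t_arrive, x_pos, y_pos]: unknown time and coordinates of one actor's movement
      def total_unlikelihood(x):
          t, px, py = x
          penalties = [
              (t - 5.0) ** 2,                              # "arrives at about t = 5"
              max(0.0, 3.0 - px) ** 2,                     # "stays to the right of x = 3"
              (np.hypot(px - 4.0, py - 2.0) - 1.0) ** 2,   # "about 1 unit from the table at (4, 2)"
          ]
          return sum(penalties)

      result = minimize(total_unlikelihood, x0=np.zeros(3), method="Nelder-Mead")
      print(result.x)   # most plausible time and position under the stated constraints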

  • Eliminating Unnecessary Items from the One-Pass Evaluation of Attribute Grammars

    Yoshimichi WATANABE  Takehiro TOKUDA  

     
    PAPER-Software Theory

    Vol: E79-D No:4    Page(s): 312-320

    We present two efficient attribute-evaluator construction methods for a wide subclass of L-attributed grammars, based on the enumeration of attributed items during one-pass bottom-up parsing. We have already proposed a construction method of a parser/evaluator for this subclass of L-attributed grammars; however, the evaluator produced by our previous method uses a great number of attributed items to evaluate all attributes of a given input string. In this paper we propose two generalized methods to reduce the number of attributed items used in attribute evaluation. Our methods allow us to evaluate all attributes by taking advantage of the available lookahead information.

  • Succeeding Word Prediction for Speech Recognition Based on Stochastic Language Model

    Min ZHOU  Seiichi NAKAGAWA  

     
    PAPER-Speech Processing and Acoustics

    Vol: E79-D No:4    Page(s): 333-342

    For automatic speech recognition, language models (LMs) are used to predict possible succeeding words for a given partial word sequence and thereby to reduce the search space. In this paper several kinds of stochastic language models (SLMs) are evaluated: bigram, trigram, hidden Markov model (HMM), bigram-HMM, stochastic context-free grammar (SCFG), and a hand-written Bunsetsu grammar. To compare the predictive power of these SLMs, the evaluation was conducted from two points of view: (1) the relationship between the number of model parameters and entropy, and (2) the predictive rate for the succeeding part of speech (POS) and the succeeding word. We propose a new type of bigram-HMM and compare it with the other models; two kinds of approximations are tried and examined through experiments. Results on both the English Brown Corpus and the Japanese ATR dialog database showed that the extended bigram-HMM performed better than the other models and was more suitable as a language model.
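
    A minimal sketch of the simplest of the compared models, a bigram, estimated by maximum likelihood with add-one smoothing and evaluated by per-word entropy. The smoothing choice and the toy corpus are assumptions, not the paper's setup.

      import math
      from collections import Counter

      corpus = [["<s>", "i", "see", "a", "cat", "</s>"],
                ["<s>", "i", "see", "a", "dog", "</s>"]]

      vocab = {w for sent in corpus for w in sent}
      unigrams = Counter(w for sent in corpus for w in sent[:-1])
      bigrams = Counter((a, b) for sent in corpus for a, b in zip(sent, sent[1:]))

      def p(b, a):                         # P(b | a), add-one smoothed
          return (bigrams[(a, b)] + 1) / (unigrams[a] + len(vocab))

      def entropy(sent):                   # bits per predicted word
          logp = sum(math.log2(p(b, a)) for a, b in zip(sent, sent[1:]))
          return -logp / (len(sent) - 1)

      print(entropy(["<s>", "i", "see", "a", "dog", "</s>"]))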

  • Dyck Reductions of Minimal Linear Languages Yield the Full Class of Recursively Enumerable Languages

    Sadaki HIROSE  Satoshi OKAWA  

     
    LETTER-Automata, Languages and Theory of Computing

    Vol: E79-D No:2    Page(s): 161-164

    In this paper, we give a direct proof of the result of Latteux and Turakainen that the full class of recursively enumerable languages can be obtained from minimal linear languages (which are generated by linear context-free grammars with only one nonterminal symbol) by Dyck reductions (which reduce pairs of parentheses to the empty word).
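
    As an illustration of the Dyck reduction itself (not of the authors' construction), the sketch below repeatedly erases adjacent matched parenthesis pairs with a stack, reducing a string over paired symbols to its irreducible residue; a string reduces to the empty word exactly when its pairs are fully nested and balanced.

      PAIRS = {")": "(", "]": "["}         # closing symbol -> matching opening symbol

      def dyck_reduce(s):
          """Erase adjacent matched pairs until none remain; '' means full reduction."""
          stack = []
          for c in s:
              if stack and PAIRS.get(c) == stack[-1]:
                  stack.pop()              # the adjacent pair reduces to the empty word
              else:
                  stack.append(c)
          return "".join(stack)

      print(dyck_reduce("([])[]"))   # ''    : reduces to the empty word
      print(dyck_reduce("([)]"))     # '([)]': crossing pairs do not reduce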

  • Continuous Speech Recognition Using a Combination of Syntactic Constraints and Dependency Relationships

    Tsuyoshi MORIMOTO  

     
    PAPER-Speech Processing and Acoustics

    Vol: E79-D No:1    Page(s): 54-62

    This paper proposes a Japanese continuous speech recognition mechanism in which a full-sentence-level context-free grammar (CFG) and one kind of semantic constraint, "dependency relationships between two bunsetsu (a kind of phrase) in Japanese," are used during speech recognition in an integrated way. Each dependency relationship is a modification relationship between two bunsetsu; these relationships include the case-frame relationship of a noun bunsetsu to a predicate bunsetsu and adnominal modification relationships such as that of a noun bunsetsu to a noun bunsetsu. To suppress the processing overhead caused by using relationships of this type during speech recognition, no rigorous semantic analysis is performed; instead, a simple "matching with examples" approach is adopted. An experiment was carried out and the results were compared with a case employing only CFG constraints. They show that the speech recognition accuracy is improved and that the overhead is small enough.
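
    A rough sketch of the "matching with examples" idea (hypothetical data and similarity test, not the paper's database): a candidate modifier-modified bunsetsu pair is accepted if an equivalent example dependency is found, avoiding full semantic analysis.

      # Example database entries: (modifier head word, particle, modified head word)
      EXAMPLES = {
          ("hoteru", "wo", "yoyaku-suru"),    # "reserve a hotel"
          ("heya", "wo", "yoyaku-suru"),      # "reserve a room"
      }

      def dependency_plausible(modifier, particle, modified):
          """Accept a candidate dependency if an identical example exists.
          A fuller system would also accept near matches via a thesaurus."""
          return (modifier, particle, modified) in EXAMPLES

      print(dependency_plausible("hoteru", "wo", "yoyaku-suru"))   # True
      print(dependency_plausible("hoteru", "ga", "taberu"))        # False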

  • Conformance Test of a Logic Synthesis System to the Standard HDL UDL/I

    Satoshi YOKOTA  Hiroyuki KANBARA  

     
    PAPER

    Vol: E78-A No:12    Page(s): 1742-1748

    This paper presents testing methods for a logic synthesis system which supports the standard HDL UDL/I, focusing on conformance testing against the language specification. Conformance testing, which proves that the system completely satisfies the language specification, is very important to provide a unified design environment for users of CAD tools supporting the language. The basic idea of our testing methods is to use a logic simulator, owing to the limited schedule for test execution. We classified the tests into two kinds: unit tests and an integration test. A unit test checks each individual functionality of the system, and the integration test proves that the whole system works correctly and satisfies the language specification. We prepared and used various kinds of test data; one of them is the UDL/I Test Suite, which was also used to observe the progress of language coverage by the system during test execution.

  • Validation of UDL/I Test Suites and UDL/I Simulation/Synthesis Environment

    Hiroyuki KANBARA  Satoshi YOKOTA  

     
    PAPER

    Vol: E78-A No:12    Page(s): 1749-1754

    The UDL/I test suites and the UDL/I Simulation/Synthesis Environment were developed separately and in parallel; both were designed from the syntax and semantics definitions of the UDL/I Language Reference Manual. Through testing of the UDL/I Simulation/Synthesis Environment with the UDL/I test suites, the quality of both the test suites and the environment was improved. Finally, all the testing results matched the expected ones, validating that both the test suites and the environment follow the UDL/I language specification.

  • Co-database Approach to Database Interoperability

    Athman BOUGUETTAYA  Stephen MILLINER  

     
    PAPER-Interoperability

    Vol: E78-D No:11    Page(s): 1388-1395

    Research on heterogeneous and autonomous databases has evolved slowly compared to other areas, in part because bridging data semantics has proved difficult: sharing data among disparate databases has mostly been achieved through some form of manual schema integration. The complexity of making autonomous heterogeneous databases interoperate smoothly depends on two major issues: what adequate levels of autonomy the databases are guaranteed to keep, and what overhead cost is required to bridge database heterogeneity. The complexity of these two issues is closely tied to how scalable multidatabase systems are. In this paper we introduce the FINDIT architecture, which uses information meta-types to provide a basis for such an organization and, consequently, a platform for interoperability. A distinction is made between the information and inter-node relationship spaces to achieve scalability, and Tassili language primitives are used for the incremental building of dynamic inter-node relationships based upon usage considerations.

  • A Requirement Description Approach in Natural Language Based on Communication Service Knowledge

    Yoshizumi KOBAYASHI  Tadashi OHTA  Nobuyoshi TERASHIMA  

     
    PAPER-Applications

    Vol: E78-D No:9    Page(s): 1156-1163

    This paper proposes a requirement description and elicitation approach for communication services. Requirements are described in natural language, refined with a knowledge base, and converted to a formal language for program generation. A model for communication services is defined as a set of three items: terminal state, terminal action, and the response of the communication system to the action. This set, in turn, corresponds to a natural language syntax that expresses two conditions (terminal state and action) and their result. These conditions and result are expressed as a sequence of simple sentences describing the relationship between a terminal and the communication system. By defining such a description style to reflect the features of communication services, it becomes possible to achieve both a high level of description and mechanical processing capability at the same time. However, requirement descriptions usually include omissions and inconsistencies, a problem that cannot be solved merely by introducing natural language for the descriptions; knowledge about the target domain of the requirements is needed to resolve it. This paper reports on a knowledge base that stores constraints between conditions and results in communication services, and shows that it is effective in filling in omissions and resolving inconsistencies. This paper also presents a technique for converting the elicited requirements in natural language to descriptions in a formal language that can be used to generate a program.
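
    A small sketch of the three-item service model described above: each requirement pairs two conditions (terminal state and terminal action) with the system's response, which maps naturally onto a rule in a formal notation. The rule syntax below is invented for illustration.

      from dataclasses import dataclass

      @dataclass
      class Requirement:
          terminal_state: str     # condition 1, e.g. "terminal A is idle"
          terminal_action: str    # condition 2, e.g. "terminal A dials terminal B"
          response: str           # result,      e.g. "the system rings terminal B"

          def to_rule(self):
              """Render the requirement as a toy formal-language rule (hypothetical syntax)."""
              return f"IF {self.terminal_state} AND {self.terminal_action} THEN {self.response}"

      r = Requirement("idle(A)", "dial(A, B)", "ring(B)")
      print(r.to_rule())   # IF idle(A) AND dial(A, B) THEN ring(B)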

  • A Polynomial-Time Algorithm for Checking the Inclusion for Real-Time Deterministic Restricted One-Counter Automata Which Accept by Final State

    Ken HIGUCHI  Mitsuo WAKATSUKI  Etsuji TOMITA  

     
    PAPER-Automata, Languages and Theory of Computing

    Vol: E78-D No:8    Page(s): 939-950

    A deterministic pushdown automaton (dpda) having just one stack symbol is called a deterministic restricted one-counter automaton (droca). A deterministic one-counter automaton (doca) is a dpda having only one stack symbol, with the exception of a bottom-of-stack marker. The class of languages accepted by droca's which accept by final state is a proper subclass of the class of languages accepted by doca's. Valiant has proved the decidability of the equivalence problem for doca's and the undecidability of the inclusion problem for doca's; hence the decidability of the equivalence problem for droca's is obvious. In this paper, we evaluate an upper bound on the length of the shortest input string (witness) that disproves the inclusion for a pair of real-time droca's which accept by final state, and present a new direct branching algorithm for checking the inclusion for a pair of languages accepted by these droca's. We then show that the worst-case time complexity of our algorithm is polynomial in the size of these droca's.
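
    To make the machine model concrete, below is a toy simulator for a real-time droca accepting by final state: a single stack symbol means the stack is just a non-negative counter, and determinism means at most one move per (state, input symbol, counter-empty?) situation. The transition encoding is an illustrative assumption, not part of the paper's algorithm.

      class RealTimeDroca:
          """delta maps (state, symbol, counter_is_zero) -> (next_state, counter_delta)."""
          def __init__(self, delta, start, finals):
              self.delta, self.start, self.finals = delta, start, finals

          def accepts(self, word):
              state, counter = self.start, 0
              for a in word:
                  move = self.delta.get((state, a, counter == 0))
                  if move is None:
                      return False
                  state, d = move
                  counter += d
              return state in self.finals

      # Accepts {a^n b^n : n >= 1} by final state.
      delta = {("p", "a", True):  ("r", 0),    # first 'a' is remembered in the state
               ("r", "a", True):  ("r", +1),   # later a's are counted
               ("r", "a", False): ("r", +1),
               ("r", "b", True):  ("f", 0),    # the n = 1 case
               ("r", "b", False): ("s", -1),
               ("s", "b", False): ("s", -1),
               ("s", "b", True):  ("f", 0)}    # last 'b' reaches the final state
      m = RealTimeDroca(delta, start="p", finals={"f"})
      print(m.accepts("aaabbb"), m.accepts("aaabb"))   # True False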

  • Automatic Language Identification Using Sequential Information of Phonemes

    Takayuki ARAI  

     
    PAPER

    Vol: E78-D No:6    Page(s): 705-711

    In this paper, approaches to language identification based on the sequential information of phonemes are described. These approaches assume that each language can be identified from its own phoneme structure, or phonotactics. To extract this phoneme structure, we use phoneme classifiers and grammars for each language. The phoneme classifier for each language is implemented as a multi-layer perceptron trained on quasi-phonetic hand-labeled transcriptions. After training the phoneme classifiers, the grammars for each language are calculated as a set of transition probabilities for each phoneme pair. Because of the interest in automatic language identification for worldwide voice communication, we decided to use telephone speech for this study. The data were drawn from the OGI (Oregon Graduate Institute)-TS (telephone speech) corpus, a standard corpus for this type of research. To investigate the basic issues of this approach, two languages, Japanese and English, were selected. The language classification algorithms are based on Viterbi search constrained by a bigram grammar and by minimum and maximum durations. Using a phoneme classifier trained only on English phonemes, we achieved 81.1% accuracy, and 79.3% accuracy using a phoneme classifier trained on Japanese phonemes. Using both the English and the Japanese phoneme classifiers together, we obtained our best result: 83.3%. Our results were comparable to those obtained by other methods, such as those based on hidden Markov models.
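
    A condensed sketch of the bigram-scoring idea: each language is modelled by phoneme-pair transition probabilities, a phoneme sequence is scored under each model, and the language with the higher log-probability wins. The tiny probability tables are invented; the paper's system additionally applies Viterbi search with duration constraints to phoneme-classifier outputs.

      import math

      # P(next phoneme | previous phoneme) per language -- toy numbers, not from the OGI-TS corpus
      BIGRAMS = {
          "japanese": {("k", "a"): 0.30, ("a", "k"): 0.20, ("a", "i"): 0.10},
          "english":  {("k", "a"): 0.05, ("a", "k"): 0.02, ("a", "i"): 0.15},
      }
      FLOOR = 1e-4                         # back-off probability for unseen phoneme pairs

      def log_score(phonemes, lang):
          table = BIGRAMS[lang]
          return sum(math.log(table.get(pair, FLOOR))
                     for pair in zip(phonemes, phonemes[1:]))

      def identify(phonemes):
          return max(BIGRAMS, key=lambda lang: log_score(phonemes, lang))

      print(identify(["k", "a", "k", "a"]))   # 'japanese' under these toy tables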

  • Speech Recognition Using Function-Word N-Grams and Content-Word N-Grams

    Ryosuke ISOTANI  Shoichi MATSUNAGA  Shigeki SAGAYAMA  

     
    PAPER

    Vol: E78-D No:6    Page(s): 692-697

    This paper proposes a new stochastic language model for speech recognition based on function-word N-grams and content-word N-grams. Conventional word N-gram models are effective for speech recognition, but they represent only local constraints within a few successive words and lack the ability to capture global syntactic or semantic relationships between words. To represent more global constraints, the proposed language model gives the N-gram probabilities of word sequences, with attention given only to function words or only to content words. The sequences of function words and of content words are expected to represent syntactic and semantic constraints, respectively. Probabilities of function-word bigrams and content-word bigrams were estimated from a 10,000-sentence text database, and analysis using an information-theoretic measure showed that the expected constraints were extracted appropriately. As an application of this model to speech recognition, a post-processor was constructed to select the optimum sentence candidate from a phrase lattice obtained by a phrase recognition system. The phrase candidate sequence with the highest total acoustic and linguistic score was sought by dynamic programming. The results of experiments carried out on the utterances of 12 speakers showed that the proposed method is more accurate than a CFG-based method, demonstrating its effectiveness in improving speech recognition performance.
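
    A rough sketch of separating the function-word and content-word streams and collecting bigrams on each, so that syntactic constraints (function words) and semantic constraints (content words) are modelled independently. The word list, corpus, and lack of smoothing are illustrative assumptions.

      from collections import Counter

      FUNCTION_WORDS = {"wa", "ga", "wo", "ni", "de", "desu"}   # toy list

      def split_streams(sentence):
          func = [w for w in sentence if w in FUNCTION_WORDS]
          cont = [w for w in sentence if w not in FUNCTION_WORDS]
          return func, cont

      def bigram_counts(sentences, stream_index):
          counts = Counter()
          for s in sentences:
              stream = split_streams(s)[stream_index]
              counts.update(zip(stream, stream[1:]))
          return counts

      training = [["watashi", "wa", "hoteru", "wo", "yoyaku", "desu"]]
      func_bigrams = bigram_counts(training, 0)   # e.g. ('wa', 'wo'), ('wo', 'desu')
      cont_bigrams = bigram_counts(training, 1)   # e.g. ('watashi', 'hoteru')
      print(func_bigrams, cont_bigrams)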

  • Cooperative Spoken Dialogue Model Using Bayesian Network and Event Hierarchy

    Masahiro ARAKI  Shuji DOSHITA  

     
    PAPER

    Vol: E78-D No:6    Page(s): 629-635

    In this paper, we propose a dialogue model that reflects two important aspects of a spoken dialogue system: being 'robust' and being 'cooperative'. For this purpose, our model has two main inference spaces: the Conversational Space (CS) and the Problem Solving Space (PSS). The CS is a kind of dynamic Bayesian network that represents the meaning of an utterance and general dialogue rules; the 'robust' aspect is treated in the CS. The PSS is a network called an Event Hierarchy that represents the structure of task-domain problems; the 'cooperative' aspect is mainly treated in the PSS. In constructing the CS and making inferences on the PSS, the system's process, from meaning understanding through response generation, is modeled in five steps: (1) meaning understanding, (2) intention understanding, (3) communicative effect, (4) reaction generation, and (5) response generation. The meaning understanding step constructs the CS, and the response generation step composes a surface expression of the system's response from part of the CS. The intention understanding step puts utterance types in the CS into correspondence with actions in the PSS. The reaction generation step selects a cooperative reaction in the PSS and expands it into an utterance type in the CS. The status of problem solving and the user's declared preferences are recorded in the mental state by the communicative effect step. From this point of view, a cooperative problem-solving dialogue is regarded as a process of constructing the CS and achieving goals in the PSS through these five steps.

  • A Polynomial-Time Algorithm for Checking the Inclusion for Strict Deterministic Restricted One-Counter Automata

    Ken HIGUCHI  Etsuji TOMITA  Mitsuo WAKATSUKI  

     
    PAPER-Automata, Languages and Theory of Computing

    Vol: E78-D No:4    Page(s): 305-313

    A deterministic pushdown automaton (dpda) having just one stack symbol is called a deterministic restricted one-counter automaton (droca); when it accepts by empty stack, it is called strict. A deterministic one-counter automaton (doca) is a dpda having only one stack symbol, with the exception of a bottom-of-stack marker. The class of languages accepted by strict droca's is a subclass of the class of languages accepted by doca's. Valiant has proved the decidability of the equivalence problem for doca's and the undecidability of the inclusion problem for doca's; hence the decidability of the equivalence problem for strict droca's is obvious. In this paper, we present a new direct branching algorithm for checking the inclusion for a pair of languages accepted by strict droca's. We then show that the worst-case time complexity of our algorithm is polynomial with respect to the size of these automata.

  • Test Synthesis from Behavioral Description Based on Data Transfer Analysis

    Mitsuteru YUKISHITA  Kiyoshi OGURI  Tsukasa KAWAOKA  

     
    LETTER

    Vol: E78-D No:3    Page(s): 248-251

    We developed a new test-synthesis method that operates at the language level, based on data transfer analysis. Using this method, an efficient scan path is inserted so that test data for a sequential circuit can be generated using only a test generation tool for combinational circuits. We have applied this method successfully to the behavior, logic, and test design of a 32-bit RISC-type processor. The size of the synthesized circuit without test synthesis is 23,407 gates; with test synthesis it is 24,811 gates, an increase of only a little over 6%.

  • An Extended Centering Mechanism for Interpreting Pronouns and Zero-Pronouns

    Shingo TAKADA  Norihisa DOI  

     
    PAPER-Artificial Intelligence and Cognitive Science

    Vol: E78-D No:1    Page(s): 58-67

    Zero-pronouns and overt pronouns occur frequently in Japanese text, and their antecedents must be identified to properly 'understand' a piece of discourse. The notion of "centering" has been used to help interpret intersentential anaphors; it is based on the premise that, in a piece of discourse, some members receive more attention than others. In Japanese, the zero-pronoun is said to receive the greatest amount of attention, but when there is more than one zero-pronoun in a sentence, only one of them can be accounted for by centering, while overt pronouns and the other zero-pronouns may as well have appeared as 'ordinary' noun phrases. In this paper, the notion of centering is extended so that these can also be interpreted. Basically, zero-pronouns and overt pronouns are treated as being more "centered" in the discourse than other 'ordinary' noun phrases; they are put in an ordered list called the Center List, while any other noun phrases appearing in a sentence are put in another list called the Possible Center List. Noun phrases within both lists are ordered according to their degree of salience. To see the effect of our approach, it was implemented in a simple system with minimal constraints and evaluated. The results showed that when the antecedent is in either the Center List or the Possible Center List, 80% of all zero-pronouns and overt pronouns were properly interpreted.
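
    A schematic sketch of the two-list bookkeeping described above: pronouns and zero-pronouns go on the Center List, other noun phrases on the Possible Center List, each ordered by salience, and an anaphor is resolved by searching the Center List first. Salience ordering and the compatibility check are reduced to stubs here.

      class CenteringState:
          def __init__(self):
              self.center_list = []            # pronouns / zero-pronouns, most salient first
              self.possible_center_list = []   # 'ordinary' noun phrases, most salient first

          def update(self, pronouns, noun_phrases):
              """Record the mentions of a processed sentence (already salience-ordered)."""
              self.center_list = list(pronouns)
              self.possible_center_list = list(noun_phrases)

          def resolve(self, anaphor, compatible):
              """Return the most salient compatible antecedent, searching the Center List first."""
              for candidate in self.center_list + self.possible_center_list:
                  if compatible(anaphor, candidate):
                      return candidate
              return None

      state = CenteringState()
      state.update(pronouns=["kare"], noun_phrases=["Taro", "hon"])
      print(state.resolve("ZERO-subject", lambda a, c: c in {"kare", "Taro"}))   # 'kare'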

  • A Segmentation Method for Sign Language Recognition

    Eiji OHIRA  Hirohiko SAGAWA  Tomoko SAKIYAMA  Masaru OHKI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

    Vol: E78-D No:1    Page(s): 49-57

    This paper discusses sign-word segmentation methods and the extraction of motion features for sign language recognition. Because Japanese sign language grammar has not yet been systematized and because sign language does not have prepositions, it is more difficult to use grammatical and semantic information in sign language recognition than in speech recognition. Segmentation significantly improves recognition efficiency, so we propose a method of dividing sign language based on rests and on the envelope and minima of motion speed. The sign unit corresponding to a sign word is detected from the division points using features such as changes of hand shape. Experiments confirmed the validity of word segmentation of sign language based on the temporal structure of motion.
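
    A small sketch of segmentation by motion speed: hand positions are differenced to obtain a speed curve, and frames where the speed reaches a local minimum below a rest threshold are taken as candidate boundaries between sign words. The threshold and sampling rate are assumptions; the paper's method also uses the speed envelope and hand-shape changes.

      import numpy as np

      def segment_boundaries(positions, fps=30.0, rest_speed=0.05):
          """positions: (T, 3) hand trajectory. Return frame indices of candidate boundaries."""
          speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) * fps
          boundaries = []
          for t in range(1, len(speed) - 1):
              local_min = speed[t] <= speed[t - 1] and speed[t] <= speed[t + 1]
              if local_min and speed[t] < rest_speed:
                  boundaries.append(t)
          return boundaries

      # toy trajectory: move, pause, move again -> boundary candidates fall inside the pause
      traj = np.concatenate([np.linspace(0, 1, 15)[:, None].repeat(3, axis=1),
                             np.ones((10, 3)),
                             np.linspace(1, 2, 15)[:, None].repeat(3, axis=1)])
      print(segment_boundaries(traj))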
