
Keyword Search Result

[Keyword] language(282hit)

141-160hit(282hit)

  • Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions

    Yuya AKITA  Tatsuya KAWAHARA  

     
    PAPER-Spoken Language Systems

    Vol: E88-D No:3  Page(s): 439-445

    Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. This approach is applied for automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking style. A baseline language model is a mixture of two models, which are trained with different corpora covering various topics and speakers, respectively. Then, probabilistic latent semantic analysis (PLSA) is performed on the same respective corpora and the initial ASR result to provide two sets of unigram probabilities conditioned on input speech, with regard to topics and speaker characteristics, respectively. Finally, the baseline model is adapted by scaling N-gram probabilities with these unigram probabilities. For speaker adaptation purpose, we make use of a portion of the Corpus of Spontaneous Japanese (CSJ) in which a large number of speakers gave talks for given topics. Experimental evaluation with real discussions showed that both topic and speaker adaptation reduced test-set perplexity, and in total, an average reduction rate of 8.5% was obtained. Furthermore, improvement on word accuracy was also achieved by the proposed adaptation method.
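
    The scaling step in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common unigram-rescaling form, in which each N-gram probability is multiplied by the ratio of an adapted unigram probability to a background unigram probability and then renormalized. The function name and the toy numbers are hypothetical.

```python
def rescale_ngram(base_probs, adapted_unigram, background_unigram):
    """Rescale P(w | h) for one fixed history h, then renormalize.

    Assumed form (not taken from the paper):
        P'(w | h) is proportional to P(w | h) * P_adapted(w) / P_background(w)
    """
    scaled = {
        w: p * adapted_unigram[w] / background_unigram[w]
        for w, p in base_probs.items()
    }
    total = sum(scaled.values())
    return {w: s / total for w, s in scaled.items()}


# Toy distributions over a three-word vocabulary for a single history.
base = {"budget": 0.2, "meeting": 0.3, "weather": 0.5}
background = {"budget": 0.1, "meeting": 0.2, "weather": 0.7}
adapted = {"budget": 0.3, "meeting": 0.4, "weather": 0.3}

result = rescale_ngram(base, adapted, background)
```

    Words whose adapted unigram probability exceeds the background value gain probability mass, and the others lose it; the renormalization keeps the result a proper distribution for the given history.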

  • A Probabilistic Sentence Reduction Using Maximum Entropy Model

    Minh LE NGUYEN  Masaru FUKUSHI  Susumu HORIGUCHI  

     
    PAPER-Natural Language Processing

    Vol: E88-D No:2  Page(s): 278-288

    This paper describes a new probabilistic sentence reduction method using a maximum entropy model. In contrast to previous methods, the proposed method can produce multiple best results for a given sentence, which is useful in text summarization applications. Experimental results show that the proposed method improves on earlier methods in both accuracy and computation time.

  • Real-Time Recognition of Cyclic Strings by One-Way and Two-Way Cellular Automata

    Katsuhiko NAKAMURA  

     
    PAPER

    Vol: E88-D No:1  Page(s): 65-71

    This paper discusses real-time language recognition by 1-dimensional one-way cellular automata (OCAs) and two-way cellular automata (CAs), focusing on limitations of the parallel computation power. To clarify the limitations, we investigate real-time recognition of cyclic strings of the form $u^k$ with $u \in \{0,1\}^+$ and $k \geq 2$. We show a version of pumping lemma for recognizing cyclic strings by OCAs, which can be used for proving that several languages are not recognizable by OCAs in real time. The paper also discusses the real-time language recognition of CAs by prefix and postfix computation, in which every prefix or postfix of an input string is also accepted, if the prefix or postfix is in the language. It is shown that there are languages $L \subseteq \Sigma^+$ such that $L$ is not recognizable by OCA in real-time and the reversal of $L$ and the concatenation $L\Sigma^*$ are recognizable by CA in real-time.
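
    To make the notion of a cyclic string concrete, the following sketch tests whether a binary string is a repetition of some nonempty block at least twice. It is an ordinary sequential check using the well-known doubled-string trick, not a cellular-automaton construction, and the function name is hypothetical.

```python
def is_cyclic(s: str) -> bool:
    """Return True if s is a nonempty block over {0,1} repeated two or more times."""
    if not s or any(c not in "01" for c in s):
        return False
    # s is a proper repetition exactly when it reappears inside s + s
    # at a position other than the two trivial ones (start and middle).
    return s in (s + s)[1:-1]


examples = {w: is_cyclic(w) for w in ["0101", "011011", "00", "0", "010", "0110"]}
```

    Here "0101", "011011", and "00" are accepted, while "0", "010", and "0110" are rejected.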

  • A Probabilistic Feature-Based Parsing Model for Head-Final Languages

    So-Young PARK  Yong-Jae KWAK  Joon-Ho LIM  Hae-Chang RIM  

     
    LETTER-Natural Language Processing

    Vol: E87-D No:12  Page(s): 2893-2897

    In this paper, we propose a probabilistic feature-based parsing model for head-final languages, which can lead to an improvement of syntactic disambiguation while reducing the parsing cost related to lexical information. For effective syntactic disambiguation, the proposed parsing model utilizes several useful features such as a syntactic label feature, a content feature, a functional feature, and a size feature. Moreover, it is designed to be suitable for representing word order variation of non-head words in head-final languages. Experimental results show that the proposed parsing model performs better than previous lexicalized parsing models, although it has much less dependence on lexical information.

  • Phonology and Morphology Modeling in a Very Large Vocabulary Hungarian Dictation System

    Mate SZARVAS  Sadaoki FURUI  

     
    PAPER-Speech and Hearing

    Vol: E87-D No:12  Page(s): 2791-2801

    This article introduces a novel approach to model phonology and morphosyntax in morpheme unit-based speech recognizers. The proposed methods are evaluated on a Hungarian newspaper dictation task that requires modeling over 1 million different word forms. The architecture of the recognition system is based on the weighted finite-state transducer (WFST) paradigm. The vocabulary units used in the system are morpheme-based in order to provide sufficient coverage of the large number of word-forms resulting from affixation and compounding. Besides the basic pronunciation model and the morpheme N-gram language model we evaluate a novel phonology model and the novel stochastic morphosyntactic language model (SMLM). Thanks to the flexible transducer-based architecture of the system, these new components are integrated seamlessly with the basic modules with no need to modify the decoder itself. We compare the phoneme, morpheme, and word error-rates as well as the sizes of the recognition networks in two configurations. In one configuration we use only the N-gram model while in the other we use the combined model. The proposed stochastic morphosyntactic language model decreases the morpheme error rate by between 1.7 and 7.2% relatively when compared to the baseline trigram system. The proposed phonology model reduced the error rate by 8.32%. The morpheme error-rate of the best configuration is 18% and the best word error-rate is 22.3%.

  • Complexity Metrics for Software Architectures

    Jianjun ZHAO  

     
    LETTER-Software Engineering

    Vol: E87-D No:8  Page(s): 2152-2156

    A large body of research in the measurement of software complexity at code level has been conducted, but little effort has been made to measure the architectural-level complexity of a software system. In this paper, we propose some architectural-level metrics which are appropriate for evaluating the architectural attributes of a software system. The main feature of our approach is to assess the architectural-level complexity of a software system by analyzing its formal architectural specification, and therefore the process of metric computation can be automated completely.

  • Dialogue Languages and Persons with Disabilities

    Akira ICHIKAWA  

     
    INVITED PAPER

    Vol: E87-D No:6  Page(s): 1312-1319

    Any utterances of dialogue, spoken language or sign language, have functions that enable recipients to achieve real-time and easy understanding and to control conversation smoothly in spite of its volatile characteristics. In this paper, we present evidence of these functions obtained experimentally. Prosody plays a very important role not only in spoken language (aural language) but also in sign language (visual language) and finger braille (tactile language). Skilled users of a language may detect word boundaries in utterances and estimate sentence structure immediately using prosody. The gestures and glances of a recipient may influence the utterances of the sender, leading to amendments of the contents of utterances and smooth exchanges in turn. Individuality and emotion in utterances are also very important aspects of effective communication support systems for persons with disabilities even more so than for those non-disabled persons. The trials described herein are universal in design. Some trials carried out to develop these systems are also reported.

  • "Man-Computer Symbiosis" Revisited: Achieving Natural Communication and Collaboration with Computers

    Neal LESH  Joe MARKS  Charles RICH  Candace L. SIDNER  

     
    INVITED PAPER

    Vol: E87-D No:6  Page(s): 1290-1298

    In 1960, the famous computer pioneer J.C.R. Licklider described a vision for human-computer interaction that he called "man-computer symbiosis." Licklider predicted the development of computer software that would allow people "to think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own." More than 40 years later, one rarely encounters any computer application that comes close to capturing Licklider's notion of human-like communication and collaboration. We echo Licklider by arguing that true symbiotic interaction requires at least the following three elements: a complementary and effective division of labor between human and machine; an explicit representation in the computer of the user's abilities, intentions, and beliefs; and the utilization of nonverbal communication modalities. We illustrate this argument with various research prototypes currently under development at Mitsubishi Electric Research Laboratories (USA).

  • Robotic Hand System for Non-verbal Communication

    Kiyoshi HOSHINO  Ichiro KAWABUCHI  

     
    PAPER

    Vol: E87-D No:6  Page(s): 1347-1353

    The purpose of this study is to design a humanoid robotic hand system that is capable of conveying feelings and sensitivities by finger movement for the non-verbal communication between men and robots in the near future. In this paper, studies have been made in four steps. First, a small-sized and light-weight robotic hand was developed to be used as the humanoid according to the concept of extracting required minimum motor functions and implementing them to the robot. Second, basic characteristics of the movement were checked by experiments, simple feedforward control mechanism was designed based on velocity control, and a system capable of tracking joint time-series change command with arbitrary pattern input was realized. Third, tracking performances with regard to sinusoidal input with different frequencies were studied for evaluation of the system thus realized, and space- and time-related accuracy were investigated. Fourth, the sign language motions were generated as examples of information transmission by finger movement. A series of results thus obtained indicated that this robotic hand is capable of transmitting information promptly with comparatively high accuracy through the movement.

  • Recognition of Continuous Korean Sign Language Using Gesture Tension Model and Soft Computing Technique

    Jung-Bae KIM  Zeungnam BIEN  

     
    LETTER-Human-computer Interaction

    Vol: E87-D No:5  Page(s): 1265-1270

    We present a method for recognition of continuous Korean Sign Language (KSL). In the paper, we consider the segmentation problem of a continuous hand motion pattern in KSL. For this, we first extract sign sentences by removing linking gestures between sign sentences. We use a gesture tension model and fuzzy partitioning. Then, each sign sentence is disassembled into a set of elementary motions (EMs) according to its geometric pattern. The hidden Markov model is adopted to classify the segmented individual EMs.

  • A Study on Acoustic Modeling for Speech Recognition of Predominantly Monosyllabic Languages

    Ekkarit MANEENOI  Visarut AHKUPUTRA  Sudaporn LUKSANEEYANAWIN  Somchai JITAPUNKUL  

     
    PAPER

    Vol: E87-D No:5  Page(s): 1146-1163

    This paper presents a study on acoustic modeling for speech recognition of predominantly monosyllabic languages. Various speech units used in speech recognition systems have been investigated. To evaluate the effectiveness of these acoustic models, the Thai language is selected, since it is a predominantly monosyllabic language and has a complex vowel system. Several experiments have been carried out to find the proper speech unit that yields an accurate acoustic model and a higher recognition rate. Recognition rates under different acoustic models are given and compared. In addition, this paper proposes a new speech unit for speech recognition, namely the onset-rhyme unit. Two models are proposed: the Phonotactic Onset-Rhyme Model (PORM) and the Contextual Onset-Rhyme Model (CORM). The models comprise a pair of onset and rhyme units, which make up a syllable. An onset comprises an initial consonant and its transition towards the following vowel. Together with the onset, the rhyme consists of a steady vowel segment and a final consonant. Experimental results show that the onset-rhyme model improves on the efficiency of other speech units. The onset-rhyme model improves on the accuracy of the inter-syllable triphone model by nearly 9.3% and of the context-dependent Initial-Final model by nearly 4.7% for the speaker-dependent systems using only an acoustic model, and by 5.6% and 4.5%, respectively, for the speaker-dependent systems using both acoustic and language models. The results show that the onset-rhyme models attain a high recognition rate. Moreover, they are also more efficient in terms of system complexity.

  • Some Relations between Watson-Crick Finite Automata and Chomsky Hierarchy

    Sadaki HIROSE  Kunifumi TSUDA  Yasuhiro OGOSHI  Haruhiko KIMURA  

     
    LETTER-Automata and Formal Language Theory

    Vol: E87-D No:5  Page(s): 1261-1264

    Watson-Crick automata, recently introduced, are new types of automata in the DNA computing framework, working on tapes which are double-stranded sequences of symbols related by a complementarity relation, similar to a DNA molecule. The automata scan each of the two strands separately in a correlated manner. Some restricted variants of them were also introduced, and the relationships between the families of languages recognized by them were investigated. In this paper, we clarify some relations between the families of languages recognized by the restricted variants of Watson-Crick finite automata and the families in the Chomsky hierarchy.

  • Comparing Reading Techniques for Object-Oriented Design Inspection

    Giedre SABALIAUSKAITE  Shinji KUSUMOTO  Katsuro INOUE  

     
    PAPER-Software Engineering

    Vol: E87-D No:4  Page(s): 976-984

    For more than twenty-five years software inspections have been considered an effective method for defect detection. Inspections have been investigated through controlled experiments in university environments and industry case studies. However, in most cases software inspections have been used for defect detection in documents of the conventional structured development process. Therefore, there is a significant lack of information about how inspections should be applied to Object-Oriented artifacts, such as Object-Oriented code and design diagrams. In addition, extensive work is needed to determine whether some inspection techniques can be more beneficial than others. Most inspection experiments include inspection meetings after individual inspection is completed. However, several researchers suggested that inspection meetings may not be necessary since an insignificant number of new defects are found as a result of the inspection meeting. Moreover, inspection meetings have been found to suffer from process loss. This paper presents the findings of a controlled experiment that was conducted to investigate the performance of individual inspectors as well as 3-person teams in Object-Oriented design document inspection. Documents were written using the notation of the Unified Modelling Language. Two reading techniques, namely Checklist-based reading (CBR) and Perspective-based reading (PBR), were used during the experiment. We found that both techniques are similar with respect to defect detection effectiveness during individual inspection as well as during inspection meetings. Investigating the usefulness of inspection meetings, we found that the teams that used the CBR technique exhibited significantly smaller meeting gains (number of new defects first found during the team meeting) than meeting losses (number of defects first identified by an individual but never included in the defect list by the team); meanwhile, the meeting gains were similar to the meeting losses for the teams that used the PBR technique. Consequently, CBR 3-person team meetings turned out to be less beneficial than PBR 3-person team meetings.
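
    The definitions of meeting gain and meeting loss given in the abstract above can be expressed as simple set operations. The sketch below is only an illustration of those two definitions; the function name and the defect identifiers are hypothetical and do not come from the experiment.

```python
def meeting_gain_and_loss(individual_findings, team_list):
    """Count meeting gains and losses for one inspection team.

    individual_findings: one set of defect IDs per inspector.
    team_list: set of defect IDs recorded by the team after the meeting.
    """
    found_individually = set().union(*individual_findings)
    team = set(team_list)
    gain = team - found_individually   # first found during the meeting
    loss = found_individually - team   # found by someone, dropped by the team
    return len(gain), len(loss)


# Three inspectors and the defect list their team agreed on (toy data).
inspectors = [{"D1", "D2"}, {"D2", "D3"}, {"D4"}]
team_defects = {"D1", "D2", "D5"}
gain, loss = meeting_gain_and_loss(inspectors, team_defects)
```

    In this toy case the meeting adds one defect (D5) and loses two (D3 and D4), so the loss exceeds the gain.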

  • Integrated Development Environment for Knowledge-Based Systems and Its Practical Application

    Keiichi KATAMINE  Masanobu UMEDA  Isao NAGASAWA  Masaaki HASHIMOTO  

     
    PAPER-Knowledge Engineering and Robotics

    Vol: E87-D No:4  Page(s): 877-885

    The modeling of an application domain and its specific knowledge description language are important for developing knowledge-based systems. A rapid-prototyping approach is suitable for such developments since in this approach the modeling and language development are processed simultaneously. However, programming languages and their supporting environments which are usually used for prototyping are not necessarily adequate for developing practical applications. We have been developing an integrated development environment for knowledge-based systems, which supports all the development phases from the early prototyping phase to final commercial development phase. The environment called INSIDE is based on a Prolog abstract machine, and provides all of the functions required for the development of practical applications in addition to the standard Prolog features. This enables the development of both prototypes and practical applications in the same environment. Moreover, their efficient development and maintenance can be achieved. In addition, the effectiveness of INSIDE is described by examples of its practical application.

  • Prefix Computations on Iterative Arrays with Sequential Input/Output Mode

    Chuzo IWAMOTO  Tomoka YOKOUCHI  Kenichi MORITA  Katsunobu IMAI  

     
    PAPER

    Vol: E87-D No:3  Page(s): 708-712

    This paper investigates prefix computations on Iterative Arrays (IAs) with sequential input/output mode. We show that, for any language $L$ accepted by a linear-time IA, there is an IA which, given an infinite string $a_1 a_2 \cdots a_i \cdots$, generates the values of $\chi_L(a_1), \chi_L(a_1 a_2), \ldots, \chi_L(a_1 a_2 \cdots a_i), \ldots$ at steps $4, 16, \ldots, 4i^2, \ldots$, respectively. Here, $\chi_L : \Sigma^* \to \{0,1\}$ is the characteristic function of the language $L \subseteq \Sigma^*$, defined as $\chi_L(w) = 1$ iff $w \in L$. We also construct $2i^3$-time and $i^4$-time prefix algorithms for languages accepted by quadratic-time and cubic-time IAs, respectively.
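
    The output schedule described above can be tabulated with a short sketch. This does not simulate an iterative array; it only pairs the value of the characteristic function on the i-th prefix with the step 4i^2 at which the abstract says that value is produced. The function names and the example language (binary palindromes) are assumptions chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, Tuple


def prefix_values(
    symbols: Iterable[str], in_language: Callable[[str], bool]
) -> Iterator[Tuple[int, int]]:
    """Yield (step, value) pairs: step 4*i*i and the characteristic
    function of the language evaluated on the i-th prefix."""
    prefix = ""
    for i, a in enumerate(symbols, start=1):
        prefix += a
        yield 4 * i * i, int(in_language(prefix))


def is_palindrome(w: str) -> bool:
    # Hypothetical example language, not taken from the paper.
    return w == w[::-1]


schedule = list(prefix_values("0110", is_palindrome))
# schedule == [(4, 1), (16, 0), (36, 0), (64, 1)]
```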

  • A 51.2 GOPS Programmable Video Recognition Processor for Vision-Based Intelligent Cruise Control Applications

    Shorin KYO  Takuya KOGA  Shin'ichiro OKAZAKI  Ichiro KURODA  

     
    PAPER-Processor

    Vol: E87-D No:1  Page(s): 136-145

    This paper describes a 51.2 GOPS video recognition processor that provides a cost-effective device solution for vision-based intelligent cruise control (ICC) applications. By integrating 128 4-way VLIW (Very Long Instruction Word) processing elements and operating at 100 MHz, the processor provides enough computation power for a weather-robust lane mark and vehicle detection function, written in a high-level programming language, to run at video rate, while at the same time satisfying the power efficiency requirements of an in-vehicle LSI. Based on four basic parallel methods and a software environment including an optimizing compiler for an extended C language and video-based GUI tools, efficient development of real-time video recognition applications that effectively utilize the 128 processing elements is facilitated. Benchmark results show that this processor can provide four times better performance than a 2.4 GHz general-purpose microprocessor.

  • Verification of Synchronization in SpecC Description with the Use of Difference Decision Diagrams

    Thanyapat SAKUNKONCHAK  Satoshi KOMATSU  Masahiro FUJITA  

     
    PAPER-Logic and High Level Synthesis

    Vol: E86-A No:12  Page(s): 3192-3199

    The SpecC language is designed to handle the design of an entire system from specification to implementation, as well as hardware/software co-design. Concurrency is one of the features of SpecC, which expresses the parallel execution of processes. Systems that contain concurrent behaviors usually exchange or transfer data among those behaviors. Therefore, the synchronization semantics (notify/wait) of events should be incorporated. An actual design, which is usually sophisticated in its characteristics and functionality, may contain a large amount of event synchronization code. This makes the design difficult and time-consuming to verify. In this paper, we introduce a technique which helps verify the synchronization of events in SpecC. The original SpecC code containing synchronization semantics is parsed and translated into Boolean SpecC code. Difference decision diagrams (DDDs) are used to verify event synchronization on the Boolean SpecC code. Counterexamples for tracing back to the original source are given when the verification results turn out to be unsatisfied. We also introduce an idea for automatic refinement when the results are unsatisfied and present some preliminary results.

  • Determining Indexing Strings with Statistical Analysis

    Yoshiyuki TAKEDA  Kyoji UMEMURA  Eiko YAMAMOTO  

     
    PAPER

    Vol: E86-D No:9  Page(s): 1781-1787

    Determining indexing strings is an important factor in information retrieval. Ideally, the strings should be words that represent documents or queries. Although any single word may be the first candidate for indexing strings for an English corpus, it may not be ideal due to the existence of compound nouns, which are often good indexing strings, and which often depend on the genre of the corpus used. The situation is even worse in Japanese or Chinese where the words are not separated by spaces. In this paper, we propose a method of determining indexing strings based on statistical analysis. The novel features of our method are to make the most of the statistical measure called "adaptation" and not to use language-dependent resources such as dictionaries and stop word lists. In evaluating our method using a Japanese test collection, we found that it actually improves the precision of information retrieval systems.

  • Relational Interface for Natural Language-Based Information Sources

    Zenshiro KAWASAKI  Keiji SHIBATA  Masato TAJIMA  

     
    LETTER-Databases

    Vol: E86-D No:6  Page(s): 1139-1143

    This paper presents an extension of the database query language SQL to include queries against a database with natural language annotations. The proposed scheme is based on Concept Coupling Model, a language model for handling natural language sentence structures. Integration of the language model with the conventional relational data model provides a unified environment for manipulating information sources comprised of relational tables and natural language texts.

  • On Automatic Speech Recognition at the Dawn of the 21st Century

    Chin-Hui LEE  

     
    INVITED SURVEY PAPER

    Vol: E86-D No:3  Page(s): 377-396

    In the last three decades of the 20th Century, research in speech recognition has been intensively carried out worldwide, spurred on by advances in signal processing, algorithms, architectures, and hardware. Recognition systems have been developed for a wide variety of applications, ranging from small vocabulary keyword recognition over dial-up telephone lines, to medium size vocabulary voice interactive command and control systems for business automation, to large vocabulary speech dictation, spontaneous speech understanding, and limited-domain speech translation. Although we have witnessed many new technological promises, we have also encountered a number of practical limitations that hinder a widespread deployment of applications and services. On one hand, fast progress was observed in statistical speech and language modeling. On the other hand only spotty successes have been reported in applying knowledge sources in acoustics, speech and language science to improving speech recognition performance and robustness to adverse conditions. In this paper we review some key advances in several areas of speech recognition. A bottom-up detection framework is also proposed to facilitate worldwide research collaboration for incorporating technology advances in both statistical modeling and knowledge integration into going beyond the current speech recognition limitations and benefiting the society in the 21st century.
