The search functionality is under construction.

Keyword Search Result

[Keyword] translation(93hit)

61-80hit(93hit)

  • A Method for Reinforcing Noun Countability Prediction

    Ryo NAGATA  Atsuo KAWAI  Koichiro MORIHIRO  Naoki ISU  

     
    PAPER-Natural Language Processing

    Vol: E90-D No:12  Page(s): 2077-2086

    This paper proposes a method for reinforcing noun countability prediction, which plays a crucial role in selecting correct determiners in machine translation and error detection. The proposed method reinforces countability prediction by introducing a novel heuristic called one countability per discourse. The heuristic claims that when a noun appears more than once in a discourse, all instances share the same countability. The basic idea of the proposed method is that mispredictions can be corrected by exploiting this heuristic. Experiments show that the proposed method successfully reinforces countability prediction and outperforms the other methods used for comparison. In addition to its performance, it has two advantages over earlier methods: (i) it is applicable to any countability prediction method, and (ii) it requires no human intervention to reinforce countability prediction.
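    The one-countability-per-discourse idea can be illustrated with a minimal sketch. The abstract does not say how conflicting predictions are reconciled, so the majority vote below is an assumption, and the function name and the "count"/"mass" labels are hypothetical.

```python
from collections import Counter, defaultdict

def reinforce_countability(predictions):
    """Relabel per-instance predictions within ONE discourse.

    predictions: list of (noun, label) pairs in order of appearance.
    Assumption (not from the paper): conflicts are resolved by majority
    vote; ties keep the label that was seen first.
    """
    votes = defaultdict(Counter)
    for noun, label in predictions:
        votes[noun][label] += 1
    # most_common(1) returns the top label; equal counts keep first-seen order
    majority = {noun: counts.most_common(1)[0][0] for noun, counts in votes.items()}
    return [(noun, majority[noun]) for noun, _ in predictions]

preds = [("chicken", "count"), ("chicken", "mass"),
         ("chicken", "count"), ("water", "mass")]
print(reinforce_countability(preds))
```

    Because the step only post-processes labels, it can sit behind any base predictor, which matches advantage (i) stated in the abstract.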

  • A MATLAB-Based Code Generator for Parallel Sparse Matrix Computations Utilizing PSBLAS

    Taiji SASAOKA  Hideyuki KAWABATA  Toshiaki KITAMURA  

     
    PAPER-Parallel Programming

    Vol: E90-D No:1  Page(s): 2-12

    Parallel programs for distributed memory machines are not easy to create and maintain, especially when they involve sparse matrix computations. In this paper, we propose a program translation system for generating parallel sparse matrix computation codes utilizing PSBLAS. The purpose of the system is to offer the user a convenient way to construct parallel sparse code based on PSBLAS. The system is built on the idea of bridging the gap between easy-to-read program representations and highly tuned parallel executables based on existing parallel sparse matrix computation libraries. The system accepts a MATLAB program with annotations and generates subroutines for an SPMD-style parallel program which runs on distributed-memory machines. Experimental results on parallel machines show that the prototype of our system can generate fairly efficient PSBLAS codes for simple applications such as CG and Bi-CGSTAB programs.

  • An Alignment Model for Extracting English-Korean Translations of Term Constituents

    Jong-Hoon OH  Key-Sun CHOI  Hitoshi ISAHARA  

     
    PAPER-Natural Language Processing

    Vol: E89-D No:12  Page(s): 2972-2980

    Technical terms are linguistic representations of a domain concept, and their constituents are components used to represent the concept. Technical terms are usually multi-word terms and their meanings can be inferred from their constituents. Therefore, term constituents are essential for understanding the designated meaning of technical terms. However, there are several problems in finding the correct meanings of technical terms with their term constituents. First, because a term constituent is usually a morphological unit rather than a conceptual unit in the case of Korean technical terms, we need to first identify conceptual units by chunking term constituents. Second, conceptual units are sometimes homonyms or synonyms. Moreover their meanings show domain dependency. It is therefore necessary to give information about conceptual units and their possible meanings, including homonyms, synonyms, and domain dependency, so that natural language applications can properly handle technical terms. In this paper, we propose a term constituent alignment algorithm that extracts such information from bilingual technical term pairs. Our algorithm recognizes conceptual units and their meanings by finding English term constituents and their corresponding Korean term constituents for given English-Korean term pairs. Our experimental results indicate that this method can effectively find conceptual units and their meanings with about 6% alignment error rate (AER) on manually analyzed experimental data and about 14% AER on automatically analyzed experimental data.

  • Multilingual Closed Caption Translation System for Digital Television

    Sanghwa YUH  Kongjoo LEE  Jungyun SEO  

     
    PAPER-Service and System

    Vol: E89-D No:6  Page(s): 1885-1892

    In this paper, we present a Korean to Chinese/English/Japanese multilingual Machine Translation (MT) system of closed captions for Digital Television (DTV). Preliminary experiments on closed caption translation with existing base MT systems showed unsatisfactory results. In order to achieve more accurate translation with the base MT systems, we adopted live resources of multilingual Named Entities and their translingual equivalences from the Web. We also utilize the program information, which the terrestrial broadcasters offer through the DTV transport stream, in order to use program-specific dictionaries, including the names of characters, locations and organizations. Two more components are adopted for reducing the ambiguities of parsing and word sense disambiguation: sentence simplification for long sentence segmentation and dynamic domain identification for automatic domain dictionary stacking. With these integrated approaches, we could raise the Mean Opinion Score (MOS) of translation accuracy by 0.40 over the base MT systems.

  • A Metric for Example Matching in Example-Based Machine Translation

    Dong-Joo KIM  Han-Woo KIM  

     
    LETTER

    Vol: E89-A No:6  Page(s): 1713-1716

    This paper proposes a metric for example matching in example-based machine translation. Our metric, which serves as a similarity measure, is employed to retrieve the examples most similar to a given query. Basically, it makes use of simple information such as the lemma and part-of-speech information of typographically mismatched words. In addition, it uses the contiguity information of matched word units to capture the full context. Finally, we show results on the correctness of the proposed metric.

  • A Method for English-Korean Target Word Selection Using Multiple Knowledge Sources

    Ki-Young LEE  Sang-Kyu PARK  Han-Woo KIM  

     
    PAPER

    Vol: E89-A No:6  Page(s): 1622-1629

    Target word selection is one of the most important and difficult tasks in English-Korean Machine Translation. It affects the overall translation accuracy of machine translation systems. In this paper, we present a new approach to Korean target word selection for an English noun with translation ambiguities using multiple knowledge sources such as verb frame patterns, sense vectors based on collocations, statistical Korean local context information and co-occurring POS information. Verb frame patterns constructed from a dictionary and a corpus play an important role in resolving the sparseness problem of collocation data. Sense vectors are sets of collocation data for the case in which an English word with target selection ambiguities is to be translated into a specific Korean target word. Statistical Korean local context information is N-gram information generated from a Korean corpus. The co-occurring POS information is a statistically significant POS clue which appears with an ambiguous word. To evaluate our approach, we applied the method to the Tellus-EK system, an English-Korean automatic translation system currently developed at ETRI [1],[2]. The experiment showed promising results for diverse sentences from web documents.

  • The Performance Analysis of NAT-PT and DSTM for IPv6 Dominant Network Deployment

    Myung-Ki SHIN  

     
    LETTER-Internet

    Vol: E88-B No:12  Page(s): 4664-4666

    NAT-PT and DSTM are becoming more widespread as de-facto standards for IPv6 dominant network deployment. But few researchers have empirically evaluated their performance aspects. In this paper, we compared the performance of NAT-PT and DSTM with IPv4-only and IPv6-only networks on user applications using metrics such as throughput, CPU utilization, round-trip time, and connect/request/response transaction rate.

  • Machine Learning Based English-to-Korean Transliteration Using Grapheme and Phoneme Information

    Jong-Hoon OH  Key-Sun CHOI  

     
    PAPER-Natural Language Processing

    Vol: E88-D No:7  Page(s): 1737-1748

    Machine transliteration is an automatic method to generate characters or words in one alphabetical system for the corresponding characters in another alphabetical system. Machine transliteration can play an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Previous work focuses on either a grapheme-based or a phoneme-based method. However, transliteration is both an orthographic and a phonetic conversion process. Therefore, both grapheme and phoneme information should be considered in machine transliteration. In this paper, we propose a grapheme and phoneme-based transliteration model and compare it with previous grapheme-based and phoneme-based models using several machine learning techniques. Our method shows a performance improvement of about 13% to 78%.

  • Splitting Input for Machine Translation Using N-gram Language Model Together with Utterance Similarity

    Takao DOI  Eiichiro SUMITA  

     
    PAPER-Natural Language Processing

    Vol: E88-D No:6  Page(s): 1256-1264

    In order to boost the translation quality of corpus-based MT systems for speech translation, the technique of splitting an input utterance appears promising. In previous research, many methods used word-sequence characteristics like N-gram clues among splitting positions. In this paper, to supplement splitting methods based on word-sequence characteristics, we introduce another clue using similarity based on edit-distance. In our splitting method, we generate candidates for utterance splitting based on N-grams, and select the best one by measuring the utterance similarity against a corpus. This selection is founded on the assumption that a corpus-based MT system can correctly translate an utterance that is similar to an utterance in its training corpus. We conducted experiments using three MT systems: two EBMT systems, one of which uses a phrase as a translation unit and the other of which uses an utterance, and an SMT system. The translation results under various conditions were evaluated by objective measures and a subjective measure. The experimental results demonstrate that the proposed method is valuable for the three systems. Using utterance similarity can improve the translation quality.
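    The selection step described above can be sketched roughly as follows. This is an illustration under assumptions rather than the authors' procedure: candidate splits are taken as given (the N-gram candidate generator is not shown), similarity is a length-normalized word-level edit distance, and a candidate is scored by the average best similarity of its segments against the corpus. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]

def similarity(a, b):
    """1.0 for identical sequences, 0.0 when nothing aligns (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

def best_split(candidates, corpus):
    """Pick the candidate split whose segments best resemble corpus utterances."""
    def score(candidate):
        return sum(max(similarity(seg, ref) for ref in corpus)
                   for seg in candidate) / len(candidate)
    return max(candidates, key=score)

corpus = [["thank", "you"], ["where", "is", "the", "station"]]
candidates = [
    [["thank", "you", "where", "is", "the", "station"]],
    [["thank", "you"], ["where", "is", "the", "station"]],
    [["thank"], ["you", "where", "is", "the", "station"]],
]
print(best_split(candidates, corpus))
```

    In this toy case the second candidate is chosen because both of its segments match corpus utterances exactly, which reflects the stated assumption that a corpus-based MT system translates well what resembles its training data.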

  • An Objective Method for Evaluating Speech Translation System: Using a Second Language Learner's Corpus

    Keiji YASUDA  Fumiaki SUGAYA  Toshiyuki TAKEZAWA  Genichiro KIKUI  Seiichi YAMAMOTO  Masuzo YANAGIDA  

     
    PAPER-Speech Corpora and Related Topics

    Vol: E88-D No:3  Page(s): 569-577

    In this paper we propose an objective method for assessing the capability of a speech translation system. It automates the translation paired comparison method proposed by Sugaya et al., which yields a simple, easy-to-understand TOEIC score, to succinctly evaluate a speech translation system. To avoid the high evaluation cost of the original method, which requires large manual effort, the new objective method automates the procedure by employing objective metrics such as BLEU and a DP-based measure. The evaluation results obtained by the proposed method are similar to those of the original method. The proposed method is also used to evaluate the usefulness of a speech translation system. We find that our speech translation system is useful in general, even to users whose TOEIC scores are higher than the system's.

  • Efficient Coding Translation of GSM and G.729 Speech Coders across Mobile and IP Networks

    Shu-Min TSAI  Jia-Ching WANG  Jar-Ferr YANG  Jhing-Fa WANG  

     
    PAPER-Speech and Hearing

    Vol: E87-D No:2  Page(s): 444-452

    In this paper, we propose a speech coding translation scheme that transfers coding parameters between GSM half rate and G.729 coders. Compared to the conventional decode-then-encode (DTE) scheme, the proposed parameter conversions provide speech interoperability between mobile and IP networks with reduced computational complexity and coding delay. Simulation results show that the proposed methods can reduce the computational load and coding delay incurred in the target encoders by about 30% and achieve almost imperceptible degradation in performance.

  • Software TLB Management for Embedded Systems

    Yukikazu NAKAMOTO  

     
    LETTER

    Vol: E86-D No:10  Page(s): 2034-2039

    Virtual memory functions of real-time operating systems have come to be used in embedded systems. Recent RISC processors provide virtual memory support through a software-managed Translation Lookaside Buffer (TLB). For the real-time aspects of embedded systems, managing TLB entries is of primary importance because the overhead at TLB miss time greatly affects the overall performance of the system. In this paper, we propose several TLB management algorithms for MIPS processors. In these algorithms, a replaced TLB entry is either randomly chosen or managed. We analyze the algorithms by comparing overheads at task switching times and TLB miss times.
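    As a rough illustration of the "randomly chosen" replacement case, the toy simulation below counts misses in a software-managed TLB. It is a sketch under assumptions, not one of the algorithms from the paper: the class name, the dictionary page table and the seeded generator are all hypothetical, and real refill handlers run as low-level exception code rather than Python.

```python
import random

class SoftwareTLB:
    """Toy model of a software-managed TLB with random replacement."""

    def __init__(self, size, rng=None):
        self.size = size
        self.entries = {}   # virtual page number -> physical frame number
        self.misses = 0
        # Seeded generator (assumption) so that runs are repeatable
        self.rng = rng if rng is not None else random.Random(0)

    def translate(self, vpn, page_table):
        if vpn in self.entries:          # TLB hit: no software involvement
            return self.entries[vpn]
        self.misses += 1                 # TLB miss: refill handler runs
        pfn = page_table[vpn]
        if len(self.entries) >= self.size:
            victim = self.rng.choice(list(self.entries))  # random victim
            del self.entries[victim]
        self.entries[vpn] = pfn
        return pfn

page_table = {vpn: vpn + 100 for vpn in range(8)}
tlb = SoftwareTLB(size=2)
for vpn in [0, 1, 0, 2, 3]:
    tlb.translate(vpn, page_table)
print(tlb.misses)
```

    Counting misses per reference trace in this way is one simple means of comparing replacement policies; modelling task switches would additionally require flushing or tagging entries per task.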

  • An Automatic Interface Insertion Scheme for In-System Verification of Algorithm Models in C

    Chang-Jae PARK  Ando KI  In-Cheol PARK  Chong-Min KYUNG  

     
    PAPER-High Level Synthesis

    Vol: E85-A No:12  Page(s): 2645-2654

    This paper describes an automatic interface insertion scheme for in-system verification of algorithm models. To insert the interface, an algorithm model described in C is translated into another source code that includes the communication with hardware components in the target system to be validated with the algorithm model. The communication between the algorithm model and hardware components is achieved using transactors that perform transformation between access operations and bus cycle transactions. An I/O terminal is introduced as an interface model to relate the transactions to access operations during the execution of the algorithm model, i.e., accesses to I/O terminals invoke bus cycle transactions in hardware and vice versa. An automatic interface insertion tool is developed using source-to-source translation to identify the I/O terminals and insert interface function calls in the source code. The proposed automatic interface insertion scheme is validated by emulating several multimedia algorithms written in C on real target systems.

  • Disambiguating Word Senses in Korean-Japanese Machine Translation by Using Semi-Automatically Constructed Ontology

    Sin-Jae KANG  You-Jin CHUNG  Jong-Hyeok LEE  

     
    PAPER-Natural Language Processing

    Vol: E85-D No:10  Page(s): 1688-1697

    This paper presents a method for disambiguating word senses in Korean-Japanese machine translation by using a language independent ontology. This ontology stores semantic constraints between concepts and other world knowledge, and enables a natural language processing system to resolve semantic ambiguities by making inferences with the concept network of the ontology. In order to acquire a language-independent and reasonably practical ontology in a limited time and with less manpower, we extend the existing Kadokawa thesaurus by inserting additional semantic relations into its hierarchy, which are classified as case relations and other semantic relations. The former can be obtained by converting valency information and case frames from previously-built electronic dictionaries used in machine translation. The latter can be acquired from concept co-occurrence information, which is extracted automatically from a corpus. In practical machine translation systems, our word sense disambiguation method achieved an improvement of average precision by 6.0% for Japanese analysis and by 9.2% for Korean analysis over the method without using an ontology.

  • A Comparison of Bottom-Up Pushdown Tree Transducers and Top-Down Pushdown Tree Transducers

    Katsunori YAMASAKI  Yoshichika SODESHIMA  

     
    PAPER-Theory of Automata, Formal Language Theory

    Vol: E85-D No:5  Page(s): 799-811

    In this paper we introduce a bottom-up pushdown tree transducer (b-PDTT) which is a bottom-up tree transducer with pushdown storage (where the pushdown storage stores the trees) and may be considered as a dual concept of the top-down pushdown tree transducer (t-PDTT). After proving some fundamental properties of b-PDTT, for example, any b-PDTT can be realized by a linear stack with single state and converted into G-type normal form which corresponds to Greibach normal form in a context-free grammar, and so on, we compare the translational capability of a b-PDTT with that of a t-PDTT.

  • A Speech Translation System Applied to a Real-World Task/Domain and Its Evaluation Using Real-World Speech Data

    Atsushi NAKAMURA  Masaki NAITO  Hajime TSUKADA  Rainer GRUHN  Eiichiro SUMITA  Hideki KASHIOKA  Hideharu NAKAJIMA  Tohru SHIMIZU  Yoshinori SAGISAKA  

     
    PAPER-Speech and Hearing

    Vol: E84-D No:1  Page(s): 142-154

    This paper describes an application of a speech translation system to another task/domain in the real world by using developmental data collected from real-world interactions. The total cost for this task alteration was calculated to be 9 person-months. The newly applied system was also evaluated by using speech data collected from real-world interactions. For real-world speech having a machine-friendly speaking style, the newly applied system could recognize typical sentences with a word accuracy of 90% or better. We also found that, concerning the overall speech translation performance, the system could translate about 80% of the input Japanese speech into acceptable English sentences.

  • Some Notes on Domain Tree Languages of Top-Down Pushdown Tree Transducers

    Katsunori YAMASAKI  

     
    PAPER-Theory of Automata, Formal Language Theory

    Vol: E83-D No:9  Page(s): 1713-1720

    In this paper, some properties of domain tree languages of top-down pushdown tree transducers (domain(t-PDTT) or t-PDTT_D) are shown. It is shown that (1) for any L1, L2 in the class of context-free languages (CFL), L1 ∩ L2 ∈ yield_e(t-PDTT_D) (where yield_e is an extended yield), (2) yield_e(t-PDTT_ε0DF) is closed under homomorphisms, where t-PDTT_ε0 is a t-PDTT which cannot proceed with generation after reading a constant symbol σ and t-PDTT_ε0DF denotes a domain tree language of t-PDTT_ε0 with a final state translation, and (3) yield_e(t-PDTT_ε0DF) is the class of recursively enumerable languages, and consequently yield_e(t-PDTT_D) is the class of recursively enumerable languages.

  • Preliminary Study on a Sign-Language Chatting System between Korea and Japan for Avatar Communication on the Internet

    Sang-Woon KIM  Ji-Young OH  Shin TANAHASHI  Yoshinao AOKI  

     
    LETTER-Human Communications

    Vol: E83-A No:2  Page(s): 386-389

    In order to investigate the possibility of avatar communication using sign language, in this paper we develop a sign-language chatting system between Korea and Japan on the Internet using CG animation techniques. We construct the system in a server-client architecture, where images of Korean or Japanese sign language are analyzed by the server into a series of parameters for sign-language animation. We transmit the parameters, which are text data rather than images or their compressed forms, to clients and regenerate the corresponding CG animation using the received data. The chatting system is implemented with Visual C++ 5.0 on Windows platforms. Experimental results show that sign language could be used as a means of communication between avatars of different languages.

  • Incremental Transfer in English-Japanese Machine Translation

    Shigeki MATSUBARA  Yasuyoshi INAGAKI  

     
    PAPER-Artificial Intelligence and Cognitive Science

    Vol: E80-D No:11  Page(s): 1122-1130

    Since spontaneously spoken language expressions appear continuously, the transfer stage of a spoken language machine translation system has to work incrementally. In such a system, a high degree of incrementality is required even more strongly than high quality. This paper proposes an incremental machine translation system, which translates English spoken words into Japanese in the order in which they appear. The system is composed of three modules: incremental parsing, transfer and generation, which work synchronously. The transfer module utilizes some features and phenomena characterizing Japanese spoken language: flexible word order, ellipses, repetitions and so forth. This is motivated by the observation that such characteristics frequently appear in Japanese uttered by English-Japanese interpreters. Their frequent utilization is the key to the success of highly incremental translation between English and Japanese, which have different word orders. We have implemented a prototype system, Sync/Trans, which parses English dialogues incrementally and generates Japanese immediately. To evaluate Sync/Trans, we have conducted an experiment with conversations consisting of 27 dialogues and 218 sentences. Of these sentences, 190 are correct, giving a success rate of 87.2%. This result shows our incremental method to be a promising technique for spoken language translation, with acceptable accuracy and a high real-time nature.

  • Non-deterministic Constraint Generation for Analog and Mixed-Signal Layout

    Edoardo CHARBON  Enrico MALAVASI  Paolo MILIOZZI  Alberto SANGIOVANNI-VINCENTELLI  

     
    PAPER-Physical Design

    Vol: E80-D No:10  Page(s): 1032-1043

    In this paper we propose a comprehensive approach to physical design based on the constraint paradigm. Bounds on the most critical circuit parasitics are automatically generated to help designers and/or physical design tools meet a set of high-level specifications. The constraint generation engine is based on constrained optimization, where various parasitic effects on interconnect and devices are accounted for and dealt with in different manners according to their statistical behavior and their effect on performance.
