
Keyword Search Result

[Keyword] Japanese(26hit)

1-20hit(26hit)

  • Sense-Aware Decoder for Character Based Japanese-Chinese NMT Open Access

    Zezhong LI  Fuji REN  

     
    LETTER-Natural Language Processing

      Publicized:
    2023/12/11
      Vol:
    E107-D No:4
      Page(s):
    584-587

    Compared to subword-based Neural Machine Translation (NMT), character-based NMT eschews linguistically motivated segmentation and operates directly on the raw character sequence, following a more thoroughly end-to-end approach. This property is especially attractive for machine translation (MT) between Japanese and Chinese, both of which use consecutive logographic characters without explicit word boundaries. However, one disadvantage remains to be addressed: a character is a less meaning-bearing unit than a subword, which requires character models to be capable of sense discrimination. Specifically, two types of sense ambiguity exist, one in the source language and one in the target language. The former has been partially solved by deep encoders and several existing works. The latter, ambiguity on the target side, is interestingly rarely discussed. To address this problem, we propose two simple yet effective methods: a non-parametric pre-clustering for sense induction and a joint model that performs sense discrimination and NMT training simultaneously. Extensive experiments on Japanese⟷Chinese MT show that our proposed methods consistently outperform strong baselines, and verify the effectiveness of using sense-discriminated representations for character-based NMT.

  • A Simple and Interactive System for Modeling Typical Japanese Castles

    Shogo UMEYAMA  Yoshinori DOBASHI  

     
    LETTER-Computer Graphics

      Publicized:
    2022/11/08
      Vol:
    E106-D No:2
      Page(s):
    267-270

    We present an interactive modeling system for Japanese castles. We develop a user interface that can generate the fundamental structure of the castle tower, consisting of stone walls, turrets, and roofs. When the user clicks with the mouse on the screen displaying the 3D space, the relevant parameters are calculated automatically to generate 3D models of Japanese-style castles. We use characteristic curves that often appear in ancient Japanese architecture for realistic modeling of the castles. We evaluate the effectiveness of our method by comparing a castle generated by our method with a commercially available 3D model of a castle.

  • Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition

    Fuma HORIE  Hideaki GOTO  Takuo SUGANUMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/08/24
      Vol:
    E104-D No:11
      Page(s):
    2002-2010

    Scene character recognition has been intensively investigated for a couple of decades because it has great potential in many applications, including automatic translation, signboard recognition, and reading assistance for the visually impaired. However, scene characters are difficult to recognize with sufficient accuracy owing to various kinds of noise and image distortion. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers have proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter, which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for evaluating recognition systems, we have developed an open dataset, JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy is improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.

  • Prosody Correction Preserving Speaker Individuality for Chinese-Accented Japanese HMM-Based Text-to-Speech Synthesis Open Access

    Daiki SEKIZAWA  Shinnosuke TAKAMICHI  Hiroshi SARUWATARI  

     
    LETTER-Speech and Hearing

      Publicized:
    2019/03/11
      Vol:
    E102-D No:6
      Page(s):
    1218-1221

    This article proposes a prosody correction method based on partial model adaptation for Chinese-accented Japanese hidden Markov model (HMM)-based text-to-speech synthesis. Although text-to-speech synthesis built from non-native speech accurately reproduces the speaker's individuality in synthetic speech, the naturalness of the synthetic speech is strongly degraded. In the proposed model, to improve the naturalness while preserving the speaker individuality of Chinese-accented Japanese text-to-speech synthesis, we partially utilize HMM parameters of native Japanese speech to synthesize prosody-corrected synthetic speech. Results of an experimental evaluation demonstrate that duration and F0 correction are significantly effective for improving naturalness.

  • An Approach for Chinese-Japanese Named Entity Equivalents Extraction Using Inductive Learning and Hanzi-Kanji Mapping Table

    JinAn XU  Yufeng CHEN  Kuang RU  Yujie ZHANG  Kenji ARAKI  

     
    PAPER-Natural Language Processing

      Publicized:
    2017/05/02
      Vol:
    E100-D No:8
      Page(s):
    1882-1892

    Named Entity Translation Equivalents extraction plays a critical role in machine translation (MT) and cross-language information retrieval (CLIR). Traditional methods are often based on large-scale parallel or comparable corpora. However, the applicability of these studies is constrained, mainly because of the scarcity of parallel corpora of the required scale, especially for the Chinese-Japanese language pair. In this paper, we propose a method that considers the characteristics of Chinese and Japanese to automatically extract Chinese-Japanese Named Entity (NE) translation equivalents from monolingual corpora based on inductive learning (IL). The method adopts the Chinese Hanzi and Japanese Kanji Mapping Table (HKMT) to calculate the similarity of NE instances between Japanese and Chinese. Then, we use IL to obtain partial translation rules for NEs by extracting the differing parts from high-similarity NE instances in Chinese and Japanese. Finally, feedback processing updates the Chinese and Japanese NE similarity and the rule sets. Experimental results show that our simple, efficient method overcomes the insufficiency of traditional methods, which depend heavily on bilingual resources. Compared with other methods, our method combines the language features of Chinese and Japanese with IL to extract NE pairs automatically. Our use of weakly correlated bilingual text sets and minimal additional knowledge to extract NE pairs effectively reduces the cost of building the corpus and the need for additional knowledge. Our method may help to build a large-scale Chinese-Japanese NE translation dictionary using monolingual corpora.

  • Accent Sandhi Estimation of Tokyo Dialect of Japanese Using Conditional Random Fields Open Access

    Masayuki SUZUKI  Ryo KUROIWA  Keisuke INNAMI  Shumpei KOBAYASHI  Shinya SHIMIZU  Nobuaki MINEMATSU  Keikichi HIROSE  

     
    INVITED PAPER

      Publicized:
    2016/12/08
      Vol:
    E100-D No:4
      Page(s):
    655-661

    When synthesizing speech from Japanese text, correct assignment of accent nuclei for input text with arbitrary contents is indispensable for obtaining natural-sounding synthetic speech. A phenomenon called accent sandhi occurs in utterances of Japanese; when a word is uttered in a sentence, its accent nucleus may change depending on the context of the preceding and succeeding words. This paper describes a statistical method for automatically predicting the accent nucleus changes due to accent sandhi. First, as the basis of the research, a database of Japanese text was constructed with labels of accent phrase boundaries and accent nucleus positions when uttered in sentences. A single native speaker of Tokyo dialect Japanese annotated all the labels for 6,344 Japanese sentences. Then, using this database, a conditional-random-field-based method was developed to predict accent phrase boundaries and accent nuclei. The proposed method predicted accent nucleus positions for accent phrases with 94.66% accuracy, clearly surpassing the 87.48% accuracy obtained using our rule-based method. A listening experiment was also conducted on synthetic speech obtained using the proposed method and that obtained using the rule-based method. The results show that our method significantly improved the naturalness of synthetic speech.

  • Development and Evaluation of Online Infrastructure to Aid Teaching and Learning of Japanese Prosody Open Access

    Nobuaki MINEMATSU  Ibuki NAKAMURA  Masayuki SUZUKI  Hiroko HIRANO  Chieko NAKAGAWA  Noriko NAKAMURA  Yukinori TAGAWA  Keikichi HIROSE  Hiroya HASHIMOTO  

     
    INVITED PAPER

      Publicized:
    2016/12/22
      Vol:
    E100-D No:4
      Page(s):
    662-669

    This paper develops an online and freely available framework to aid teaching and learning the prosodic control of Tokyo Japanese: how to generate its adequate word accent and phrase intonation. This framework is called OJAD (Online Japanese Accent Dictionary) [1] and it provides three features. 1) Visual, auditory, systematic, and comprehensive illustration of patterns of accent change (accent sandhi) of verbs and adjectives. Here only the changes caused by twelve fundamental conjugations are focused upon. 2) Visual illustration of the accent pattern of a given verbal expression, which is a combination of a verb and its postpositional auxiliary words. 3) Visual illustration of the pitch pattern of any given sentence and the expected positions of accent nuclei in the sentence. The third feature is technically implemented by using an accent change prediction module that we developed for Japanese Text-To-Speech (TTS) synthesis [2],[3]. Experiments show that accent nucleus assignment to given texts by the proposed framework is much more accurate than that by native speakers. Subjective assessment and objective assessment done by teachers and learners show extremely high pedagogical effectiveness of the developed framework.

  • Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics

    Yuji OSHIMA  Shinnosuke TAKAMICHI  Tomoki TODA  Graham NEUBIG  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2016/08/30
      Vol:
    E99-D No:12
      Page(s):
    3132-3139

    This paper presents a novel non-native speech synthesis technique that preserves the individuality of a non-native speaker. Cross-lingual speech synthesis based on voice conversion or Hidden Markov Model (HMM)-based speech synthesis is a technique to synthesize foreign language speech using a target speaker's natural speech uttered in his/her mother tongue. Although the technique holds promise to improve a wide variety of applications, it tends to cause degradation of target speaker's individuality in synthetic speech compared to intra-lingual speech synthesis. This paper proposes a new approach to speech synthesis that preserves speaker individuality by using non-native speech spoken by the target speaker. Although the use of non-native speech makes it possible to preserve the speaker individuality in the synthesized target speech, naturalness is significantly degraded as the synthesized speech waveform is directly affected by unnatural prosody and pronunciation often caused by differences in the linguistic systems of the source and target languages. To improve naturalness while preserving speaker individuality, we propose (1) a prosody correction method based on model adaptation, and (2) a phonetic correction method based on spectrum replacement for unvoiced consonants. The experimental results using English speech uttered by native Japanese speakers demonstrate that (1) the proposed methods are capable of significantly improving naturalness while preserving the speaker individuality in synthetic speech, and (2) the proposed methods also improve intelligibility as confirmed by a dictation test.

  • Katakana EdgeWrite: An EdgeWrite Version for Japanese Text Entry

    Kentaro GO  Yuichiro KINOSHITA  

     
    LETTER-Interaction

      Vol:
    E97-D No:8
      Page(s):
    2053-2054

    This paper presents our project of designing EdgeWrite text entry methods for the Japanese language. We are developing a version of the EdgeWrite text entry method for Japanese: Katakana EdgeWrite. Katakana EdgeWrite specifies the line stroke directions and writing order of each Japanese Katakana character. The ideal corner sequence pattern of EdgeWrite for each Katakana character is designed based on its line stroke directions and writing order.

  • A Study on Pitch Patterns in Japanese Speakers of English with Verification by Speech Re-Synthesis

    Tomoko NARIAI  Kazuyo TANAKA  

     
    PAPER-Speech and Hearing

      Vol:
    E94-D No:12
      Page(s):
    2495-2502

    Certain irregularities in the utterances of words or phrases often occur in English spoken by native Japanese speakers, referred to in this article as Japanese English. Japanese English is linguistically presumed to reflect the phonetic characteristics of Japanese. We consider prosodic feature patterns to be one of the most common causes of irregularities in Japanese English, and expect that Japanese English would have better prosodic patterns if its particular characteristics were modified. This study investigates prosodic differences between Japanese English and English speakers' English, and shows the quantitative results of a statistical analysis of pitch. The analysis leads to rules that show how to modify Japanese English to have pitch patterns closer to those of English speakers. On the basis of these rules, the pitch patterns of test speech samples of Japanese English are modified and then re-synthesized. The modified speech is evaluated in a listening experiment by native English subjects. The result of the experiment shows that, on average, more than three times as many English subjects preferred the proposed modification to the original speech. The results of the experiments therefore provide practical verification of the validity of the rules. Additionally, the results suggest that irregularities of prominence lie in Japanese English sentences. This can be explained by the transfer of first-language prosodic characteristics to second-language prosodic patterns.

  • Evaluation of SAR and Temperature Elevation Using Japanese Anatomical Human Models for Body-Worn Devices

    Teruo ONISHI  Takahiro IYAMA  Lira HAMADA  Soichi WATANABE  Akimasa HIRATA  

     
    LETTER-Electromagnetic Compatibility(EMC)

      Vol:
    E93-B No:12
      Page(s):
    3643-3646

    This paper investigates the relationship between SAR (Specific Absorption Rate) averaged over 10 g of mass and temperature elevation in Japanese numerical anatomical models when devices are mounted on the body. Simplifying the radiation source as a half-wavelength dipole, the generated electric field and SAR are calculated using the FDTD (Finite-Difference Time-Domain) method. Then the bio-heat equation is solved to obtain the temperature elevation, using the SAR derived by the FDTD method as the heat source. The frequencies used in the study are 900 MHz and 1950 MHz, which are used for mobile phones. In addition, 3500 MHz is considered because this frequency is reserved for IMT-Advanced (International Mobile Telecommunication-Advanced System). The computational results obtained herein show that the 10 g-average SAR and the temperature elevation are not proportional to frequency. In addition, it is clear that those at 3500 MHz are lower than those at 1950 MHz even though the frequency is higher. The point to be stressed here is that a good correlation between the 10 g-average SAR and the temperature elevation is observed even for the body-worn device.

  • Estimation of Speech Intelligibility Using Speech Recognition Systems

    Yusuke TAKANO  Kazuhiro KONDO  

     
    PAPER-Speech and Hearing

      Vol:
    E93-D No:12
      Page(s):
    3368-3376

    We attempted to estimate subjective scores of the Japanese Diagnostic Rhyme Test (DRT), a two-to-one forced selection speech intelligibility test. We used automatic speech recognizers with language models that force one of the words in the word-pair, mimicking the human recognition process of the DRT. Initial testing was done using speaker-independent models, and they showed significantly lower scores than subjective scores. The acoustic models were then adapted to each of the speakers in the corpus, and then adapted to noise at a specified SNR. Three different types of noise were tested: white noise, multi-talker (babble) noise, and pseudo-speech noise. The match between subjective and estimated scores improved significantly with noise-adapted models compared to speaker-independent models and the speaker-adapted models, when the adapted noise level and the tested level match. However, when SNR conditions do not match, the recognition scores degraded especially when tested SNR conditions were higher than the adapted noise level. Accordingly, we adapted the models to mixed levels of noise, i.e., multi-condition training. The adapted models now showed relatively high intelligibility matching subjective intelligibility performance over all levels of noise. The correlation between subjective and estimated intelligibility scores increased to 0.94 with multi-talker noise, 0.93 with white noise, and 0.89 with pseudo-speech noise, while the root mean square error (RMSE) reduced from more than 40 to 13.10, 13.05 and 16.06, respectively.
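
The two agreement measures reported in this abstract, correlation and RMSE, can be sketched as follows. This is a minimal illustration assuming the standard Pearson correlation coefficient and root mean square error; the score lists are made-up placeholder values, not data from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def rmse(xs, ys):
    """Root mean square error between two equal-length score lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Hypothetical intelligibility scores (percent correct), for illustration only.
subjective = [95.0, 80.0, 60.0, 40.0]
estimated = [90.0, 78.0, 65.0, 35.0]
r = pearson(subjective, estimated)
err = rmse(subjective, estimated)
```

A high correlation with a large RMSE would indicate that the estimated scores track the subjective ones but are offset or scaled, which is why the abstract reports both.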

  • Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition

    Tetsuo KOSAKA  Yuui TAKEDA  Takashi ITO  Masaharu KATO  Masaki KOHDA  

     
    PAPER-Adaptation

      Vol:
    E93-D No:9
      Page(s):
    2363-2369

    In this paper, we propose a new speaker-class modeling method and its adaptation method for the LVCSR system, and evaluate the method on the Corpus of Spontaneous Japanese (CSJ). In this method, for each evaluation speaker, acoustically close speakers are selected from the training speakers and the acoustic models are trained on their utterances. One of the major issues of the speaker-class model is determining the selection range of speakers. To solve this problem, several models covering a variety of speaker ranges are prepared in advance for each evaluation speaker, and the most suitable model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, a significant improvement was obtained by using the proposed speaker adaptation based on speaker-class models compared with the conventional adaptation method.

  • A Method for Recognizing Noisy Romanized Japanese Words in Learner English

    Ryo NAGATA  Jun-ichi KAKEGAWA  Hiromi SUGIMOTO  Yukiko YABUTA  

     
    PAPER-Educational Technology

      Vol:
    E91-D No:10
      Page(s):
    2458-2466

    This paper describes a method for recognizing romanized Japanese words in learner English. Such words act as noise and cause problems in a variety of systems and tools for language learning and teaching, including text analysis, spell checking, and grammatical error detection, because they are Japanese words and thus mostly unknown to such systems and tools. A problem one encounters when recognizing romanized Japanese words in learner English is that the spelling rules of romanized Japanese words are often violated. To address this problem, the described method uses a clustering algorithm reinforced by a small set of rules. Experiments show that it achieves an F-measure of 0.879 and outperforms other methods. They also show that it only requires the target text and an English word list of reasonable size.

  • Dynamical Calling Behavior Experimentally Observed in Japanese Tree Frogs (Hyla japonica)

    Ikkyu AIHARA  Shunsuke HORAI  Hiroyuki KITAHATA  Kazuyuki AIHARA  Kenichi YOSHIKAWA  

     
    PAPER-Nonlinear Phenomena and Analysis

      Vol:
    E90-A No:10
      Page(s):
    2154-2161

    We recorded time series data of calls of Japanese tree frogs (Hyla japonica; Nihon-Ama-Gaeru) and examined the dynamics of the experimentally observed data not only through linear time series analysis such as power spectra but also through nonlinear time series analysis such as reconstruction of orbits with delay coordinates and different kinds of recurrence plots, namely the conventional recurrence plot (RP), the iso-directional recurrence plot (IDRP), and the iso-directional neighbors plot (IDNP). The results show that a single frog called nearly periodically, and a pair of frogs called nearly periodically but alternately in almost anti-phase synchronization with little overlap through mutual interaction. The fundamental frequency of the calls of a single frog during the interactive calling between two frogs was smaller than when the same frog first called alone. We also used the recurrence plots to study nonlinear and nonstationary determinism in the transition of the calling behavior. Moreover, we quantified the determinism of the nonlinear and nonstationary dynamics with indices of the ratio R of the number of points in IDNP to that in RP and the percentage PD of contiguous points forming diagonal lines in RP by the recurrence quantification analysis (RQA). Finally, we discuss a possibility of mathematical modeling of the calling behavior and a possible biological meaning of the call alternation.
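
The conventional recurrence plot (RP) with delay coordinates mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the embedding dimension, delay, threshold `eps`, the maximum norm, and the toy signal are all chosen here for demonstration and are not taken from the paper; the IDRP and IDNP variants are not covered.

```python
def delay_embed(series, dim, tau):
    """Delay-coordinate vectors (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    return [tuple(series[t + k * tau] for k in range(dim)) for t in range(n)]

def recurrence_plot(series, dim=2, tau=1, eps=0.1):
    """Binary matrix R[i][j] = 1 when embedded points i and j lie within eps."""
    points = delay_embed(series, dim, tau)

    def dist(a, b):
        # Maximum norm between two embedded points (an assumed choice).
        return max(abs(u - v) for u, v in zip(a, b))

    return [[1 if dist(p, q) <= eps else 0 for q in points] for p in points]

# Toy strictly periodic signal with period 4 (illustrative, not frog-call data).
signal = [0.0, 1.0, 0.0, -1.0] * 3
rp = recurrence_plot(signal, dim=2, tau=1, eps=0.01)
```

For a periodic signal such as this one, the recurrent points line up on diagonals offset by multiples of the period, which is the kind of structure that diagonal-line measures in recurrence quantification analysis summarize.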

  • A Statistical Model Based on the Three Head Words for Detecting Article Errors

    Ryo NAGATA  Tatsuya IGUCHI  Fumito MASUI  Atsuo KAWAI  Naoki ISU  

     
    PAPER-Educational Technology

      Vol:
    E88-D No:7
      Page(s):
    1700-1706

    In this paper, we propose a statistical model for detecting article errors, which Japanese learners of English often make in English writing. It is based on the three head words--the verb head, the preposition, and the noun head. To overcome the data sparseness problem, we apply the backed-off estimate to it. Experiments show that its performance (F-measure=0.70) is better than that of other methods. Apart from the performance, it has two advantages: (i) Rules for detecting article errors are automatically generated as conditional probabilities once a corpus is given; (ii) Its recall and precision rates are adjustable.
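
The idea of a backed-off estimate over the three head words can be sketched as follows. This is a minimal illustration, not the authors' model: the back-off order (full triple, then preposition and noun, then noun alone), the function names, and the tiny training set are all assumptions made for this example.

```python
from collections import Counter

def contexts(heads):
    """Contexts from most to least specific (assumed back-off order)."""
    verb, prep, noun = heads
    return ((verb, prep, noun), (prep, noun), (noun,))

def train(examples):
    """Count (context, article) pairs and context totals at every level."""
    counts = Counter()
    for heads, article in examples:
        for ctx in contexts(heads):
            counts[(ctx, article)] += 1
            counts[(ctx, None)] += 1  # total observations of this context
    return counts

def backed_off_prob(counts, heads, article):
    """P(article | heads), backing off to a shorter context when unseen."""
    for ctx in contexts(heads):
        total = counts[(ctx, None)]
        if total > 0:
            return counts[(ctx, article)] / total
    return 0.0  # noun never observed in training

# Hypothetical training data: ((verb, preposition, noun), article).
examples = [
    (("read", None, "book"), "a"),
    (("read", None, "book"), "the"),
    (("buy", None, "book"), "a"),
]
counts = train(examples)
p_seen = backed_off_prob(counts, ("read", None, "book"), "a")
p_backoff = backed_off_prob(counts, ("sell", None, "book"), "a")
```

Comparing such a probability against a threshold would be one way to trade recall against precision, in the spirit of advantage (ii); the thresholding scheme itself is not specified in the abstract.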

  • Gemination of Consonant in Spontaneous Speech: An Analysis of the "Corpus of Spontaneous Japanese"

    Masako FUJIMOTO  Takayuki KAGOMIYA  

     
    PAPER-Speech Corpora and Related Topics

      Vol:
    E88-D No:3
      Page(s):
    562-568

    In Japanese, there is frequent alternation between CV morae and moraic geminate consonants. In this study, we analyzed the phonemic environments of consonant gemination (CG) using the "Corpus of Spontaneous Japanese (CSJ)." The results revealed that the environment in which gemination occurs is, to some extent, parallel to that of vowel devoicing. However, there are two crucial differences. One difference is that the CG tends to occur in a /kVk/ environment, whereas such is not the case for vowel devoicing. The second difference is that when the preceding consonant is /r/, gemination occurs, but not vowel devoicing. These observations suggest that the mechanism leading to CG differs from that which leads to vowel devoicing.

  • Robust Dependency Parsing of Spontaneous Japanese Spoken Language

    Tomohiro OHNO  Shigeki MATSUBARA  Nobuo KAWAGUCHI  Yasuyoshi INAGAKI  

     
    PAPER-Speech Corpora and Related Topics

      Vol:
    E88-D No:3
      Page(s):
    545-552

    Spontaneously spoken Japanese includes a lot of grammatically ill-formed linguistic phenomena such as fillers, hesitations, inversions, and so on, which do not appear in written language. This paper proposes a novel method of robust dependency parsing using a large-scale spoken language corpus, and evaluates the availability and robustness of the method using spontaneously spoken dialogue sentences. By utilizing stochastic information about the appearance of ill-formed phenomena, the method can robustly parse spoken Japanese including fillers, inversions, or dependencies over utterance units. Experimental results reveal that the parsing accuracy reached 87.0%, and we confirmed that it is effective to utilize the location information of a bunsetsu, and the distance information between bunsetsus as stochastic information.

  • Information Extraction and Summarization for Newspaper Articles on Sassho-jiken

    Teiji FURUGORI  Rihua LIN  Takeshi ITO  Dongli HAN  

     
    PAPER

      Vol:
    E86-D No:9
      Page(s):
    1728-1735

    Described here is an automatic text summarization system for Japanese newspaper articles on sassho-jiken (murder and bodily harm cases). We extract pieces of information from a text, interconnect them to represent the scenes and participants involved in the sassho-jiken, and finally produce a summary by generating sentences from the extracted information. An experiment and its evaluation show that, although the domain is limited, our method works well in depicting important information from the newspaper articles, and the summaries produced are better in adequacy and readability than those obtained by extracting sentences.

  • Incremental Transfer in English-Japanese Machine Translation

    Shigeki MATSUBARA  Yasuyoshi INAGAKI  

     
    PAPER-Artificial Intelligence and Cognitive Science

      Vol:
    E80-D No:11
      Page(s):
    1122-1130

    Since spontaneously spoken language expressions appear continuously, the transfer stage of a spoken language machine translation system has to work incrementally. In such a system, a high degree of incrementality is required even more strongly than high quality. This paper proposes an incremental machine translation system, which translates English spoken words into Japanese in the order in which they appear. The system is composed of three modules: incremental parsing, transfer, and generation, which work synchronously. The transfer module utilizes some features and phenomena characterizing spoken Japanese: flexible word order, ellipses, repetitions, and so forth. This is influenced by the observation that such characteristics frequently appear in Japanese uttered by English-Japanese interpreters. Their frequent utilization is the key to the success of highly incremental translation between English and Japanese, which have different word orders. We have implemented a prototype system, Sync/Trans, which parses English dialogues incrementally and generates Japanese immediately. To evaluate Sync/Trans, we conducted an experiment with conversations consisting of 27 dialogues and 218 sentences. 190 of the sentences were translated correctly, giving a success rate of 87.2%. This result shows our incremental method to be a promising technique for spoken language translation with acceptable accuracy and high real-time performance.
