The search functionality is under construction.

Keyword Search Result

[Keyword] Japan (36 hits)

1-20 of 36 hits

  • Sense-Aware Decoder for Character Based Japanese-Chinese NMT Open Access

    Zezhong LI  Fuji REN  

     
    LETTER-Natural Language Processing

      Publicized:
    2023/12/11
      Vol:
    E107-D No:4
      Page(s):
    584-587

    Compared to subword-based Neural Machine Translation (NMT), character-based NMT eschews linguistically motivated segmentation and operates directly on the raw character sequence, following a more thoroughly end-to-end approach. This property is especially attractive for machine translation (MT) between Japanese and Chinese, both of which use consecutive logographic characters without explicit word boundaries. However, one disadvantage remains to be addressed: a character is a less meaning-bearing unit than a subword, which requires character models to be capable of sense discrimination. Specifically, two types of sense ambiguity exist, one in the source language and one in the target language. The former has been partially solved by deep encoders and several existing works. The latter, interestingly, is rarely discussed. To address this problem, we propose two simple yet effective methods: a non-parametric pre-clustering for sense induction, and a joint model that performs sense discrimination and NMT training simultaneously. Extensive experiments on Japanese⟷Chinese MT show that our proposed methods consistently outperform strong baselines, and verify the effectiveness of using sense-discriminated representations for character-based NMT.

  • A Simple and Interactive System for Modeling Typical Japanese Castles

    Shogo UMEYAMA  Yoshinori DOBASHI  

     
    LETTER-Computer Graphics

      Publicized:
    2022/11/08
      Vol:
    E106-D No:2
      Page(s):
    267-270

    We present an interactive modeling system for Japanese castles. We develop a user interface that can generate the fundamental structure of the castle tower consisting of stone walls, turrets, and roofs. When the user clicks with the mouse on the screen displaying the 3D space, relevant parameters are calculated automatically to generate 3D models of Japanese-style castles. We use characteristic curves that often appear in ancient Japanese architecture for the realistic modeling of the castles. We evaluate the effectiveness of our method by comparing the castle generated by our method with a commercially available 3D model of a castle.

  • Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition

    Fuma HORIE  Hideaki GOTO  Takuo SUGANUMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/08/24
      Vol:
    E104-D No:11
      Page(s):
    2002-2010

    Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.

  • Prosody Correction Preserving Speaker Individuality for Chinese-Accented Japanese HMM-Based Text-to-Speech Synthesis Open Access

    Daiki SEKIZAWA  Shinnosuke TAKAMICHI  Hiroshi SARUWATARI  

     
    LETTER-Speech and Hearing

      Publicized:
    2019/03/11
      Vol:
    E102-D No:6
      Page(s):
    1218-1221

    This article proposes a prosody correction method based on partial model adaptation for Chinese-accented Japanese hidden Markov model (HMM)-based text-to-speech synthesis. Although text-to-speech synthesis built from non-native speech accurately reproduces the speaker's individuality in synthetic speech, the naturalness of the synthetic speech is strongly degraded. In the proposed model, to improve the naturalness while preserving the speaker individuality of Chinese-accented Japanese text-to-speech synthesis, we partially utilize HMM parameters of native Japanese speech to synthesize prosody-corrected synthetic speech. Results of an experimental evaluation demonstrate that duration and F0 correction are significantly effective for improving naturalness.

  • An Approach for Chinese-Japanese Named Entity Equivalents Extraction Using Inductive Learning and Hanzi-Kanji Mapping Table

    JinAn XU  Yufeng CHEN  Kuang RU  Yujie ZHANG  Kenji ARAKI  

     
    PAPER-Natural Language Processing

      Publicized:
    2017/05/02
      Vol:
    E100-D No:8
      Page(s):
    1882-1892

    Named Entity Translation Equivalents extraction plays a critical role in machine translation (MT) and cross language information retrieval (CLIR). Traditional methods are often based on large-scale parallel or comparable corpora. However, the applicability of these studies is constrained, mainly because of the scarcity of parallel corpora of the required scale, especially for the Chinese-Japanese language pair. In this paper, we propose a method that considers the characteristics of Chinese and Japanese to automatically extract Chinese-Japanese Named Entity (NE) translation equivalents from monolingual corpora based on inductive learning (IL). The method adopts the Chinese Hanzi and Japanese Kanji Mapping Table (HKMT) to calculate the similarity of the NE instances between Japanese and Chinese. Then, we use IL to obtain partial translation rules for NEs by extracting the differing parts from high-similarity NE instances in Chinese and Japanese. Finally, feedback processing updates the Chinese and Japanese NE similarity and rule sets. Experimental results show that our simple, efficient method overcomes the insufficiency of traditional methods, which depend heavily on bilingual resources. Compared with other methods, our method combines the language features of Chinese and Japanese with IL for automatically extracting NE pairs. Our use of weakly correlated bilingual text sets and minimal additional knowledge to extract NE pairs effectively reduces the cost of building the corpus and the need for additional knowledge. Our method may help to build a large-scale Chinese-Japanese NE translation dictionary using monolingual corpora.

  • Accent Sandhi Estimation of Tokyo Dialect of Japanese Using Conditional Random Fields Open Access

    Masayuki SUZUKI  Ryo KUROIWA  Keisuke INNAMI  Shumpei KOBAYASHI  Shinya SHIMIZU  Nobuaki MINEMATSU  Keikichi HIROSE  

     
    INVITED PAPER

      Publicized:
    2016/12/08
      Vol:
    E100-D No:4
      Page(s):
    655-661

    When synthesizing speech from Japanese text, correct assignment of accent nuclei for input text with arbitrary contents is indispensable in obtaining natural-sounding synthetic speech. A phenomenon called accent sandhi occurs in utterances of Japanese; when a word is uttered in a sentence, its accent nucleus may change depending on the contexts of preceding/succeeding words. This paper describes a statistical method for automatically predicting the accent nucleus changes due to accent sandhi. First, as the basis of the research, a database of Japanese text was constructed with labels of accent phrase boundaries and accent nucleus positions when uttered in sentences. A single native speaker of Tokyo dialect Japanese annotated all the labels for 6,344 Japanese sentences. Then, using this database, a conditional-random-field-based method was developed to predict accent phrase boundaries and accent nuclei. The proposed method predicted accent nucleus positions for accent phrases with 94.66% accuracy, clearly surpassing the 87.48% accuracy obtained using our rule-based method. A listening experiment was also conducted on synthetic speech obtained using the proposed method and that obtained using the rule-based method. The results show that our method significantly improved the naturalness of synthetic speech.

  • Development and Evaluation of Online Infrastructure to Aid Teaching and Learning of Japanese Prosody Open Access

    Nobuaki MINEMATSU  Ibuki NAKAMURA  Masayuki SUZUKI  Hiroko HIRANO  Chieko NAKAGAWA  Noriko NAKAMURA  Yukinori TAGAWA  Keikichi HIROSE  Hiroya HASHIMOTO  

     
    INVITED PAPER

      Publicized:
    2016/12/22
      Vol:
    E100-D No:4
      Page(s):
    662-669

    This paper develops an online and freely available framework to aid teaching and learning the prosodic control of Tokyo Japanese: how to generate its adequate word accent and phrase intonation. This framework is called OJAD (Online Japanese Accent Dictionary) [1] and it provides three features. 1) Visual, auditory, systematic, and comprehensive illustration of patterns of accent change (accent sandhi) of verbs and adjectives. Here only the changes caused by twelve fundamental conjugations are focused upon. 2) Visual illustration of the accent pattern of a given verbal expression, which is a combination of a verb and its postpositional auxiliary words. 3) Visual illustration of the pitch pattern of any given sentence and the expected positions of accent nuclei in the sentence. The third feature is technically implemented by using an accent change prediction module that we developed for Japanese Text-To-Speech (TTS) synthesis [2],[3]. Experiments show that accent nucleus assignment to given texts by the proposed framework is much more accurate than that by native speakers. Subjective assessment and objective assessment done by teachers and learners show extremely high pedagogical effectiveness of the developed framework.

  • Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics

    Yuji OSHIMA  Shinnosuke TAKAMICHI  Tomoki TODA  Graham NEUBIG  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2016/08/30
      Vol:
    E99-D No:12
      Page(s):
    3132-3139

    This paper presents a novel non-native speech synthesis technique that preserves the individuality of a non-native speaker. Cross-lingual speech synthesis based on voice conversion or Hidden Markov Model (HMM)-based speech synthesis is a technique to synthesize foreign language speech using a target speaker's natural speech uttered in his/her mother tongue. Although the technique holds promise to improve a wide variety of applications, it tends to degrade the target speaker's individuality in synthetic speech compared to intra-lingual speech synthesis. This paper proposes a new approach to speech synthesis that preserves speaker individuality by using non-native speech spoken by the target speaker. Although the use of non-native speech makes it possible to preserve the speaker individuality in the synthesized target speech, naturalness is significantly degraded because the synthesized speech waveform is directly affected by unnatural prosody and pronunciation, which are often caused by differences in the linguistic systems of the source and target languages. To improve naturalness while preserving speaker individuality, we propose (1) a prosody correction method based on model adaptation, and (2) a phonetic correction method based on spectrum replacement for unvoiced consonants. The experimental results using English speech uttered by native Japanese speakers demonstrate that (1) the proposed methods are capable of significantly improving naturalness while preserving the speaker individuality in synthetic speech, and (2) the proposed methods also improve intelligibility as confirmed by a dictation test.

  • Business Recovery Conditions of Private Enterprises after the 2011 Great East Japan Earthquake and Issues on Business Continuity Measures for Large-Scale Disaster Management — A Case Study of Small and Medium-Sized Enterprises in Miyagi —

    Norimasa NAKATANI  Osamu MURAO  Kimiro MEGURO  Kiyomine TERUMOTO  

     
    PAPER

      Vol:
    E99-A No:8
      Page(s):
    1539-1550

    Since the 2011 Great East Japan Earthquake, private enterprises have increasingly recognized Business Continuity Planning (BCP) as a significant countermeasure against future large-scale disasters. Based on a questionnaire survey, this paper reports business recovery conditions of private enterprises in Miyagi Prefecture affected by the disaster. Analysis of the questionnaire results suggests some important points: (1) estimation of long-term internal/external factors that influence business continuity, (2) development of a concrete pre-disaster framework, (3) multi-media-based advertising strategy, and (4) re-allocation of resources.

  • Maintenance of Communication Carrier Networks against Large-Scale Earthquakes

    Yoshikazu TAKAHASHI  Daisuke SATOH  

     
    INVITED PAPER

      Vol:
    E98-A No:8
      Page(s):
    1602-1609

    The network operations center of a communication carrier plays an important and critical role in the early stage of disaster response because its function is the maintenance of communication services, which includes traffic control and restoration of services. This paper describes traffic control and restoration of services affected by the Great East Japan Earthquake. It also discusses challenges in handling traffic congestion and service restoration in future large-scale disasters.

  • Development of Wireless Systems for Disaster Recovery Operations Open Access

    Takashi HIROSE  Fusao NUNO  Masashi NAKATSUGAWA  

     
    INVITED PAPER

      Vol:
    E98-C No:7
      Page(s):
    630-635

    This paper presents wireless systems for use in disaster recovery operations. The Great East Japan Earthquake of March 11, 2011 reinforced the importance of communications in, to, and between disaster areas as lifelines. It also revealed that conventional wireless systems used for disaster recovery need to be renovated to cope with technological changes and to provide their services with easier operations. To address this need we have developed new systems, which include a relay wireless system, subscriber wireless systems, business radio systems, and satellite communication systems. They will be chosen and used depending on the situations in disaster areas as well as on the required services.

  • The Challenge of Collaboration among Academies and Asia Pacific for ITS R&D

    Hiroshi MAKINO  Shunsuke KAMIJO  

     
    INVITED PAPER

      Vol:
    E98-A No:1
      Page(s):
    259-266

    ITS R&D spans a wide variety of research areas such as mechanical engineering, road engineering, traffic engineering, information and communication engineering, and electrical engineering. Although initiatives spanning these engineering fields are essential to solving the problems of practical social systems, collaboration among the fields is difficult. Based on the joint research of the Japan Society of Civil Engineers and the Institute of Electrical Engineers conducted at the time of the Great East Japan Earthquake, this paper discusses the necessity of collaboration among academies on ITS R&D. International collaboration is also important for ITS R&D. Asian countries could share the same problems and solutions, since many megacities are located in the Asian region and suffer from heavy traffic. Therefore, we need to discuss common solutions to our problems.

  • Katakana EdgeWrite: An EdgeWrite Version for Japanese Text Entry

    Kentaro GO  Yuichiro KINOSHITA  

     
    LETTER-Interaction

      Vol:
    E97-D No:8
      Page(s):
    2053-2054

    This paper presents our project of designing EdgeWrite text entry methods for the Japanese language. We are developing a version of the EdgeWrite text entry method for Japanese: Katakana EdgeWrite. Katakana EdgeWrite specifies the line stroke directions and writing order of each Japanese Katakana character. The ideal EdgeWrite corner sequence pattern for each Katakana character is designed based on its line stroke directions and writing order.

  • Application of a Telemedical Tool in an Isolated Island and a Disaster Area of the Great East Japan Earthquake Open Access

    Makoto YOSHIZAWA  Tomoyuki YAMBE  Norihiro SUGITA  Satoshi KONNO  Makoto ABE  Noriyasu HOMMA  Futoshi TAKEI  Katsuhiko YOKOTA  Yoshifumi SAIJO  Shin-ichi NITTA  

     
    INVITED PAPER

      Vol:
    E95-B No:10
      Page(s):
    3067-3073

    This paper reports a case study of the “Electronic Doctor's Bag”, a telemedical tool for home-visit medical services using the mobile communications environment, on an isolated island and in a disaster area hit by the tsunami. Clinical trials performed for 20 patients around a clinic on Miyako Island indicated that the communication functions of the proposed system were highly rated by patients as well as medical staff. However, the system still has room for further improvement in operability, portability, and the mobile communication environment. The experience at the shelter in Kesennuma City suggested that mobile healthcare tools such as the proposed system will be strongly needed when emergency medical staff have left and no medical staff, or only paramedical staff, remain.

  • On Tackling Flash Crowds with URL Shorteners and Examining User Behavior after Great East Japan Earthquake

    Takeru INOUE  Shin-ichi MINATO  

     
    PAPER

      Vol:
    E95-B No:7
      Page(s):
    2210-2221

    Several web sites providing disaster-related information failed repeatedly after the Great East Japan Earthquake, due to flash crowds caused by Twitter users. Twitter, which was intensively used for information sharing in the aftermath of the earthquake, relies on URL shorteners like bit.ly to offset its strict limit on message length. In order to mitigate the flash crowds, we examine current Web usage and find that URL shorteners constitute a layer of indirection; a significant part of Web traffic is guided by them. This implies that flash crowds can be controlled by URL shorteners. We developed a new URL shortener, named rcdn.info, just after the earthquake; rcdn.info redirects users to a replica created on a CoralCDN if the original site is likely to become overloaded. This surprisingly simple solution worked very well in the emergency. We also conduct a thorough analysis of the request log and present several views that capture user behavior in the emergency from various aspects. Interestingly, traffic grew significantly at previously unpopular (i.e., small) sites during the disaster; this traffic shift could lead to the failure of several sites. Finally, we show that rcdn.info has great potential in mitigating such failures. We believe that our experience will help the research community tackle future disasters.

  • A Study on Pitch Patterns in Japanese Speakers of English with Verification by Speech Re-Synthesis

    Tomoko NARIAI  Kazuyo TANAKA  

     
    PAPER-Speech and Hearing

      Vol:
    E94-D No:12
      Page(s):
    2495-2502

    Certain irregularities in the utterances of words or phrases often occur in English spoken by native Japanese speakers, referred to in this article as Japanese English. Japanese English is linguistically presumed to reflect the phonetic characteristics of Japanese. We consider prosodic feature patterns to be one of the most common causes of irregularities in Japanese English, and expect that Japanese English would have better prosodic patterns if its particular characteristics were modified. This study investigates prosodic differences between Japanese English and English speakers' English, and shows the quantitative results of a statistical analysis of pitch. The analysis leads to rules that show how to modify Japanese English to have pitch patterns closer to those of English speakers. On the basis of these rules, the pitch patterns of test speech samples of Japanese English are modified, and then re-synthesized. The modified speech is evaluated in a listening experiment by native English subjects. The result of the experiment shows that, on average, more than three times as many English subjects supported the proposed modification as supported the original speech. The results of the experiments therefore provide practical verification of the validity of the rules. Additionally, the results suggest that irregularities of prominence lie in Japanese English sentences. This can be explained by the transfer of first language prosodic characteristics to second language prosodic patterns.

  • Evaluation of SAR and Temperature Elevation Using Japanese Anatomical Human Models for Body-Worn Devices

    Teruo ONISHI  Takahiro IYAMA  Lira HAMADA  Soichi WATANABE  Akimasa HIRATA  

     
    LETTER-Electromagnetic Compatibility(EMC)

      Vol:
    E93-B No:12
      Page(s):
    3643-3646

    This paper investigates the relationship between averaged SAR (Specific Absorption Rate) over 10 g mass and temperature elevation in Japanese numerical anatomical models when devices are mounted on the body. Simplifying the radiation source as a half-wavelength dipole, the generated electrical field and SAR are calculated using the FDTD (Finite-Difference Time-Domain) method. Then the bio-heat equation is solved to obtain the temperature elevation, with the SAR derived using the FDTD method as the heat source. Frequencies used in the study are 900 MHz and 1950 MHz, which are used for mobile phones. In addition, 3500 MHz is considered because this frequency is reserved for IMT-Advanced (International Mobile Telecommunication-Advanced System). Computational results obtained herein show that the 10 g-average SAR and the temperature elevation are not proportional to frequency. In addition, it is clear that those at 3500 MHz are lower than those at 1950 MHz even though the frequency is higher. The point to be stressed here is that a good correlation between the 10 g-average SAR and the temperature elevation is observed even for the body-worn device.

  • A Survey of the Origins and Evolution of the Microwave Circuit Devices in Japan from the 1920s up until 1945

    Tosiro KOGA  

     
    INVITED SURVEY PAPER

      Vol:
    E93-A No:12
      Page(s):
    2354-2370

    In this paper, we edit several archives on research and development in the field of microwave circuit technology in Japan, which originated with the invention of the Yagi-Uda antenna in 1925, together with generally unknown historical topics in the period from the 1920s up until the end of World War II. As the main subject, we investigate the origin and evolution of the Multiply Split-Anode Magnetron, and clarify that the basic magnetron technology had been established by 1939 under the direction of Yoji Ito through cooperation between expert engineers of the Naval Technical Institute (NTI) and the Nihon Musen Co., while the Cavity Magnetron was invented by Shigeru Nakajima of the Nihon Musen Co. in May 1939, and further that the physical theory of the Multiply Split-Anode Cavity Magnetron Oscillation and the design theory of the Cavity Magnetron were established in collaboration between world-known physicists and the expert engineers at the NTI Shimada Laboratory during the war. In addition, we clarify that Sin-itiro Tomonaga presented the Scattering Matrix representation of Microwave Circuits, among other topics. The development mentioned above was carried out, in strict secrecy, in an unusual wartime situation up until 1945.

  • Estimation of Speech Intelligibility Using Speech Recognition Systems

    Yusuke TAKANO  Kazuhiro KONDO  

     
    PAPER-Speech and Hearing

      Vol:
    E93-D No:12
      Page(s):
    3368-3376

    We attempted to estimate subjective scores of the Japanese Diagnostic Rhyme Test (DRT), a two-to-one forced selection speech intelligibility test. We used automatic speech recognizers with language models that force one of the words in the word-pair, mimicking the human recognition process of the DRT. Initial testing was done using speaker-independent models, and they showed significantly lower scores than subjective scores. The acoustic models were then adapted to each of the speakers in the corpus, and then adapted to noise at a specified SNR. Three different types of noise were tested: white noise, multi-talker (babble) noise, and pseudo-speech noise. The match between subjective and estimated scores improved significantly with noise-adapted models compared to speaker-independent models and the speaker-adapted models, when the adapted noise level and the tested level match. However, when SNR conditions do not match, the recognition scores degraded especially when tested SNR conditions were higher than the adapted noise level. Accordingly, we adapted the models to mixed levels of noise, i.e., multi-condition training. The adapted models now showed relatively high intelligibility matching subjective intelligibility performance over all levels of noise. The correlation between subjective and estimated intelligibility scores increased to 0.94 with multi-talker noise, 0.93 with white noise, and 0.89 with pseudo-speech noise, while the root mean square error (RMSE) reduced from more than 40 to 13.10, 13.05 and 16.06, respectively.

  • Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition

    Tetsuo KOSAKA  Yuui TAKEDA  Takashi ITO  Masaharu KATO  Masaki KOHDA  

     
    PAPER-Adaptation

      Vol:
    E93-D No:9
      Page(s):
    2363-2369

    In this paper, we propose a new speaker-class modeling method and its adaptation method for the LVCSR system and evaluate the method on the Corpus of Spontaneous Japanese (CSJ). In this method, for each evaluation speaker, closer speakers are selected from the training speakers and the acoustic models are trained using their utterances. One of the major issues of the speaker-class model is determining the selection range of speakers. In order to solve the problem, several models with various speaker ranges are prepared for each evaluation speaker in advance, and the most suitable model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, a significant improvement was obtained by using the proposed speaker adaptation based on speaker-class models compared with the conventional adaptation method.
