Author Search Result

[Author] Masafumi NISHIDA (2 hits)

  • Phoneme Set Design for Speech Recognition of English by Japanese

    Xiaoyun WANG  Jinsong ZHANG  Masafumi NISHIDA  Seiichi YAMAMOTO  

     
    PAPER-Speech and Hearing

    Publicized: 2014/10/01
    Vol: E98-D No: 1
    Page(s): 148-156

    This paper describes a novel method to improve the performance of second language speech recognition when the mother tongue of users is known. Considering that second language speech usually includes less fluent pronunciation and more frequent pronunciation mistakes, the authors propose using a reduced phoneme set generated by a phonetic decision tree (PDT)-based top-down sequential splitting method instead of the canonical one of the second language. The authors verify the efficacy of the proposed method using second language speech collected with a translation game type dialogue-based English CALL system. Experiments show that a speech recognizer achieved higher recognition accuracy with the reduced phoneme set than with the canonical phoneme set.

  • Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals

    Tomoki HAYASHI  Masafumi NISHIDA  Norihide KITAOKA  Tomoki TODA  Kazuya TAKEDA  

     
    PAPER-Engineering Acoustics

    Vol: E101-A No: 1
    Page(s): 199-210

    In this study, toward the development of a smartphone-based monitoring system for life logging, we collect over 1,400 hours of data by recording both the outdoor and indoor daily activities of 19 subjects under practical conditions with a smartphone and a small camera. We then construct a large human activity database consisting of environmental sound signals, triaxial acceleration signals, and manually annotated activity tags. Using this database, we evaluate the activity recognition performance of deep neural networks (DNNs), which have achieved strong performance in various fields, and apply DNN-based adaptation techniques to improve performance with only a small amount of subject-specific training data. We experimentally demonstrate that: 1) using multi-modal signals, including environmental sound and triaxial acceleration signals, with a DNN is effective for improving activity recognition performance; 2) the DNN can discriminate specified activities from a mixture of ambiguous activities; and 3) DNN-based adaptation methods are effective even if only a small amount of subject-specific training data is available.