IEICE global.ieice.org Site

Author Search Result

[Author] Masafumi NISHIDA(2hit)

Phoneme Set Design for Speech Recognition of English by Japanese
Xiaoyun WANG Jinsong ZHANG Masafumi NISHIDA Seiichi YAMAMOTO

PAPER-Speech and Hearing

Publicized: 2014/10/01
Vol: E98-D No: 1
Page(s): 148-156
This paper describes a novel method to improve the performance of second language speech recognition when the mother tongue of users is known. Considering that second language speech usually includes less fluent pronunciation and more frequent pronunciation mistakes, the authors propose using a reduced phoneme set, generated by a phonetic decision tree (PDT)-based top-down sequential splitting method, instead of the canonical phoneme set of the second language. The authors verify the efficacy of the proposed method using second language speech collected with a translation-game-type, dialogue-based English CALL system. Experiments show that a speech recognizer achieved higher recognition accuracy with the reduced phoneme set than with the canonical phoneme set.

Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals
Tomoki HAYASHI Masafumi NISHIDA Norihide KITAOKA Tomoki TODA Kazuya TAKEDA

PAPER-Engineering Acoustics

Vol: E101-A No: 1
Page(s): 199-210
In this study, toward the development of a smartphone-based monitoring system for life logging, we collect over 1,400 hours of data by recording both the outdoor and indoor daily activities of 19 subjects under practical conditions with a smartphone and a small camera. We then construct a large human activity database consisting of an environmental sound signal, triaxial acceleration signals, and manually annotated activity tags. Using this database, we evaluate the activity recognition performance of deep neural networks (DNNs), which have achieved strong performance in various fields, and apply DNN-based adaptation techniques to improve performance with only a small amount of subject-specific training data. We experimentally demonstrate that: 1) using multi-modal signals, including environmental sound and triaxial acceleration signals, with a DNN is effective for improving activity recognition performance; 2) the DNN can discriminate specified activities from a mixture of ambiguous activities; and 3) DNN-based adaptation methods are effective even when only a small amount of subject-specific training data is available.
