IEICE global.ieice.org Site

Author Search Result

[Author] Makoto OTA (5 hits)


Sound Image Localization Using Dynamic Transaural Reproduction with Non-contact Head Tracking
Hiroaki KURABAYASHI Makoto OTANI Kazunori ITOH Masami HASHIMOTO Mizue KAYAMA

PAPER

Vol: E97-A No: 9 Page(s): 1849-1858
Binaural reproduction is one of the promising approaches to presenting a highly realistic virtual auditory space to a listener. Generally, binaural signals are reproduced using a set of headphones, which leads to a simple implementation of such a system. Alternatively, binaural signals can be presented to a listener using a technique called “transaural reproduction”, which employs a few loudspeakers with crosstalk cancellation to compensate for the acoustic transmissions from the loudspeakers to both ears of the listener. The major advantage of transaural reproduction is that a listener is able to experience binaural reproduction without wearing any device. This leads to a more natural listening environment. However, in transaural reproduction, the listener is required to stay still within a very narrow sweet spot because the crosstalk canceller is very sensitive to the listener's head position and orientation. To solve this problem, dynamic transaural systems have been developed by utilizing contact-type head tracking. This paper introduces the development of a dynamic transaural system with non-contact head tracking, which frees the listener from any attachment, thereby preserving the advantage of transaural reproduction. Experimental results revealed that sound images presented in the horizontal and median planes were localized more accurately when the system tracked the listener's head rotation than when the listeners did not rotate their heads or when the system did not track the listener's head rotation. These results demonstrate that the system works effectively and correctly with the listener's head rotation.
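The sensitivity of the crosstalk canceller to head position can be illustrated with a minimal sketch at a single frequency. It is not the authors' implementation: the two-loudspeaker layout, the `path` model (a plain gain and delay per acoustic path), and all numeric values are assumptions chosen for illustration. The canceller is taken to be the inverse of the 2x2 loudspeaker-to-ear transfer matrix, which is one common textbook formulation.

```python
import cmath

def path(gain, delay_s, freq_hz):
    """Hypothetical acoustic path: a gain and a pure delay at one frequency."""
    return gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)

def invert_2x2(m):
    """Invert a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

FREQ_HZ = 1000.0  # assumed analysis frequency

# Rows are ears (left, right); columns are loudspeakers (left, right).
# Assumed values for a listener centred between the loudspeakers.
H_centre = [[path(1.0, 0.0030, FREQ_HZ), path(0.7, 0.0032, FREQ_HZ)],
            [path(0.7, 0.0032, FREQ_HZ), path(1.0, 0.0030, FREQ_HZ)]]

# Assumed values after a small head rotation: path delays change slightly.
H_rotated = [[path(1.0, 0.0029, FREQ_HZ), path(0.7, 0.0033, FREQ_HZ)],
             [path(0.7, 0.0031, FREQ_HZ), path(1.0, 0.0031, FREQ_HZ)]]

binaural = [1.0 + 0j, 0.0 + 0j]  # signal intended for the left ear only

# Static system: canceller designed once for the centred head.
feeds_static = apply(invert_2x2(H_centre), binaural)
ears_centre = apply(H_centre, feeds_static)          # head where expected
ears_rotated_static = apply(H_rotated, feeds_static) # head moved, stale canceller

# Dynamic system: canceller recomputed for the tracked head pose.
feeds_tracked = apply(invert_2x2(H_rotated), binaural)
ears_rotated_tracked = apply(H_rotated, feeds_tracked)

leak_static = abs(ears_rotated_static[1])    # unwanted signal at the right ear
leak_tracked = abs(ears_rotated_tracked[1])
print(f"right-ear leakage, stale canceller:   {leak_static:.3f}")
print(f"right-ear leakage, tracked canceller: {leak_tracked:.3e}")
```

With these assumed values, the stale canceller leaves substantial leakage at the right ear after the head moves, while recomputing the inverse for the tracked pose restores the intended ear signals. A real system works over the full frequency range with measured or modelled head-related transfer functions and usually regularizes the inversion.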
Auditory Artifacts due to Switching Head-Related Transfer Functions of a Dynamic Virtual Auditory Display
Makoto OTANI Tatsuya HIRAHARA

PAPER

Vol: E91-A No: 6 Page(s): 1320-1328
Auditory artifacts due to switching head-related transfer functions (HRTFs) are investigated, using a software-implemented dynamic virtual auditory display (DVAD) developed by the authors. The DVAD responds to a listener's head rotation using a head-tracking device and switching HRTFs to present a highly realistic 3D virtual auditory space to the listener. The DVAD operates on Windows XP and does not require high-performance computers. The total system latency (TSL), which is the delay between head motion and the corresponding change of the ear input signal, is a significant factor in DVADs. The measured TSL of our DVAD is about 50 ms, which is sufficient for practical applications and localization experiments. Another matter of concern is the auditory artifact in DVADs caused by switching HRTFs. Switching HRTFs gives rise to wave discontinuity in the synthesized binaural signals, which can be perceived as click noises that degrade the quality of the presented sound image. A subjective test and an excitation pattern (EPN) analysis using an auditory filter are performed with various source signals and HRTF spatial resolutions. The results of the subjective test reveal that click noise perception depends on the source signal and the HRTF spatial resolution. Furthermore, the EPN analysis reveals that switching HRTFs significantly distorts the EPNs at off-signal frequencies. Such distortions, however, are masked perceptually by broad-bandwidth source signals, whereas they are not masked by narrow-bandwidth source signals, thereby making the click noise more detectable. A higher HRTF spatial resolution leads to smaller distortions. However, depending on the source signal, perceivable click noises remain even with a 0.5-degree spatial resolution, which is finer than the minimum audible angle (1 degree in front).
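The wave discontinuity caused by switching HRTFs can be sketched with a toy model. This is not the authors' DVAD: each "HRTF" is reduced to a hypothetical (gain, delay) pair, the source is a 500 Hz tone, and the sample rate, switching instant, and filter values are all assumed for illustration. The sketch compares the sample-to-sample step at the switching instant with the largest step that occurs in the unswitched tone.

```python
import math

FS = 48000        # assumed sample rate in Hz
TONE_HZ = 500.0   # assumed narrow-band source signal

def tone(n):
    """Source sample n of a unit-amplitude sine (silent before n = 0)."""
    return math.sin(2 * math.pi * TONE_HZ * n / FS) if n >= 0 else 0.0

def render(hrtf, n):
    """Output sample n for a toy 'HRTF' given as (gain, delay_in_samples)."""
    gain, delay = hrtf
    return gain * tone(n - delay)

def step_at_switch(before, after, n_switch):
    """Size of the jump when the filter changes from `before` to `after`."""
    return abs(render(after, n_switch) - render(before, n_switch - 1))

# Largest step between neighbouring samples of the unswitched tone.
period = int(FS / TONE_HZ)
normal_step = max(abs(tone(n + 1) - tone(n)) for n in range(period))

reference = (1.0, 0)     # toy HRTF for the current direction
coarse = (0.8, 10)       # toy HRTF for a distant neighbouring direction
fine = (0.98, 1)         # toy HRTF for a close neighbouring direction

N_SWITCH = 24            # assumed switching instant, near a peak of the tone
coarse_jump = step_at_switch(reference, coarse, N_SWITCH)
fine_jump = step_at_switch(reference, fine, N_SWITCH)

print(f"largest step without switching: {normal_step:.4f}")
print(f"step when switching (coarse):   {coarse_jump:.4f}")
print(f"step when switching (fine):     {fine_jump:.4f}")
```

In this toy setting the coarse switch produces a jump several times larger than any step in the continuous tone, and the finer switch produces a smaller jump, which is consistent with the trend the abstract reports for higher spatial resolution. Whether a given jump is audible depends on masking by the source signal, which this sketch does not model.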
Improvement of Active Net Model for Region Detection in an Image
Noboru YABUKI Yoshitaka MATSUDA Makoto OTA Yasuaki SUMI Yutaka FUKUI Shigehiko MIKI

PAPER

Vol: E84-A No: 3 Page(s): 720-726
Processes in image recognition include target detection and shape extraction. Active Net has been proposed as one of the methods for such processing. It treats target detection in an image as an energy optimization problem. In this paper, a problem with the conventional Active Net is presented and a new Active Net is proposed. The new net has an improved ability to detect a target. Finally, the validity of the proposed net is confirmed by experimental results.
The Automatic Counting of Chlorella Using Image Processing and Neural Network
Yasuaki SUMI Makoto OTA Noboru YABUKI Shigeki OBOTE Yoshitaka MATSUDA Yutaka FUKUI

LETTER

Vol: E84-A No: 3 Page(s): 794-796
In the culture of marine chlorellas, it is necessary to count the cells in order to monitor their growth. For that purpose, counting by eye under a microscope has been used. However, this method requires a lot of time and work. We have developed an automatic chlorella counter using image processing and a neural network. Its effectiveness is confirmed through an experiment.
Numerical Simulation of Air Flow through Glottis during Very Weak Whisper Sound Production
Makoto OTANI Tatsuya HIRAHARA

PAPER-Speech and Hearing

Vol: E94-A No: 9 Page(s): 1779-1785
A non-audible murmur (NAM), a very weak whisper sound produced without vocal fold vibration, has been researched in the development of a silent-speech communication tool for functional speech disorders as well as human-to-human/machine interfaces with inaudible voice input. The NAM can be detected using a specially designed microphone, called a NAM microphone, attached to the neck. However, the detected NAM signal has a low signal-to-noise ratio and severely suppressed high-frequency components. To improve NAM clarity, the mechanism of NAM production must be clarified. In this work, air flow through the glottis in the vocal tract was numerically simulated using computational fluid dynamics and vocal tract shape models obtained by magnetic resonance imaging (MRI) scans for whispered voice production with various strengths, i.e., strong, weak, and very weak. For very weak whispering during the MRI scan, subjects were trained, just before the scanning, to produce the very weak whispered voice, or the NAM. The numerical results show that a weak vorticity flow occurs in the supraglottal region even during very weak whisper production; such vorticity flow provides aeroacoustic sources for very weak whispering, i.e., NAM, as in ordinary whispering.
