Yuki YAI Shigeki MIYABE Hiroshi SARUWATARI Kiyohiro SHIKANO Yosuke TATEKURA
In this paper, we propose a computationally efficient method of temperature compensation for transaural stereo. The conventional method can estimate the change in impulse responses caused by temperature fluctuation with high accuracy. However, the large amount of computation it requires makes real-time implementation difficult. Focusing on the fact that the amount of compensation depends on the length of the impulse response, we reduce the computation required by segmenting the impulse response. We segment the impulse responses in the time domain and estimate the effect of temperature fluctuation for each segment. By joining the processed segments, we obtain the compensated impulse response over its whole length. Experimental results show that the proposed method can reduce the computation required by a factor of nine without degrading accuracy.
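The segment-and-join idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that a temperature change simply rescales the time axis of the impulse response by the ratio of sound speeds, and it approximates that rescaling inside each segment by a single integer shift. The function names, the linear speed-of-sound approximation, and the segment length are all assumptions made for illustration.

```python
import numpy as np

def speed_of_sound(temp_c):
    # Common linear approximation of the speed of sound in air (m/s).
    return 331.5 + 0.61 * temp_c

def stretch_full(h, ratio):
    # Reference: rescale the whole time axis with linear interpolation,
    # out[n] = h(n * ratio), where ratio = c_new / c_old (assumed model).
    n = np.arange(len(h))
    return np.interp(n * ratio, n, h, left=0.0, right=0.0)

def stretch_segmented(h, ratio, seg_len):
    # Sketch: split the response into segments, apply one integer shift
    # per segment (the shift the full stretch would give at the segment
    # center), and join the shifted segments.
    out = np.zeros_like(h)
    n_total = len(h)
    for start in range(0, n_total, seg_len):
        stop = min(start + seg_len, n_total)
        shift = int(round((ratio - 1.0) * 0.5 * (start + stop)))
        src = np.arange(start, stop) + shift
        valid = (src >= 0) & (src < n_total)
        out[start:stop][valid] = h[src[valid]]
    return out
```

For example, `stretch_segmented(h, speed_of_sound(25.0) / speed_of_sound(20.0), 128)` would approximate a response measured at 20 °C as it might look at 25 °C. The integer shift is deliberately crude; it only shows why later segments need larger corrections than early ones.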
Yuya SUGIMOTO Shigeki MIYABE Takeshi YAMADA Shoji MAKINO Biing-Hwang JUANG
MUltiple SIgnal Classification (MUSIC) is a standard technique for high-resolution direction of arrival (DOA) estimation. However, MUSIC cannot estimate DOAs accurately in underdetermined conditions, where the number of sources exceeds the number of microphones. To overcome this drawback, an extension of MUSIC using cumulants, called 2q-MUSIC, has been proposed, but this method suffers greatly from the variance of the statistics, which are given as the temporal mean of the observation process, and therefore requires a long observation time. In this paper, we propose a new approach for extending MUSIC that exploits higher-order moments of the signal for underdetermined DOA estimation with smaller variance. We propose an estimation algorithm that nonlinearly maps the observed signal onto a space with expanded dimensionality and conducts MUSIC-based correlation analysis in the expanded space. Since the mapping increases the dimensionality of the noise subspace, the proposed method enables the estimation of DOAs in underdetermined conditions. Furthermore, we describe the class of mappings that allows us to analyze the higher-order moments of the observed signal in the original space. We compare 2q-MUSIC and the proposed method in an experiment that assumes the true number of sources is known as prior information, and evaluate them in terms of the bias-variance tradeoff of the statistics and computational complexity. The results show that the proposed method has advantages in both computational complexity and estimation accuracy in short-time analysis, i.e., when the duration of the analyzed data is short.
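As background for the abstract above, the following is a minimal sketch of standard narrowband MUSIC, the baseline that the proposed method extends. It is not the proposed mapped algorithm or 2q-MUSIC. It assumes a uniform linear array with half-wavelength spacing and a known number of sources that is smaller than the number of microphones; the function names and the simple peak picker are illustrative.

```python
import numpy as np

def steering(theta_deg, n_mics, spacing_wl=0.5):
    # Steering vector of a uniform linear array (assumed geometry);
    # spacing_wl is the microphone spacing in wavelengths.
    theta = np.deg2rad(theta_deg)
    m = np.arange(n_mics)
    return np.exp(-2j * np.pi * spacing_wl * m * np.sin(theta))

def music_spectrum(X, n_sources, grid_deg, spacing_wl=0.5):
    # X has shape (n_mics, n_snapshots).
    n_mics, n_snap = X.shape
    R = X @ X.conj().T / n_snap            # sample covariance matrix
    _, V = np.linalg.eigh(R)               # eigenvalues in ascending order
    En = V[:, : n_mics - n_sources]        # noise-subspace eigenvectors
    A = np.stack([steering(t, n_mics, spacing_wl) for t in grid_deg], axis=1)
    denom = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    return 1.0 / np.maximum(denom, 1e-12)  # pseudo-spectrum over the grid

def pick_peaks(p, grid_deg, k):
    # Return the angles of the k largest local maxima, sorted by angle.
    idx = [i for i in range(1, len(p) - 1) if p[i] > p[i - 1] and p[i] >= p[i + 1]]
    idx.sort(key=lambda i: p[i], reverse=True)
    return sorted(float(grid_deg[i]) for i in idx[:k])
```

The noise subspace here has `n_mics - n_sources` dimensions, so it vanishes once the number of sources reaches the number of microphones. That is the limitation the abstract addresses by mapping the observation into a higher-dimensional space before the subspace analysis.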
Shigeki MIYABE Hiroshi SARUWATARI Kiyohiro SHIKANO Yosuke TATEKURA
In this paper, we describe a new interface for a barge-in free spoken dialogue system combining multichannel sound field control and beamforming, in which the response sound from the system can be canceled out at the microphone points. The conventional method prevents the user from moving because the system forces the user to stay at a fixed position where the response sound is reproduced. In contrast, since the proposed method does not set control points for reproducing the response sound to the user, the user is free to move. Furthermore, relaxing the strict reproduction of the response sound enables us to design a stable system with fewer loudspeakers than the conventional method requires. The proposed method shows higher performance in speech recognition experiments.
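One way to picture cancellation at the microphone points without control points at the user is a null-space projection at a single frequency. This is a sketch under stated assumptions, not the authors' filter design: `G` is a hypothetical matrix of transfer functions from the loudspeakers (columns) to the microphones (rows), `w0` is a hypothetical initial set of loudspeaker drive weights, and there are more loudspeakers than microphones.

```python
import numpy as np

def cancel_at_mics(G, w0):
    # Project the drive weights w0 onto the null space of G so that the
    # response sound sums to zero at every microphone: G @ w == 0.
    n_spk = G.shape[1]
    P = np.eye(n_spk) - np.linalg.pinv(G) @ G   # null-space projector
    return P @ w0
```

With two microphones and four loudspeakers, the null space of `G` has two dimensions, so a nonzero drive signal can remain while the microphones observe none of it. Because no constraint is placed at the listener position, the sound reaching the user is not strictly reproduced, which mirrors the relaxation described in the abstract.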