Hirofumi NAKAJIMA Keiko KIKUCHI Kazuhiro NAKADAI Yutaka KANEDA
This paper proposes a sound source orientation estimation method suited to distributed microphone arrangements. The proposed method is based on orientation-extended beamforming (OEBF), which has four features: (a) robustness against reverberation, (b) robustness against noise, (c) freedom in microphone arrangement, and (d) feasibility of real-time processing. Regarding (a) and (c), OEBF is based on a general propagation model using transfer functions (TFs) that include all propagation phenomena, such as reflections and diffractions; it therefore introduces no model errors for these phenomena and is applicable to arbitrary microphone arrangements. Regarding (b), OEBF overcomes noise effects by incorporating three additional processes (amplitude extraction, time-frequency mask, and histogram integration) that are also proposed in this paper. As for (d), OEBF can run in real time because its execution process is the same as that of conventional beamforming. A numerical experiment was performed to confirm the theoretical validity of OEBF. The results showed that OEBF estimated sound source positions and orientations very precisely. Practical experiments were carried out using a 96-channel microphone array in real environments. The results indicated that OEBF worked properly even in reverberant and noisy environments, with an average estimation error of only 4°.
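The abstract above does not give the OEBF equations, so the following is only a toy sketch of the general idea it describes: candidate transfer-function vectors are indexed by both position and orientation, and the candidate that best matches the observed microphone signals in one frequency bin is selected. The free-field cardioid model, the function names, the 8-microphone circle, and the two candidate positions are all assumptions made for illustration. Measured TFs would include reflections and diffractions, and the three noise-handling processes are not modeled.

```python
import cmath
import math


def toy_transfer_functions(source_xy, orientation_deg, mic_positions, wavenumber):
    """Free-field toy TFs: cardioid directivity times a spherical-wave term (assumption)."""
    tfs = []
    for mic_x, mic_y in mic_positions:
        dx, dy = mic_x - source_xy[0], mic_y - source_xy[1]
        distance = math.hypot(dx, dy)
        bearing = math.atan2(dy, dx)  # direction from source to microphone
        gain = 0.5 * (1.0 + math.cos(bearing - math.radians(orientation_deg)))
        tfs.append(gain * cmath.exp(-1j * wavenumber * distance) / distance)
    return tfs


def normalized_beam_power(tf_vector, observed):
    """|a^H x|^2 / (a^H a) for one candidate TF vector a and observation x."""
    inner = sum(a.conjugate() * x for a, x in zip(tf_vector, observed))
    energy = sum(abs(a) ** 2 for a in tf_vector)
    return abs(inner) ** 2 / energy


def estimate_position_and_orientation(tf_table, observed):
    """Return the (position id, orientation) key whose TF vector matches best."""
    return max(tf_table, key=lambda key: normalized_beam_power(tf_table[key], observed))


def build_table():
    """Hypothetical candidate grid: 2 positions x 8 orientations, 8 mics on a unit circle."""
    mics = [(math.cos(2 * math.pi * m / 8), math.sin(2 * math.pi * m / 8)) for m in range(8)]
    positions = {0: (0.0, 0.0), 1: (0.3, 0.0)}
    wavenumber = 2 * math.pi * 1000.0 / 340.0  # 1 kHz, c = 340 m/s (assumed values)
    return {
        (pos_id, angle): toy_transfer_functions(xy, angle, mics, wavenumber)
        for pos_id, xy in positions.items()
        for angle in range(0, 360, 45)
    }
```

By the Cauchy-Schwarz inequality, the normalized power is largest when the candidate vector is parallel to the observation, so in this noise-free toy setting the search returns the true position and orientation whenever the candidate TF vectors are not parallel to one another.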
This paper proposes a new adaptive algorithm for acoustic echo cancellers that converges four times as fast as the normalized LMS (NLMS) algorithm for a speech input, at almost the same computational load. This algorithm reflects both the statistics of the variation of a room impulse response and the whitening of the received input signal. This algorithm, called the ESP (exponentially weighted step-size projection) algorithm, uses a different step size for each coefficient of an adaptive transversal filter. These step sizes are time-invariant and weighted in proportion to the expected variation of a room impulse response. As a result, the algorithm adjusts coefficients with large errors in large steps, and coefficients with small errors in small steps. The algorithm is based on the fact that the expected variation of a room impulse response becomes progressively smaller along the series by the same exponential ratio as the impulse response energy decay. This algorithm also reflects the whitening of the received input signal, i.e., it removes the correlation between consecutive received input vectors. This process is effective for speech, which has a highly non-white spectrum. A geometric interpretation of the proposed algorithm is derived and the convergence condition is proved. A fast projection algorithm is introduced to reduce the computational complexity and is modified for a practical multiple-DSP structure so that it requires almost the same computational load, 2L multiply-add operations, as the conventional NLMS. The algorithm is implemented in an acoustic echo canceller constructed with multiple DSP chips, and its fast convergence is demonstrated.
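The per-coefficient step sizes described above can be illustrated with a simplified sketch. This is not the ESP algorithm itself: it keeps only the exponentially weighted step sizes on top of plain NLMS and omits the projection (whitening) part. The function names, the decay ratio 0.7, the first step size 1.5, and the 8-tap toy room response are assumptions chosen for the demonstration.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def exponential_step_sizes(num_taps, first_step, decay):
    """Time-invariant step sizes mu_i = first_step * decay**i (hypothetical parameterization)."""
    return [first_step * decay ** i for i in range(num_taps)]


def es_nlms(far_end, mic, step_sizes, eps=1e-8):
    """NLMS in which each filter tap has its own fixed step size."""
    num_taps = len(step_sizes)
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent far-end sample first
    for x, d in zip(far_end, mic):
        window = [x] + window[:-1]
        err = d - dot(weights, window)      # residual echo
        power = dot(window, window) + eps   # NLMS normalization
        weights = [w + mu * err * u / power
                   for w, mu, u in zip(weights, step_sizes, window)]
    return weights


def demo(num_samples=3000, seed=0):
    """Identify a toy, exponentially decaying room response from noise-free data."""
    rng = random.Random(seed)
    room = [0.8 * 0.7 ** i for i in range(8)]  # assumed toy impulse response
    far_end = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    history = [0.0] * len(room)
    mic = []
    for x in far_end:
        history = [x] + history[:-1]
        mic.append(dot(room, history))
    steps = exponential_step_sizes(len(room), first_step=1.5, decay=0.7)
    return room, es_nlms(far_end, mic, steps)
```

Calling `demo()` returns the toy room response and the identified coefficients. Early taps, where the response and its expected variation are large, are updated in large steps, while late taps are updated in small steps. All step sizes are kept below 2 so that the normalized update stays stable.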
Masashi TANAKA Yutaka KANEDA Shoji MAKINO Junji KOJIMA
This paper proposes a new algorithm called the fast Projection algorithm, which reduces the computational complexity of the Projection algorithm from (p+1)L+O(p^3) to 2L+20p, where L is the length of the estimation filter and p is the projection order. This algorithm has properties that lie between those of NLMS and RLS, i.e., less computational complexity than RLS but much faster convergence than NLMS for input signals like speech. The reduction in computation consists of two parts. One concerns calculating the pre-filtering vector, which originally took O(p^3) operations. Our new algorithm computes the pre-filtering vector recursively with about 15p operations. The other reduction is accomplished by introducing an approximation vector of the estimation filter. Experimental results for speech input show that the convergence speed of the Projection algorithm approaches that of RLS as the projection order increases, with only slightly more computation than NLMS, which indicates the efficiency of the proposed fast Projection algorithm.
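For orientation, here is a minimal sketch of the plain (not fast) Projection algorithm, also known as the affine projection algorithm, using the two most recent input vectors. It does not implement the recursive pre-filtering or the approximation vector that the abstract describes. The function names, the regularization constant `delta`, the AR(1) test input, and the toy 8-tap system are assumptions.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def projection_two_vectors(inputs, desired, num_taps, mu=1.0, delta=1e-6):
    """Projection update using the current and previous input vectors."""
    weights = [0.0] * num_taps
    cur = [0.0] * num_taps   # x(k), most recent sample first
    d_prev = 0.0
    for x, d in zip(inputs, desired):
        prev = cur           # x(k-1)
        cur = [x] + cur[:-1]
        # A priori errors for the two most recent equations.
        e0 = d - dot(weights, cur)
        e1 = d_prev - dot(weights, prev)
        # Regularized 2x2 correlation matrix of the two input vectors.
        r00 = dot(cur, cur) + delta
        r01 = dot(cur, prev)
        r11 = dot(prev, prev) + delta
        det = r00 * r11 - r01 * r01
        # Solve the 2x2 system by Cramer's rule: g = R^{-1} e.
        g0 = (r11 * e0 - r01 * e1) / det
        g1 = (r00 * e1 - r01 * e0) / det
        weights = [w + mu * (g0 * a + g1 * b)
                   for w, a, b in zip(weights, cur, prev)]
        d_prev = d
    return weights


def demo(num_samples=2000, seed=1):
    """Identify a toy FIR system driven by correlated (AR(1)) input."""
    rng = random.Random(seed)
    system = [0.5, -0.4, 0.3, 0.2, -0.1, 0.05, 0.02, -0.01]  # assumed toy system
    inputs, state = [], 0.0
    for _ in range(num_samples):
        state = 0.9 * state + rng.gauss(0.0, 1.0)  # strongly colored input
        inputs.append(state)
    history = [0.0] * len(system)
    desired = []
    for x in inputs:
        history = [x] + history[:-1]
        desired.append(dot(system, history))
    return system, projection_two_vectors(inputs, desired, len(system))
```

Solving the small linear system at every step is what costs O(p^3) operations in the general case. Here the system is only 2x2, so Cramer's rule suffices. The vector `g` is the quantity that the fast algorithm, according to the abstract, computes recursively instead.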
Sumitaka SAKAUCHI Yoichi HANEDA Shoji MAKINO Masashi TANAKA Yutaka KANEDA
We investigated the dependence of the desired echo return loss on frequency for various hands-free telecommunication conditions by subjective assessment. The desired echo return loss as a function of frequency (DERLf) is an important factor in the design and performance evaluation of a subband echo canceller, and it is a measure of what is considered an acceptable echo caused by electrical loss in the transmission line. The DERLf during single-talk was obtained as attenuated band-limited echo levels that subjects did not find objectionable when listening to the near-end speech and its band-limited echo under various hands-free telecommunication conditions. When we investigated the DERLf during double-talk, subjects also heard the speech in the far-end room from a loudspeaker. The echo was limited to a 250-Hz bandwidth assuming the use of a subband echo canceller. The test results showed that: (1) when the transmission delay was short (30 ms), the echo component around 2 to 3 kHz was the most objectionable to listeners; (2) as the transmission delay rose to 300 ms, the echo component around 1 kHz became the most objectionable; (3) when the room reverberation time was relatively long (about 500 ms), the echo component around 1 kHz was the most objectionable, even if the transmission delay was short; and (4) the DERLf during double-talk was about 5 to 10 dB lower than that during single-talk. Use of these DERLf values will enable the design of more efficient subband echo cancellers.
A new method is proposed for recovering an unknown source signal, which is observed through two unknown channels characterized by non-minimum phase FIR filters. Conventional methods cannot estimate the non-minimum phase parts and recover the source signal. Our method is based on computing the eigenvector corresponding to the smallest eigenvalue of the input correlation matrix and using a criterion based on multi-channel inverse filtering theory. The impulse responses are estimated by computing the eigenvector for all modeling orders. The optimum order is searched for using the criterion, and the most appropriate impulse responses are estimated. Multi-channel inverse filtering with the estimated impulse responses is used to recover the unknown source signal. Computer simulation shows that our method can estimate non-minimum phase impulse responses from two reverberant signals and recover the source signal.
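The abstract gives no equations, so the following is a sketch of the standard two-channel cross-relation argument, which is assumed here to be the reason the smallest-eigenvalue eigenvector carries the channel responses. The notation (source s, channels h_1 and h_2 of length L, observations x_1 and x_2, stacked vectors x and u) is introduced for illustration and is not taken from the paper.

```latex
% Assumed notation: s(n) source, h_1, h_2 FIR channels of length L,
% x_1, x_2 observed signals, * denotes convolution.
\begin{align}
x_1(n) &= (h_1 * s)(n), \qquad x_2(n) = (h_2 * s)(n) \\
(h_2 * x_1)(n) - (h_1 * x_2)(n) &= (h_2 * h_1 * s)(n) - (h_1 * h_2 * s)(n) = 0 \\
\mathbf{x}(n) &= [\,x_1(n), \dots, x_1(n-L+1),\; -x_2(n), \dots, -x_2(n-L+1)\,]^{\mathsf T} \\
\mathbf{u} &= [\,h_2(0), \dots, h_2(L-1),\; h_1(0), \dots, h_1(L-1)\,]^{\mathsf T} \\
\mathbf{x}(n)^{\mathsf T}\mathbf{u} &= 0
\;\Rightarrow\;
\mathbf{R}\,\mathbf{u} = \mathbf{0}, \qquad
\mathbf{R} = E\!\left[\mathbf{x}(n)\,\mathbf{x}(n)^{\mathsf T}\right]
\end{align}
```

Because R is positive semidefinite, u is an eigenvector for its smallest eigenvalue, which is zero in the noise-free case. It determines h_1 and h_2 up to a common scale factor, provided the two channels share no common zeros and the assumed length L matches the true order. If L is too large, the null space of R has more than one dimension, which is consistent with the abstract's need to search over modeling orders.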