
Keyword Search Result

[Keyword] microphone(72hit)

61-72hit(72hit)

  • Speech Enhancement by Profile Fitting Method

    Osamu ICHIKAWA  Tetsuya TAKIGUCHI  Masafumi NISHIMURA  

     
    PAPER-Robust Speech Recognition and Enhancement

      Vol:
    E86-D No:3
      Page(s):
    514-521

    It is believed that distant-talking speech recognition in a noisy environment requires a large-scale microphone array. However, this cannot fit into small consumer devices. Our objective is to improve the performance with a limited number of microphones (preferably only left and right). In this paper, we focused on a profile that is the shape of the power distribution according to the beamforming direction. An observed profile can be decomposed into known profiles for directional sound sources and a non-directional background sound source. Evaluations confirmed this method reduced the CER (Character Error Ratio) for the dictation task by more than 20% compared to a conventional 2-channel Adaptive Spectral Subtraction beamformer in a non-reverberant environment.

  • A Microphone Array-Based 3-D N-Best Search Method for Recognizing Multiple Sound Sources

    Panikos HERACLEOUS  Satoshi NAKAMURA  Takeshi YAMADA  Kiyohiro SHIKANO  

     
    PAPER-Speech and Hearing

      Vol:
    E85-D No:6
      Page(s):
    994-1002

    This paper describes a method for hands-free speech recognition, and particularly for the simultaneous recognition of multiple sound sources. The method is based on the 3-D Viterbi search, extended to a 3-D N-best search that enables the recognition of multiple sound sources. The baseline system integrates two existing technologies--3-D Viterbi search and conventional N-best search--into a complete system. A first evaluation of the 3-D N-best search-based system showed that new ideas are necessary to develop a system for the simultaneous recognition of multiple sound sources. It identified two factors that play important roles in the performance of the system, namely the different likelihood ranges of the sound sources and the direction-based separation of the hypotheses. To address these problems, we added likelihood normalization and a path distance-based clustering technique to the baseline 3-D N-best search-based system. The performance of our system was evaluated through experiments on simulated data for the case of two talkers. The experiments showed significant improvements from the two techniques. The best results were obtained by using both techniques with a 32-channel microphone array. More specifically, the Word Accuracy for the two talkers was higher than 80% and the Simultaneous Word Accuracy (where both sources are correctly recognized simultaneously) was higher than 70%, which are very promising results.

  • Direction of Arrival Estimation Using Nonlinear Microphone Array

    Hidekazu KAMIYANAGIDA  Hiroshi SARUWATARI  Kazuya TAKEDA  Fumitada ITAKURA  Kiyohiro SHIKANO  

     
    PAPER

      Vol:
    E84-A No:4
      Page(s):
    999-1010

    This paper describes a new method for estimating the direction of arrival (DOA) using a nonlinear microphone array system based on complementary beamforming. Complementary beamforming is based on two types of beamformers designed to obtain complementary directivity patterns with respect to each other. In this system, since the resultant directivity pattern is proportional to the product of these directivity patterns, the proposed method can be used to estimate DOAs of 2(K-1) sound sources with K-element microphone array. First, DOA-estimation experiments are performed using both computer simulation and actual devices in real acoustic environments. The results clarify that DOA estimation for two sound sources can be accomplished by the proposed method with two microphones. Also, by comparing the resolutions of DOA estimation by the proposed method and by the conventional minimum variance method, we can show that the performance of the proposed method is superior to that of the minimum variance method under all reverberant conditions.

  • Sharp Directivity Function Based on Fourier Series Expansion and Its Directional System Realization with Small Number of Microphones

    Masataka NAKAMURA  Toshitaka YAMATO  Katsuhito KOUNO  Atsuyuki TAKASHIMA  

     
    PAPER

      Vol:
    E84-A No:4
      Page(s):
    975-983

    For a speech recognition system to achieve a high recognition rate in a noisy environment, a wide-band, sharply directional microphone system is required at the input to secure a high S/N ratio. The authors have already reported the realization of a wide-band uni-directional microphone system by a three-microphone integration method. This paper describes the derivation of a sharp directivity function and the realization of the corresponding microphone system. First, we set the shape of the characteristic function to produce a sharp directional pattern and expand it into a Fourier series to derive a new directivity function. Next, on the basis of this directivity function, we present a sharp directional microphone system using only three non-directional microphones and subsequent analog signal processing. The directional pattern acquired by the proposed method and the effect of sensitivity dispersion among the constituent microphones on the directivity are also discussed in detail.

  • A New Adaptation-Mode Control Based on Cross Correlation for a Robust Adaptive Microphone Array

    Osamu HOSHUYAMA  Brigitte BEGASSE  Akihiko SUGIYAMA  

     
    PAPER-Microphone Array

      Vol:
    E84-A No:2
      Page(s):
    406-413

    This paper proposes a new adaptation-mode control (AMC) for a robust adaptive microphone array with an adaptive blocking matrix (RAMA-ABM). The proposed AMC is based on cross correlations of two microphone signals and uses a state machine for controlling the adaptation to avoid target-signal cancellation. Evaluation with sound data obtained in different acoustic environments demonstrates that the noise reduction by the proposed AMC is 3 dB better than that by the AMC based on the SNR estimate. Subjective listening tests show that the quality of the output signal by the proposed AMC is comparable to or even better than those by the conventional AMCs.

  • Sound Source Localization and Separation in Near Field

    Futoshi ASANO  Hideki ASOH  Toshihiro MATSUI  

     
    PAPER-Engineering Acoustics

      Vol:
    E83-A No:11
      Page(s):
    2286-2294

    As a preprocessor for an automatic speech recognizer in a noisy environment, microphone array systems have been investigated to reduce environmental noise. In usual microphone array design, a plane wave is assumed for the sake of simplicity (far-field assumption). However, this far-field assumption does not always hold, resulting in distortion in the array output. In this report, the subspace method, which is one of the high-resolution spectrum estimators, is applied to the near-field source localization problem. A high-resolution method is especially necessary for near-field source localization with a small-sized array. By combining the source localization technique with a spatial inverse filter, the signals coming from multiple sources in the near-field range can be separated. The modified minimum variance beamformer is used to design the spatial inverse filter. In an experiment in a real environment with two sound sources in the near-field range, a word recognition rate of 60-70% was achieved.

  • Speech Enhancement Using Nonlinear Microphone Array Based on Noise Adaptive Complementary Beamforming

    Hiroshi SARUWATARI  Shoji KAJITA  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Engineering Acoustics

      Vol:
    E83-A No:5
      Page(s):
    866-876

    This paper describes an improved complementary beamforming microphone array based on a new noise adaptation algorithm. Complementary beamforming is based on two types of beamformers designed to obtain complementary directivity patterns with respect to each other. In this system, during a pause in the target speech, the two directivity patterns of the beamformers are adapted to the noise directions of arrival so that the expectation values of each noise power spectrum are minimized in the array output. Using this technique, we can realize directional nulls for each noise even when the number of sound sources exceeds that of microphones. To evaluate the effectiveness, speech enhancement experiments and speech recognition experiments are performed based on computer simulations with a two-element array and three sound sources under various noise conditions. In comparison with the conventional adaptive beamformer and the conventional spectral subtraction method cascaded with the adaptive beamformer, it is shown that (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by more than 6 dB when the interfering noise is two speakers with an input SNR below 0 dB, (2) the proposed array improves the SNR by about 2 dB when the interfering noise is babble noise, and (3) an improvement in the recognition rate of more than 18% is obtained when the interfering noise is two speakers or two overlapped signals of some speakers under the condition that the input SNR is 10 dB.

  • Speech Enhancement Using Nonlinear Microphone Array Based on Complementary Beamforming

    Hiroshi SARUWATARI  Shoji KAJITA  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER

      Vol:
    E82-A No:8
      Page(s):
    1501-1510

    This paper describes a spatial spectral subtraction method that uses a complementary beamforming microphone array to enhance noisy speech signals for speech recognition. Complementary beamforming is based on two types of beamformers designed to obtain complementary directivity patterns with respect to each other. In this paper, it is shown that nonlinear subtraction processing with complementary beamforming can result in a kind of spectral subtraction without the need for speech pause detection. In addition, the optimization algorithm for the directivity pattern is also described. To evaluate the effectiveness, speech enhancement experiments and speech recognition experiments are performed based on computer simulations under both stationary and nonstationary noise conditions. In comparison with the optimized conventional delay-and-sum (DS) array, it is shown that: (1) the proposed array improves the signal-to-noise ratio (SNR) of degraded speech by about 2 dB and performs more than 20% better in word recognition rates when white Gaussian noise with an input SNR of -5 or -10 dB is used, and (2) the proposed array performs more than 5% better in word recognition rates under the nonstationary noise conditions. Also, it is shown that these improvements of the proposed array are the same as or superior to those of the conventional spectral subtraction method cascaded with the DS array.

  • A Robust Adaptive Beamformer with a Blocking Matrix Using Coefficient-Constrained Adaptive Filters

    Osamu HOSHUYAMA  Akihiko SUGIYAMA  Akihiro HIRANO  

     
    PAPER-Digital Signal Processing

      Vol:
    E82-A No:4
      Page(s):
    640-647

    This paper proposes a new robust adaptive beamformer applicable to microphone arrays. The proposed beamformer is a generalized sidelobe canceller (GSC) with a variable blocking matrix using coefficient-constrained adaptive filters (CCAFs). The CCAFs, whose common input signal is the output of a fixed beamformer, minimize leakage of the target signal into the interference path of the GSC. Each coefficient of the CCAFs is constrained to avoid mistracking. In the multiple-input canceller, leaky adaptive filters are used to decrease undesirable target-signal cancellation. The proposed beamformer can allow large look-direction error with almost no degradation in interference-reduction performance and can be implemented with a small number of microphones. The maximum allowable look-direction error can be specified by the user. Simulation results show that the proposed beamformer, when designed to allow about 20° of look-direction error, can suppress interference by more than 17 dB.

  • Realization of Wide-Band Directivity with Three Microphones

    Masataka NAKAMURA  Katsuhito KOUNO  Toshitaka YAMATO  Kazuhiro SAKIYAMA  

     
    PAPER

      Vol:
    E82-A No:4
      Page(s):
    619-625

    Directional microphone arrays at the input of speech recognition systems have been broadly investigated as a way to achieve high performance in noisy environments. The purpose of this study is to develop a new wide-band directional microphone system with a view to later extending it to an adaptive one. In the proposed system, three microphones are arranged on a straight line, and beamforming is accomplished by adding the output of the middle microphone to the integrated value of the difference between the two outer microphones. In this study, the signal processing of the microphone outputs is implemented using active RC circuits. Experiments show that the target directivity can be obtained over the wide frequency range required for speech recognition.

  • New Design Method of a Binaural Microphone Array Using Multiple Constraints

    Yoiti SUZUKI  Shinji TSUKUI  Futoshi ASANO  Ryouichi NISHIMURA  Toshio SONE  

     
    PAPER

      Vol:
    E82-A No:4
      Page(s):
    588-596

    A new method of designing a microphone array with two outputs preserving binaural information is proposed in this paper. This system employs adaptive beamforming using multiple constraints. The binaural cues can be preserved in the two outputs by use of these multiple constraints, while simultaneous beamforming to enhance target signals is also available. A computer simulation was conducted to examine the performance of the beamforming. The results showed that the proposed array can perform both the generation of the binaural cues and the beamforming as intended. In particular, beamforming with double constraints exhibits the best performance; DI is around 7 dB and good interchannel (interaural) time/phase and level differences are generated within a target region in front. With triple constraints, however, the performance of the beamforming becomes poorer while the binaural information is better realized. Setting the desired responses to give proper binaural information seems to become critical as the number of constraints increases.

  • Realization of Acoustic Inverse Filtering through Multi-Microphone Sub-Band Processing

    Hong WANG  Fumitada ITAKURA  

     
    PAPER

      Vol:
    E75-A No:11
      Page(s):
    1474-1483

    The realization of an acoustic inverse filter is often difficult because of the non-minimum phase property and the long duration of the impulse response of the acoustic enclosure. However, if the signals are divided into a large number of sub-bands, many of the sub-bands are found to be invertible. The invertibility of a sub-band signal depends on the zero distribution of the transfer function in the z-plane. In a multi-microphone system, the transfer functions between the sound source and the microphones have different zero distributions. The method proposed here, taking advantage of the differences in zero distributions, selects the best invertible microphone in each sub-band, and reconstructs the full-band signal by summing the inverse-filtered sub-band signals of the best microphones. The quality of the dereverberated signal using the proposed inverse filtering approach improves with an increasing number of microphones and sub-bands. When seven microphones are used and the number of sub-bands is 513, the quality of the dereverberated speech signals is almost the same as that of the original ones even when the reverberation time is about one second. The introduction of multiple microphones in addition to sub-band processing provides a new way of dealing with the non-minimum phase problem in deconvolution.
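
The zero-distribution criterion in the last abstract above (Wang and Itakura) can be illustrated with a minimal sketch. It is a full-band simplification, not the authors' sub-band method: it assumes each microphone's room response is available as a short FIR impulse response, and it treats a response as invertible when all zeros of its transfer function lie strictly inside the unit circle (the minimum-phase condition). The function names `max_zero_radius` and `pick_invertible_mic` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def max_zero_radius(h):
    """Return the largest |z| among the zeros of H(z) = sum_n h[n] z^-n.

    Leading zeros (pure delay) and trailing zeros (zeros at the origin)
    are trimmed because they do not affect the minimum-phase test.
    """
    h = np.trim_zeros(np.asarray(h, dtype=float))
    if h.size < 2:
        return 0.0  # a constant (or empty) response has no finite zeros
    # H(z) * z^N is a polynomial in z with coefficients h[0], ..., h[N]
    # in descending powers, which is the ordering np.roots expects.
    return float(np.max(np.abs(np.roots(h))))


def pick_invertible_mic(impulse_responses):
    """Pick the microphone whose zeros sit deepest inside the unit circle.

    Returns (index, is_minimum_phase). This ranking rule is an assumption
    made for the sketch, not the selection rule from the paper.
    """
    radii = [max_zero_radius(h) for h in impulse_responses]
    best = int(np.argmin(radii))
    return best, radii[best] < 1.0


if __name__ == "__main__":
    mics = [
        [1.0, -2.0],   # zero at z = 2: non-minimum phase
        [1.0, -0.5],   # zero at z = 0.5: minimum phase
    ]
    print(pick_invertible_mic(mics))
```

In a sub-band setting, the same test would presumably be applied to each band's transfer function separately, so that a different microphone can be chosen in each band.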
