
Keyword Search Result

[Keyword] microphone (72 hits)

Results 41-60 of 72

  • Separation of Sound Sources Propagated in the Same Direction

    Akio ANDO  Masakazu IWAKI  Kazuho ONO  Koichi KUROZUMI  

     
    PAPER-Blind Source Separation
    Vol: E88-A No:7  Page(s): 1665-1672

    This paper describes a method for separating a target sound from other noise arriving from the same direction, where the target therefore cannot be separated by directivity control alone. Microphones are arranged in a line toward the sources to form null sensitivity points at given distances from the microphones. The null points exclude non-target sound sources on the basis of weighting coefficients for the microphone outputs determined by blind source separation. The separation problem is thereby reduced to instantaneous separation through adjustment of the time delays of the microphone outputs. The system uses a direct (i.e., non-iterative) algorithm for blind separation based on second-order statistics, assuming that all sources are non-stationary signals. Simulations show that the 2-microphone system can separate a target sound with a separability of more than 40 dB for the 2-source problem, and 25 dB for the 3-source problem when the other sources are adjacent.
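As a toy illustration of the null-point idea (not the authors' algorithm, which determines the weights blindly from second-order statistics), the sketch below forms a spatial null with two microphones by delaying one channel and subtracting. It assumes the interferer's inter-microphone delay is an integer number of samples and already known:

```python
import numpy as np

def delay(x, d):
    """Delay a signal by d samples (zero-padded at the start)."""
    return np.concatenate([np.zeros(d), x[:-d]]) if d > 0 else x.copy()

rng = np.random.default_rng(0)
n, d = 4000, 3                       # d: interferer's inter-mic delay (assumed known)
s = rng.standard_normal(n)           # target: broadside, identical at both mics
v = rng.standard_normal(n)           # interferer: delayed by d samples at mic 2

x1 = s + v                           # microphone 1
x2 = s + delay(v, d)                 # microphone 2

# Delay-and-subtract: aligning the interferer across channels and subtracting
# places a null on it; the target survives, filtered by a comb response
# s[n-d] - s[n].
out = delay(x1, d) - x2

residual = out - (delay(s, d) - s)   # what remains of the interferer
print(np.max(np.abs(residual)))      # → 0.0 (exact cancellation in this ideal model)
```

In the ideal integer-delay model the interferer cancels exactly; the paper's contribution is estimating the equivalent weights blindly and handling fractional delays.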

  • Interface for Barge-in Free Spoken Dialogue System Combining Adaptive Sound Field Control and Microphone Array

    Tatsunori ASAI  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    LETTER-Speech and Hearing
    Vol: E88-A No:6  Page(s): 1613-1618

    This paper describes a new interface for a barge-in-free spoken dialogue system that combines adaptive sound field control and a microphone array. To achieve robustness against changes in the transfer functions caused by various interferences, a barge-in-free spoken dialogue system using sound field control and a microphone array was previously proposed by one of the authors. However, that method cannot follow changes in the transfer functions because it consists of fixed filters. To solve this problem, we introduce a new adaptive sound field control that follows changes in the transfer functions.

  • Adaptive Microphone Array System with Two-Stage Adaptation Mode Controller

    Yang-Won JUNG  Hong-Goo KANG  Chungyong LEE  Dae-Hee YOUN  Changkyu CHOI  Jaywoo KIM  

     
    PAPER-Digital Signal Processing
    Vol: E88-A No:4  Page(s): 972-977

    In this paper, an adaptive microphone array system with a two-stage adaptation mode controller (AMC) is proposed for high-quality speech acquisition in real environments. The proposed system includes an adaptive array algorithm, a time-delay estimator, and a newly proposed AMC. To ensure proper adaptation of the adaptive array algorithm, the proposed AMC uses not only temporal but also spatial information. The AMC has two processing stages: an initialization stage, which adopts a sound source localization technique, and a running stage, which uses a signal correlation characteristic. For the adaptive array algorithm, a generalized sidelobe canceller with an adaptive blocking matrix is used. The proposed algorithm is implemented as a real-time man-machine interface module of a home-agent robot. Simulation results show a 13 dB SINR improvement with the speaker sitting 2 m from the home-agent robot, and the speech recognition rate is enhanced by 32% compared with a single-channel acquisition system.
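A minimal sketch of the generalized sidelobe canceller structure the system builds on, here with a fixed (rather than adaptive) blocking matrix, an NLMS canceller, and a synthetic broadside target; all signal parameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20000, 2
s = np.sin(2 * np.pi * 0.01 * np.arange(n))     # target, broadside (identical at both mics)
v = rng.standard_normal(n)                      # interferer, arriving off-axis
x1 = s + v
x2 = s + np.concatenate([np.zeros(d), v[:-d]])  # interferer delayed d samples at mic 2

fbf = 0.5 * (x1 + x2)      # fixed beamformer: passes the broadside target
blk = x1 - x2              # blocking branch: target cancels, interference remains

# Adaptive canceller (NLMS): subtract the interference leaking into the
# fixed-beamformer output, using the blocking branch as a noise reference.
L, mu, eps = 16, 0.5, 1e-6
w = np.zeros(L)
y = np.zeros(n)
for t in range(L - 1, n):
    u = blk[t - L + 1:t + 1][::-1]
    e = fbf[t] - w @ u
    w += mu * e * u / (u @ u + eps)
    y[t] = e

half = n // 2              # evaluate after the filter has adapted
print(np.mean((fbf[half:] - s[half:]) ** 2), np.mean((y[half:] - s[half:]) ** 2))
```

The paper's AMC sits on top of such a structure, gating when the canceller (and the adaptive blocking matrix) may adapt, using localization in the initialization stage and signal correlation in the running stage.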

  • Tracking of Speaker Direction by Integrated Use of Microphone Pairs in Equilateral-Triangle

    Yusuke HIOKA  Nozomu HAMADA  

     
    PAPER
    Vol: E88-A No:3  Page(s): 633-641

    In this report, we propose an algorithm for tracking speaker direction using microphones located at the vertices of an equilateral triangle. The method realizes tracking by minimizing a performance index composed of the cross spectra of the three microphone pairs in the triangular array. We adopt the steepest descent method for the minimization and, to guarantee global convergence to the correct direction with high accuracy, alter the performance index during adaptation depending on the convergence state. Through computer simulations and experiments in a real acoustic environment, we show the effectiveness of the proposed method.

  • Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones

    Weifeng LI  Tetsuya SHINDE  Hiroshi FUJIMURA  Chiyomi MIYAJIMA  Takanori NISHINO  Katunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Feature Extraction and Acoustic Modeling
    Vol: E88-D No:3  Page(s): 384-390

    This paper describes a new multi-channel method of noisy speech recognition, which estimates the log spectrum of speech at a close-talking microphone based on the multiple regression of the log spectra (MRLS) of noisy signals captured by distributed microphones. The advantages of the proposed method are as follows: 1) it does not require a sensitive geometric layout, calibration of the sensors, or additional pre-processing for tracking the speech source; 2) it requires very little computation; and 3) the regression weights can be statistically optimized over the given training data. Once the optimal regression weights are obtained by regression learning, they can be used to generate the estimated log spectrum in the recognition phase, where the close-talking microphone is no longer required. The performance of the proposed method is illustrated by speech recognition of real in-car dialogue data. Compared with the nearest distant microphone and a multi-microphone adaptive beamformer, the proposed approach obtains relative word error rate (WER) reductions of 9.8% and 3.6%, respectively.
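The regression-learning step can be sketched as ordinary least squares per frequency bin. Everything below (data model, gains, noise level) is a synthetic stand-in, not the paper's in-car corpus:

```python
import numpy as np

rng = np.random.default_rng(0)
T, F, M = 400, 32, 4   # frames, frequency bins, distributed microphones (toy sizes)

# Hypothetical training data: Y is the close-talking log spectrum; each
# distributed microphone sees an attenuated copy plus independent noise.
Y = rng.standard_normal((T, F))
gains = np.array([0.9, 0.6, 0.4, 0.3])
X = np.stack([g * Y + 0.3 * rng.standard_normal((T, F)) for g in gains], axis=2)

# Regression learning: per frequency bin, least-squares weights (plus a bias
# term) mapping the M distributed-mic log spectra to the close-talk one.
W = np.zeros((F, M + 1))
for f in range(F):
    A = np.hstack([X[:, f, :], np.ones((T, 1))])
    W[f] = np.linalg.lstsq(A, Y[:, f], rcond=None)[0]

# In the recognition phase, the learned weights produce the estimate without
# the close-talking microphone.
Yhat = np.stack([np.hstack([X[:, f, :], np.ones((T, 1))]) @ W[f] for f in range(F)], axis=1)
mse_mrls = np.mean((Yhat - Y) ** 2)
mse_best_single = np.mean((X[:, :, 0] - Y) ** 2)   # best single distant microphone
print(mse_mrls < mse_best_single)                  # → True
```

Combining several noisy observations with statistically optimized weights beats any single distant channel in this toy model, which is the effect the paper exploits.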

  • Multistage SIMO-Model-Based Blind Source Separation Combining Frequency-Domain ICA and Time-Domain ICA

    Satoshi UKAI  Tomoya TAKATANI  Hiroshi SARUWATARI  Kiyohiro SHIKANO  Ryo MUKAI  Hiroshi SAWADA  

     
    PAPER
    Vol: E88-A No:3  Page(s): 642-650

    In this paper, single-input multiple-output (SIMO)-model-based blind source separation (BSS) is addressed, where unknown mixed source signals are detected at microphones and can be separated, not into monaural source signals, but into SIMO-model-based signals from independent sources as they are at the microphones. This technique is highly applicable to high-fidelity signal processing such as binaural signal processing. First, we provide an experimental comparison between two kinds of SIMO-model-based BSS methods: conventional frequency-domain ICA with projection-back processing (FDICA-PB), and SIMO-ICA, which was recently proposed by the authors. Secondly, we propose a new technique combining FDICA-PB and SIMO-ICA, which can achieve higher separation performance than either method alone. The experimental results reveal that the accuracy of the separated SIMO signals in the simple SIMO-ICA is inferior to that of the signals obtained by FDICA-PB under low-quality initial-value conditions, but the proposed combination technique outperforms both the simple FDICA-PB and SIMO-ICA.

  • Iterative Estimation and Compensation of Signal Direction for Moving Sound Source by Mobile Microphone Array

    Toshiharu HORIUCHI  Mitsunori MIZUMACHI  Satoshi NAKAMURA  

     
    PAPER-Engineering Acoustics
    Vol: E87-A No:11  Page(s): 2950-2956

    This paper proposes a simple method for estimating and compensating the signal direction, to deal with the relative change of sound source location caused by movements of a microphone array and a sound source. The method introduces a delay filter built from shifted and sampled sinc functions. This paper presents a concept for the joint optimization of the arrival time differences and of the coordinate system of a mobile microphone array. We use the LMS algorithm to derive this method by maintaining a certain relationship between the directions of the microphone array and the sound source directions. The method directly estimates the directions of the microphone array relative to the sound source directions by minimizing the relative differences of arrival time among the observed signals, rather than by estimating the time difference of arrival (TDOA) between two observed signals. The method also compensates for the time delay of the observed signals simultaneously, keeping the output signals in phase. Simulation results support the effectiveness of the method.
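A delay filter built from shifted and sampled sinc functions, of the kind the method uses, can be sketched as follows; the tap count, Hamming window, and test tone are illustrative choices, not taken from the paper:

```python
import numpy as np

def sinc_delay(x, delay, taps=33):
    """Fractional-delay FIR built from a shifted, sampled (and windowed) sinc."""
    c = taps // 2
    h = np.sinc(np.arange(taps) - c - delay) * np.hamming(taps)
    y = np.convolve(x, h)
    return y[c:c + len(x)]          # compensate the filter's bulk delay of c samples

fs = 16000.0
t = np.arange(512) / fs
x = np.sin(2 * np.pi * 300.0 * t)                  # low-frequency test tone
y = sinc_delay(x, 0.5)                             # delay by half a sample
ref = np.sin(2 * np.pi * 300.0 * (t - 0.5 / fs))   # analytically delayed tone
err = np.max(np.abs(y[64:-64] - ref[64:-64]))      # ignore edge transients
print(err)
```

Because the delay parameter is continuous, such a filter can be driven by an LMS update, which is what lets the paper's method track fractional, time-varying arrival-time differences.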

  • High-Fidelity Blind Separation of Acoustic Signals Using SIMO-Model-Based Independent Component Analysis

    Tomoya TAKATANI  Tsuyoki NISHIKAWA  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Engineering Acoustics
    Vol: E87-A No:8  Page(s): 2063-2072

    We propose a novel blind separation framework for single-input multiple-output (SIMO)-model-based acoustic signals using an extended ICA algorithm, SIMO-ICA. The SIMO-ICA consists of multiple ICAs and a fidelity controller, and each ICA runs in parallel under the fidelity control of the entire separation system. SIMO-ICA can separate the mixed signals, not into monaural source signals, but into SIMO-model-based signals from independent sources as they are at the microphones. Thus, the separated signals of SIMO-ICA can maintain the spatial qualities of each sound source. In order to evaluate its effectiveness, separation experiments were carried out under both nonreverberant and reverberant conditions. The experimental results reveal that the signal separation performance of the proposed SIMO-ICA is the same as that of the conventional ICA-based method, and that the spatial quality of the separated sound in SIMO-ICA is remarkably superior to that of the conventional method, particularly in the fidelity of the sound reproduction.

  • Overdetermined Blind Separation for Real Convolutive Mixtures of Speech Based on Multistage ICA Using Subarray Processing

    Tsuyoki NISHIKAWA  Hiroshi ABE  Hiroshi SARUWATARI  Kiyohiro SHIKANO  Atsunobu KAMINUMA  

     
    PAPER-Speech/Acoustic Signal Processing
    Vol: E87-A No:8  Page(s): 1924-1932

    We propose a new algorithm for overdetermined blind source separation (BSS) based on multistage independent component analysis (MSICA). To improve the separation performance, we previously proposed MSICA, in which frequency-domain ICA and time-domain ICA are cascaded. The original MSICA assumed a specific mixing model in which the number of microphones equals the number of sources. However, additional microphones are required to achieve improved separation performance in reverberant environments, and this leads to new problems, e.g., a more complicated permutation problem. To solve them, we propose an extended MSICA using subarray processing, in which the number of microphones and the number of sources are set equal within every subarray. The experimental results obtained in a real environment reveal that the separation performance of the proposed MSICA improves as the number of microphones increases.

  • Estimation of Azimuth and Elevation DOA Using Microphones Located at Apices of Regular Tetrahedron

    Yusuke HIOKA  Nozomu HAMADA  

     
    LETTER-Speech/Acoustic Signal Processing
    Vol: E87-A No:8  Page(s): 2058-2062

    Our previously proposed DOA (Direction Of Arrival) estimation method, which integrates the frequency array data generated from microphone pairs in an equilateral-triangular microphone array, is extended here. The extended method uses four microphones located at the apices of a regular tetrahedron, enabling estimation of the elevation angle from the array plane as well. Furthermore, we introduce an idea for separate estimation of azimuth and elevation to reduce the computational load.

  • Microphone Array with Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator for Speech Enhancement

    Hongseok KWON  Jongmok SON  Keunsung BAE  

     
    LETTER
    Vol: E87-A No:6  Page(s): 1491-1494

    This paper describes a new speech enhancement system that employs a microphone array with post-processing based on the minimum mean-square error short-time spectral amplitude (MMSE-STSA) estimator. To obtain a more accurate MMSE-STSA estimate with a microphone array, a modification and refinement procedure is applied to each microphone output. The performance of the proposed system is compared with that of other microphone-array methods. Noise removal experiments for white and pink noise demonstrate the superiority of the proposed speech enhancement system in average output SNR and cepstral distance measures.
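The single-channel post-processing idea can be sketched with a decision-directed spectral gain. Note that the true MMSE-STSA gain involves Bessel functions; the Wiener gain below is a simplified stand-in, and the stationary toy "speech", known noise statistics, and smoothing factor are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
frames, bins = 100, 65
clean = np.tile(2.0 * np.abs(rng.standard_normal(bins)), (frames, 1))  # stationary toy "speech"
noise = np.abs(rng.standard_normal((frames, bins)))                    # toy noise magnitudes
noisy = clean + noise              # crude magnitude-domain mixing, for illustration only

noise_psd = np.mean(noise ** 2, axis=0)   # noise statistics assumed known here
alpha = 0.98                              # decision-directed smoothing factor
prev = np.zeros(bins)
out = np.empty_like(noisy)
for t in range(frames):
    gamma = noisy[t] ** 2 / noise_psd     # a posteriori SNR
    xi = alpha * prev ** 2 / noise_psd + (1 - alpha) * np.maximum(gamma - 1.0, 0.0)
    gain = xi / (1.0 + xi)                # Wiener gain; the full MMSE-STSA gain would
                                          # replace this line with its Bessel-function form
    out[t] = gain * noisy[t]
    prev = out[t]

half = frames // 2                        # skip the adaptation transient
mse_out = np.mean((out[half:] - clean[half:]) ** 2)
mse_noisy = np.mean((noisy[half:] - clean[half:]) ** 2)
print(mse_out < mse_noisy)                # → True
```

The paper's contribution is on top of this single-channel core: refining the estimator using the multiple microphone outputs of the array.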

  • Sound Source Localization Using a Profile Fitting Method with Sound Reflectors

    Osamu ICHIKAWA  Tetsuya TAKIGUCHI  Masafumi NISHIMURA  

     
    PAPER
    Vol: E87-D No:5  Page(s): 1138-1145

    In a two-microphone approach, interchannel differences in time (ICTD) and interchannel differences in sound level (ICLD) have generally been used for sound source localization. However, those cues are not effective for vertical localization in the median plane (directly in front). For that purpose, spectral cues based on features of head-related transfer functions (HRTF) have been investigated, but they are not robust enough against signal variations and environmental noise. In this paper, we use a "profile" as a cue, together with a combination of reflectors specially designed for vertical localization. The observed sound is converted into a profile containing information about reflections as well as ICTD and ICLD data. The observed profile is decomposed into signal and noise by using template profiles associated with sound source locations, and the template minimizing the residual of the decomposition gives the estimated sound source location. Experiments show this method can correctly provide a rough estimate of the vertical location even in a noisy environment.

  • DOA Estimation of Speech Signal Using Microphones Located at Vertices of Equilateral Triangle

    Yusuke HIOKA  Nozomu HAMADA  

     
    PAPER-Audio/Speech Coding
    Vol: E87-A No:3  Page(s): 559-566

    In this paper, we propose a DOA (Direction Of Arrival) estimation method for speech signals using three microphones, whose angular resolution is almost uniform with respect to DOA. Our previous DOA estimation method, which uses the frequency-domain array data of a single pair of microphones, achieves high-precision estimation, but its resolution degrades as the propagation direction moves away from the array broadside. In the method presented here, we use three microphones located at the vertices of an equilateral triangle and integrate the frequency-domain array data of the three microphone pairs. For the estimation scheme, a subspace analysis of the integrated frequency array data is proposed. Through both computer simulations and experiments in a real acoustic environment, we show the effectiveness of the proposed method.
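The core relation, that the cross-spectrum phase of a microphone pair grows linearly with frequency at a slope set by the inter-microphone delay, can be sketched for a single pair (the paper integrates three such pairs; the spacing and sample rate below are assumptions):

```python
import numpy as np

c, fs, d = 343.0, 16000.0, 0.15        # sound speed (m/s), sample rate (Hz), mic spacing (m)
theta_true = 30.0                      # source azimuth from broadside, degrees
tau = d * np.sin(np.radians(theta_true)) / c * fs   # inter-mic delay in samples

rng = np.random.default_rng(0)
N = 4096
x1 = rng.standard_normal(N)
# Build the second channel by applying the fractional delay in the frequency domain.
ramp = np.exp(-2j * np.pi * np.fft.fftfreq(N) * tau)
x2 = np.real(np.fft.ifft(np.fft.fft(x1) * ramp))

# Cross-spectrum phase is linear in frequency; its slope gives the delay,
# and the delay gives the DOA.
C = np.fft.fft(x1) * np.conj(np.fft.fft(x2))
k = np.arange(1, N // 8)               # low bins only, to avoid phase wrapping
phi = np.angle(C[k])
tau_hat = (phi @ k) / (k @ k) * N / (2 * np.pi)
theta_hat = np.degrees(np.arcsin(tau_hat / fs * c / d))
print(round(theta_hat, 2))             # → 30.0
```

A single pair loses resolution away from broadside (the arcsin flattens near ±90°), which is exactly why the paper integrates three pairs at 60° to one another.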

  • Probability Distribution of Time-Series of Speech Spectral Components

    Rajkishore PRASAD  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Audio/Speech Coding
    Vol: E87-A No:3  Page(s): 584-597

    This paper deals with statistical modeling of a time-frequency series of speech (TFSS), obtained by short-time Fourier transform (STFT) analysis of the speech signal picked up by a two-element linear microphone array. We have attempted to find the closest match between the distribution of the TFSS and theoretical distributions such as the Laplacian distribution (LD), Gaussian distribution (GD), and generalized Gaussian distribution (GGD), with parameters estimated from the TFSS data. It has been found that the GGD provides the best models for the real part, imaginary part, and polar magnitude of the time series of the spectral components. The distribution of the polar magnitude is closer to LD than those of the real and imaginary parts, which are themselves strongly Laplacian in character. The phase of the TFSS was found to be uniformly distributed. Using the GGD-based model as the PDF in fixed-point frequency-domain independent component analysis (FDICA) provides better separation performance and significantly improves convergence speed.
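Fitting a GGD to data of this kind is commonly done by moment matching on the ratio E[x^2]/E[|x|]^2, which is what the sketch below implements; this is a generic estimator, not necessarily the paper's exact procedure:

```python
import math
import numpy as np

def ggd_shape(x, lo=0.1, hi=10.0, iters=60):
    """Estimate the GGD shape parameter beta by matching the moment ratio
    E[x^2] / E[|x|]^2 = Gamma(1/b) Gamma(3/b) / Gamma(2/b)^2."""
    r = np.mean(x ** 2) / np.mean(np.abs(x)) ** 2

    def ratio(b):
        return math.gamma(1 / b) * math.gamma(3 / b) / math.gamma(2 / b) ** 2

    # ratio(b) decreases monotonically in b, so bisect for the match.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ratio(mid) > r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
b_lap = ggd_shape(rng.laplace(size=100000))       # Laplacian data: beta close to 1
b_gau = ggd_shape(rng.standard_normal(100000))    # Gaussian data: beta close to 2
print(round(b_lap, 2), round(b_gau, 2))
```

Beta near 1 indicates Laplacian-like (heavy-tailed) data and beta near 2 Gaussian-like, which is how the shape parameter summarizes the paper's distributional findings.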

  • Measurement of Early Reflections in a Room with Five Microphone System

    Chulmin CHOI  Lae-Hoon KIM  Yangki OH  Sejin DOO  Koeng-Mo SUNG  

     
    LETTER-Engineering Acoustics
    Vol: E86-A No:12  Page(s): 3283-3287

    The measurement of the three-dimensional behavior of early reflections in a sound field has been an important issue in auditorium acoustics, since the reflection profile has been found to be strongly correlated with the subjective response of a listener. To detect the incidence angle and relative amplitude of reflections, a 4-point microphone system has conventionally been used. A new measurement system with five microphones is proposed in this paper: microphones are located at each of the four apices of a tetrahedron and at its center of gravity. Early reflections, including simultaneously incident reflections, which the previous 4-point microphone system could not discriminate as individual wavefronts, were successfully found with the new system. In order to calculate accurate image source positions, it is necessary to determine the exact peak positions from measured impulse responses composed of highly deformed and overlapped impulse trains. For this purpose, a peak-detecting algorithm, which finds dominant peaks in the impulse response by an iterative method, is introduced. In this paper, the theoretical background and features of the 5-microphone system are described, along with some experimental results obtained using the system.

  • Voice Activity Detection with Array Signal Processing in the Wavelet Domain

    Yusuke HIOKA  Nozomu HAMADA  

     
    PAPER-Engineering Acoustics
    Vol: E86-A No:11  Page(s): 2802-2811

    In speech enhancement with an adaptive microphone array, voice activity detection (VAD) is indispensable for controlling the adaptation. Even though many VAD methods have been proposed as pre-processors for speech recognition and compression, they can hardly discriminate the nonstationary interferences that frequently exist in real environments. In this research, we propose a novel VAD method based on array signal processing in the wavelet domain. In that domain we can integrate temporal, spectral, and spatial information to achieve robust voice activity discrimination for a nonstationary interference arriving from a direction close to that of the speech. The signals acquired by the microphone array are first decomposed into appropriate subbands using the wavelet packet to extract their temporal and spectral features. Then a directionality check and direction estimation are executed on each subband to perform VAD with respect to the spatial information. Computer simulation results for sound data demonstrate that the proposed method retains its discriminability even for an interference arriving from a direction close to that of the speech.

  • Stable Learning Algorithm for Blind Separation of Temporally Correlated Acoustic Signals Combining Multistage ICA and Linear Prediction

    Tsuyoki NISHIKAWA  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER
    Vol: E86-A No:8  Page(s): 2028-2036

    We propose a stable algorithm for blind source separation (BSS) combining multistage ICA (MSICA) and linear prediction. MSICA, previously proposed by the authors, is a method in which frequency-domain ICA (FDICA) for a rough separation is followed by time-domain ICA (TDICA) to remove residual crosstalk. For temporally correlated signals, we must use TDICA with a nonholonomic constraint to avoid the decorrelation effect of the holonomic constraint; however, stability cannot be guaranteed in the nonholonomic case. To solve the problem, linear predictors estimated from the signals roughly separated by FDICA are inserted before the holonomic TDICA as prewhitening, and dewhitening is performed after TDICA. The stability of the proposed algorithm is guaranteed by the holonomic constraint, while the pre/dewhitening processing prevents the decorrelation. Experiments in a reverberant room reveal that the algorithm achieves higher stability and separation performance.
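The prewhitening step can be sketched with linear predictors computed by the standard Levinson-Durbin recursion; the AR(2) test signal and model order below are illustrative, not from the paper:

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients [1, a1, ..., ap] via Levinson-Durbin on the biased
    autocorrelation, plus the final prediction-error power."""
    n = len(x)
    r = np.array([x[:n - k] @ x[k:] for k in range(order + 1)]) / n
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[1:i][::-1]
        k = -acc / e
        prev = a[1:i].copy()
        a[1:i] = prev + k * prev[::-1]
        a[i] = k
        e *= 1.0 - k * k
    return a, e

# Toy temporally correlated signal: x[n] = 0.75 x[n-1] - 0.5 x[n-2] + w[n]
rng = np.random.default_rng(0)
w = rng.standard_normal(50000)
x = np.zeros_like(w)
for t in range(2, len(x)):
    x[t] = 0.75 * x[t - 1] - 0.5 * x[t - 2] + w[t]

a, e = lpc(x, 2)                      # expect a ≈ [1, -0.75, 0.5]
res = np.convolve(x, a)[:len(x)]      # prewhitened residual ≈ white noise w
print(a)
```

Dewhitening is the inverse operation, filtering the separated residuals through 1/A(z); whitening first lets the holonomic TDICA run stably without decorrelating the genuinely correlated sources.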

  • Speaker Tracking for Hands-Free Continuous Speech Recognition in Noise Based on a Spectrum-Entropy Beamforming Method

    George NOKAS  Evangelos DERMATAS  

     
    LETTER-Speech and Hearing
    Vol: E86-D No:4  Page(s): 755-758

    In this paper, we present a novel beamformer capable of tracking a rapidly moving speaker in a very noisy environment. The localization algorithm extracts a set of candidate directions of arrival (DOAs) for the signal sources using array signal processing methods in the frequency domain. A minimum variance (MV) beamformer identifies the speech signal DOA as the direction in which the signal's spectral entropy is minimized. A fine-tuning process then detects the MV direction closest to the initial estimate using a smaller analysis window. Extensive experiments, carried out in the range of 20-0 dB SNR, show significant improvement in the recognition rate for a moving speaker, especially at very low SNRs (from 11.11% to 43.79% at 0 dB SNR in an anechoic environment, and from 9.9% to 30.51% in a reverberant environment).
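Spectral entropy, the quantity minimized to pick the speech DOA, can be sketched as the Shannon entropy of the normalized power spectrum; the signals below are toy stand-ins for beamformer outputs steered toward speech and toward noise:

```python
import numpy as np

def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum: peaky, harmonic
    (speech-like) spectra score low, flat noise-like spectra score high."""
    p = np.abs(np.fft.rfft(frame)) ** 2
    p = p / p.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum()

fs = 8000
t = np.arange(1024) / fs
voiced = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)  # harmonic toy "speech"
noise = np.random.default_rng(0).standard_normal(1024)                    # flat-spectrum noise

print(spectral_entropy(voiced) < spectral_entropy(noise))   # → True
```

Scanning candidate DOAs and keeping the one whose beamformer output has minimum spectral entropy therefore favors the direction of the harmonic speech source over broadband interference.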

  • Blind Source Separation of Acoustic Signals Based on Multistage ICA Combining Frequency-Domain ICA and Time-Domain ICA

    Tsuyoki NISHIKAWA  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Digital Signal Processing
    Vol: E86-A No:4  Page(s): 846-858

    We propose a new algorithm for blind source separation (BSS), in which frequency-domain independent component analysis (FDICA) and time-domain ICA (TDICA) are combined to achieve a superior source-separation performance under reverberant conditions. Generally speaking, conventional TDICA fails to separate source signals under heavily reverberant conditions because of the low convergence in the iterative learning of the inverse of the mixing system. On the other hand, the separation performance of conventional FDICA also degrades significantly because the independence assumption of narrow-band signals collapses when the number of subbands increases. In the proposed method, the separated signals of FDICA are regarded as the input signals for TDICA, and we can remove the residual crosstalk components of FDICA by using TDICA. The experimental results obtained under the reverberant condition reveal that the separation performance of the proposed method is superior to those of TDICA- and FDICA-based BSS methods.

  • Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing

    Hiroshi SARUWATARI  Toshiya KAWAMURA  Tsuyoki NISHIKAWA  Kiyohiro SHIKANO  

     
    LETTER-Convolutive Systems
    Vol: E86-A No:3  Page(s): 634-639

    We propose a new algorithm for blind source separation (BSS), in which independent component analysis (ICA) and beamforming are combined to resolve the slow convergence of ICA optimization. The proposed method consists of two parts: frequency-domain ICA with direction-of-arrival (DOA) estimation, and null beamforming based on the estimated DOA. Alternating the learning between ICA and beamforming realizes fast and accurate convergence of the optimization. The results of signal separation experiments reveal that the separation performance of the proposed algorithm is superior to that of the conventional ICA-based BSS method.
