In this paper, we present a novel beamformer capable of tracking a rapidly moving speaker in a very noisy environment. The localization algorithm extracts a set of candidate directions of arrival (DOAs) for the signal sources using array signal processing methods in the frequency domain. A minimum variance (MV) beamformer identifies the speech signal DOA as the direction in which the signal's spectrum entropy is minimized. A fine-tuning process then detects the MV direction closest to the initial estimate using a smaller analysis window. Extensive experiments, carried out over the range of 20 to 0 dB SNR, show significant improvement in the recognition rate for a moving speaker, especially at very low SNRs (from 11.11% to 43.79% at 0 dB SNR in an anechoic environment and from 9.9% to 30.51% in a reverberant environment).
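The entropy-based selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact entropy definition, so this assumes the Shannon entropy of a power spectrum normalized to sum to 1. The names `spectral_entropy` and `pick_speech_doa` and the example spectra are hypothetical, and the spectra stand in for beamformer outputs steered toward each candidate DOA. Speech concentrates its energy in a few frequency bands, so its spectrum tends to have lower entropy than broadband noise.

```python
import math


def spectral_entropy(power_spectrum):
    """Shannon entropy (in bits) of a power spectrum normalized to sum to 1.

    Assumption: the paper's exact entropy definition may differ.
    """
    total = sum(power_spectrum)
    if total <= 0:
        raise ValueError("power spectrum must have positive total energy")
    entropy = 0.0
    for value in power_spectrum:
        p = value / total
        if p > 0:  # bins with zero energy contribute nothing
            entropy -= p * math.log2(p)
    return entropy


def pick_speech_doa(candidates):
    """Return the candidate DOA whose output spectrum has minimum entropy.

    `candidates` maps a DOA in degrees to the power spectrum of the
    (hypothetical) beamformer output steered toward that direction.
    """
    return min(candidates, key=lambda doa: spectral_entropy(candidates[doa]))


# Hypothetical spectra: one flat and noise-like, one peaky and speech-like.
candidates = {
    -30.0: [1.0, 1.0, 1.0, 1.0],
    20.0: [8.0, 1.0, 0.5, 0.5],
}
speech_doa = pick_speech_doa(candidates)
```

Here the flat spectrum reaches the maximum entropy for four bins (2 bits), so the peaky spectrum at 20 degrees is selected. A real system would compute these spectra from the MV beamformer output for each candidate direction and analysis window.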
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
George NOKAS, Evangelos DERMATAS, "Speaker Tracking for Hands-Free Continuous Speech Recognition in Noise Based on a Spectrum-Entropy Beamforming Method," in IEICE TRANSACTIONS on Information, vol. E86-D, no. 4, pp. 755-758, April 2003.
URL: https://global.ieice.org/en_transactions/information/10.1587/e86-d_4_755/_p