Nonnegative matrix factorization (NMF) is one of the most popular machine learning tools for speech enhancement. The supervised NMF-based speech enhancement is accomplished by updating iteratively with the prior knowledge of the clean speech and noise spectra bases. However, in many real-world scenarios, it is not always possible for conducting any prior training. The traditional semi-supervised NMF (SNMF) version overcomes this shortcoming while the performance degrades. In this letter, without any prior knowledge of the speech and noise, we present an improved semi-supervised NMF-based speech enhancement algorithm combining techniques of NMF and robust principal component analysis (RPCA). In this approach, fixed speech bases are obtained from the training samples chosen from public dateset offline. The noise samples used for noise bases training, instead of characterizing a priori as usual, can be obtained via RPCA algorithm on the fly. This letter also conducts a study on the assumption whether the time length of the estimated noise samples may have an effect on the performance of the algorithm. Three metrics, including PESQ, SDR and SNR are applied to evaluate the performance of the algorithms by making experiments on TIMIT with 20 noise types at various signal-to-noise ratio levels. Extensive experimental results demonstrate the superiority of the proposed algorithm over the competing speech enhancement algorithm.
Yonggang HU
Australian National University
Xiongwei ZHANG
PLA University of Science and Technology
Xia ZOU
PLA University of Science and Technology
Meng SUN
PLA University of Science and Technology
Yunfei ZHENG
PLA University of Science and Technology,the Army Officer Academy of PLA and Key Laboratory of Polarization Imaging Detection Technology
Gang MIN
XI'AN Communications Institute
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Yonggang HU, Xiongwei ZHANG, Xia ZOU, Meng SUN, Yunfei ZHENG, Gang MIN, "Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis" in IEICE TRANSACTIONS on Fundamentals,
vol. E100-A, no. 8, pp. 1714-1719, August 2017, doi: 10.1587/transfun.E100.A.1714.
Abstract: Nonnegative matrix factorization (NMF) is one of the most popular machine learning tools for speech enhancement. The supervised NMF-based speech enhancement is accomplished by updating iteratively with the prior knowledge of the clean speech and noise spectra bases. However, in many real-world scenarios, it is not always possible for conducting any prior training. The traditional semi-supervised NMF (SNMF) version overcomes this shortcoming while the performance degrades. In this letter, without any prior knowledge of the speech and noise, we present an improved semi-supervised NMF-based speech enhancement algorithm combining techniques of NMF and robust principal component analysis (RPCA). In this approach, fixed speech bases are obtained from the training samples chosen from public dateset offline. The noise samples used for noise bases training, instead of characterizing a priori as usual, can be obtained via RPCA algorithm on the fly. This letter also conducts a study on the assumption whether the time length of the estimated noise samples may have an effect on the performance of the algorithm. Three metrics, including PESQ, SDR and SNR are applied to evaluate the performance of the algorithms by making experiments on TIMIT with 20 noise types at various signal-to-noise ratio levels. Extensive experimental results demonstrate the superiority of the proposed algorithm over the competing speech enhancement algorithm.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E100.A.1714/_p
Copy
@ARTICLE{e100-a_8_1714,
author={Yonggang HU, Xiongwei ZHANG, Xia ZOU, Meng SUN, Yunfei ZHENG, Gang MIN, },
journal={IEICE TRANSACTIONS on Fundamentals},
title={Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis},
year={2017},
volume={E100-A},
number={8},
pages={1714-1719},
abstract={Nonnegative matrix factorization (NMF) is one of the most popular machine learning tools for speech enhancement. The supervised NMF-based speech enhancement is accomplished by updating iteratively with the prior knowledge of the clean speech and noise spectra bases. However, in many real-world scenarios, it is not always possible for conducting any prior training. The traditional semi-supervised NMF (SNMF) version overcomes this shortcoming while the performance degrades. In this letter, without any prior knowledge of the speech and noise, we present an improved semi-supervised NMF-based speech enhancement algorithm combining techniques of NMF and robust principal component analysis (RPCA). In this approach, fixed speech bases are obtained from the training samples chosen from public dateset offline. The noise samples used for noise bases training, instead of characterizing a priori as usual, can be obtained via RPCA algorithm on the fly. This letter also conducts a study on the assumption whether the time length of the estimated noise samples may have an effect on the performance of the algorithm. Three metrics, including PESQ, SDR and SNR are applied to evaluate the performance of the algorithms by making experiments on TIMIT with 20 noise types at various signal-to-noise ratio levels. Extensive experimental results demonstrate the superiority of the proposed algorithm over the competing speech enhancement algorithm.},
keywords={},
doi={10.1587/transfun.E100.A.1714},
ISSN={1745-1337},
month={August},}
Copy
TY - JOUR
TI - Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1714
EP - 1719
AU - Yonggang HU
AU - Xiongwei ZHANG
AU - Xia ZOU
AU - Meng SUN
AU - Yunfei ZHENG
AU - Gang MIN
PY - 2017
DO - 10.1587/transfun.E100.A.1714
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E100-A
IS - 8
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - August 2017
AB - Nonnegative matrix factorization (NMF) is one of the most popular machine learning tools for speech enhancement. The supervised NMF-based speech enhancement is accomplished by updating iteratively with the prior knowledge of the clean speech and noise spectra bases. However, in many real-world scenarios, it is not always possible for conducting any prior training. The traditional semi-supervised NMF (SNMF) version overcomes this shortcoming while the performance degrades. In this letter, without any prior knowledge of the speech and noise, we present an improved semi-supervised NMF-based speech enhancement algorithm combining techniques of NMF and robust principal component analysis (RPCA). In this approach, fixed speech bases are obtained from the training samples chosen from public dateset offline. The noise samples used for noise bases training, instead of characterizing a priori as usual, can be obtained via RPCA algorithm on the fly. This letter also conducts a study on the assumption whether the time length of the estimated noise samples may have an effect on the performance of the algorithm. Three metrics, including PESQ, SDR and SNR are applied to evaluate the performance of the algorithms by making experiments on TIMIT with 20 noise types at various signal-to-noise ratio levels. Extensive experimental results demonstrate the superiority of the proposed algorithm over the competing speech enhancement algorithm.
ER -