This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard non-intrusive method, P.563, performs poorly at intelligibility estimation in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard intrusive measure of speech intelligibility that requires a reference speech signal. In comparisons on speech signals in various noise environments, the proposed method outperforms both the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods.
Yoonhee KIM
Seoul National University of Science and Technology
Deokgyu YUN
Seoul National University of Science and Technology
Hannah LEE
Seoul National University of Science and Technology
Seung Ho CHOI
Seoul National University of Science and Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yoonhee KIM, Deokgyu YUN, Hannah LEE, Seung Ho CHOI, "A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 3, pp. 714-715, March 2020, doi: 10.1587/transinf.2019EDL8150.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDL8150/_p
@ARTICLE{e103-d_3_714,
author={Yoonhee KIM and Deokgyu YUN and Hannah LEE and Seung Ho CHOI},
journal={IEICE TRANSACTIONS on Information},
title={A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features},
year={2020},
volume={E103-D},
number={3},
pages={714--715},
abstract={This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard non-intrusive method, P.563, performs poorly at intelligibility estimation in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard intrusive measure of speech intelligibility that requires a reference speech signal. In comparisons on speech signals in various noise environments, the proposed method outperforms both the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods.},
keywords={},
doi={10.1587/transinf.2019EDL8150},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features
T2 - IEICE TRANSACTIONS on Information
SP - 714
EP - 715
AU - Yoonhee KIM
AU - Deokgyu YUN
AU - Hannah LEE
AU - Seung Ho CHOI
PY - 2020
DO - 10.1587/transinf.2019EDL8150
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2020
AB - This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard non-intrusive method, P.563, performs poorly at intelligibility estimation in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard intrusive measure of speech intelligibility that requires a reference speech signal. In comparisons on speech signals in various noise environments, the proposed method outperforms both the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods.
ER -