A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features

Yoonhee KIM; Deokgyu YUN; Hannah LEE; Seung Ho CHOI

doi:10.1587/transinf.2019EDL8150

IEICE TRANSACTIONS on Information

Summary:

This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard for non-intrusive estimation, P.563, estimates intelligibility poorly in various noise environments. We propose a more accurate method based on a long short-term memory (LSTM) neural network whose input is the autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score, where STOI is a standard intrusive measure of speech intelligibility that requires a reference speech signal. We show, by comparison with the conventional standard P.563 and with mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods, that the proposed method performs better for speech signals in various noise environments.
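
The data flow described in the summary can be sketched as a forward pass: each input frame is compressed by an encoder into a bottleneck vector, the sequence of bottleneck vectors is fed to an LSTM, and the final hidden state is mapped to a single score. The following is only an illustrative sketch, not the authors' implementation. The layer sizes, the random untrained weights, the omitted bias terms, the sigmoid output layer, and all function names are assumptions made for the example; the record does not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real system would learn these (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(frame, w_enc):
    """Encoder half of an autoencoder: input frame -> bottleneck features."""
    return [math.tanh(a) for a in matvec(w_enc, frame)]


def lstm_step(x, h, c, params):
    """One standard LSTM cell update (biases omitted for brevity)."""
    z = x + h  # concatenate the input with the previous hidden state
    i = [sigmoid(a) for a in matvec(params["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(params["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(params["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(params["g"], z)]  # candidate cell state
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def estimate_intelligibility(frames, w_enc, lstm_params, w_out, hidden_dim):
    """Run bottleneck features through the LSTM and map the last state to a score."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for frame in frames:
        h, c = lstm_step(encode(frame, w_enc), h, c, lstm_params)
    # Sigmoid keeps the estimate in (0, 1); this output layer is an assumption.
    return sigmoid(sum(w * x for w, x in zip(w_out, h)))


# Hypothetical sizes chosen only to keep the example small.
FRAME_DIM, BOTTLENECK_DIM, HIDDEN_DIM, NUM_FRAMES = 16, 4, 8, 20

rng = random.Random(0)
w_enc = rand_matrix(BOTTLENECK_DIM, FRAME_DIM, rng)
lstm_params = {
    gate: rand_matrix(HIDDEN_DIM, BOTTLENECK_DIM + HIDDEN_DIM, rng)
    for gate in ("i", "f", "o", "g")
}
w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_DIM)]

# Stand-in for the per-frame features of a noisy utterance (random numbers here).
frames = [[rng.uniform(-1.0, 1.0) for _ in range(FRAME_DIM)] for _ in range(NUM_FRAMES)]

score = estimate_intelligibility(frames, w_enc, lstm_params, w_out, HIDDEN_DIM)
print(f"estimated score: {score:.3f}")
```

With untrained weights the printed score is meaningless. In a trained system of this kind, the regression target would presumably be the STOI score computed from the clean reference during training, so that no reference signal is needed at estimation time, which is what makes the estimator non-intrusive.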

Publication: IEICE TRANSACTIONS on Information Vol.E103-D No.3 pp.714-715

Publication Date: 2020/03/01

Publicized: 2019/12/11

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2019EDL8150

Type of Manuscript: LETTER

Category: Speech and Hearing

Authors

Yoonhee KIM
  Seoul National University of Science and Technology
Deokgyu YUN
  Seoul National University of Science and Technology
Hannah LEE
  Seoul National University of Science and Technology
Seung Ho CHOI
  Seoul National University of Science and Technology

Keywords

autoencoder, bottleneck feature, STOI, deep learning, long short-term memory (LSTM)

Cite this

Yoonhee KIM, Deokgyu YUN, Hannah LEE, Seung Ho CHOI, "A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 3, pp. 714-715, March 2020, doi: 10.1587/transinf.2019EDL8150.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDL8150/_p

@ARTICLE{e103-d_3_714,
author={Yoonhee KIM and Deokgyu YUN and Hannah LEE and Seung Ho CHOI},
journal={IEICE TRANSACTIONS on Information},
title={A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features},
year={2020},
volume={E103-D},
number={3},
pages={714--715},
abstract={This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard for non-intrusive estimation, P.563, estimates intelligibility poorly in various noise environments. We propose a more accurate method based on a long short-term memory (LSTM) neural network whose input is the autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score, where STOI is a standard intrusive measure of speech intelligibility that requires a reference speech signal. We show, by comparison with the conventional standard P.563 and with mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods, that the proposed method performs better for speech signals in various noise environments.},
keywords={autoencoder, bottleneck feature, STOI, deep learning, long short-term memory (LSTM)},
doi={10.1587/transinf.2019EDL8150},
ISSN={1745-1361},
month={March},}

TY  - JOUR
TI  - A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features
T2  - IEICE TRANSACTIONS on Information
SP  - 714
EP  - 715
AU  - Yoonhee KIM
AU  - Deokgyu YUN
AU  - Hannah LEE
AU  - Seung Ho CHOI
PY  - 2020
DO  - 10.1587/transinf.2019EDL8150
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E103-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information
Y1  - March 2020
AB  - This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard for non-intrusive estimation, P.563, estimates intelligibility poorly in various noise environments. We propose a more accurate method based on a long short-term memory (LSTM) neural network whose input is the autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score, where STOI is a standard intrusive measure of speech intelligibility that requires a reference speech signal. We show, by comparison with the conventional standard P.563 and with mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods, that the proposed method performs better for speech signals in various noise environments.
ER  -