A Non-Intrusive Speech Quality Evaluation Method Based on the Audiogram and Weighted Frequency Information for Hearing Aid

Ruxue GUO; Pengxu JIANG; Ruiyu LIANG; Yue XIE; Cairong ZOU

doi:10.1587/transfun.2022EAL2040

IEICE TRANSACTIONS on Fundamentals

A Non-Intrusive Speech Quality Evaluation Method Based on the Audiogram and Weighted Frequency Information for Hearing Aid

Ruxue GUO, Pengxu JIANG, Ruiyu LIANG, Yue XIE, Cairong ZOU

Full Text Views

0

Cite this

Summary :

For a long time, the compensation effect of hearing aid is mainly evaluated subjectively, and there are fewer studies of objective evaluation. Furthermore, a pure speech signal is generally required as a reference in the existing objective evaluation methods, which restricts the practicality in a real-world environment. Therefore, this paper presents a non-intrusive speech quality evaluation method for hearing aid, which combines the audiogram and weighted frequency information. The proposed model mainly includes an audiogram information extraction network, a frequency information extraction network, and a quality score mapping network. The audiogram is the input of the audiogram information extraction network, which helps the system capture the information related to hearing loss. In addition, the low-frequency bands of speech contain loudness information and the medium and high-frequency components contribute to semantic comprehension. The information of two frequency bands is input to the frequency information extraction network to obtain time-frequency information. When obtaining the high-level features of different frequency bands and audiograms, they are fused into two groups of tensors that distinguish the information of different frequency bands and used as the input of the attention layer to calculate the corresponding weight distribution. Finally, a dense layer is employed to predict the score of speech quality. The experimental results show that it is reasonable to combine the audiogram and the weight of the information from two frequency bands, which can effectively realize the evaluation of the speech quality of the hearing aid.

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E106-A No.1 pp.64-68

Publication Date: 2023/01/01

Publicized: 2022/07/25

Online ISSN: 1745-1337

DOI: 10.1587/transfun.2022EAL2040

Type of Manuscript: LETTER

Category: Speech and Hearing

Authors

Ruxue GUO
  Southeast University
Pengxu JIANG
  Southeast University
Ruiyu LIANG
  Nanjing Institute of Technology
Yue XIE
  Nanjing Institute of Technology
Cairong ZOU
  Southeast University

Keyword

speech quality evaluation, hearing aid, audiogram, long short-term memory, deep neural networks

Cite this

Copy

Ruxue GUO, Pengxu JIANG, Ruiyu LIANG, Yue XIE, Cairong ZOU, "A Non-Intrusive Speech Quality Evaluation Method Based on the Audiogram and Weighted Frequency Information for Hearing Aid" in IEICE TRANSACTIONS on Fundamentals, vol. E106-A, no. 1, pp. 64-68, January 2023, doi: 10.1587/transfun.2022EAL2040.
Abstract: For a long time, the compensation effect of hearing aid is mainly evaluated subjectively, and there are fewer studies of objective evaluation. Furthermore, a pure speech signal is generally required as a reference in the existing objective evaluation methods, which restricts the practicality in a real-world environment. Therefore, this paper presents a non-intrusive speech quality evaluation method for hearing aid, which combines the audiogram and weighted frequency information. The proposed model mainly includes an audiogram information extraction network, a frequency information extraction network, and a quality score mapping network. The audiogram is the input of the audiogram information extraction network, which helps the system capture the information related to hearing loss. In addition, the low-frequency bands of speech contain loudness information and the medium and high-frequency components contribute to semantic comprehension. The information of two frequency bands is input to the frequency information extraction network to obtain time-frequency information. When obtaining the high-level features of different frequency bands and audiograms, they are fused into two groups of tensors that distinguish the information of different frequency bands and used as the input of the attention layer to calculate the corresponding weight distribution. Finally, a dense layer is employed to predict the score of speech quality. The experimental results show that it is reasonable to combine the audiogram and the weight of the information from two frequency bands, which can effectively realize the evaluation of the speech quality of the hearing aid.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2022EAL2040/_p

Copy

@ARTICLE{e106-a_1_64,
author={Ruxue GUO, Pengxu JIANG, Ruiyu LIANG, Yue XIE, Cairong ZOU, },
journal={IEICE TRANSACTIONS on Fundamentals},
title={A Non-Intrusive Speech Quality Evaluation Method Based on the Audiogram and Weighted Frequency Information for Hearing Aid},
year={2023},
volume={E106-A},
number={1},
pages={64-68},
abstract={For a long time, the compensation effect of hearing aid is mainly evaluated subjectively, and there are fewer studies of objective evaluation. Furthermore, a pure speech signal is generally required as a reference in the existing objective evaluation methods, which restricts the practicality in a real-world environment. Therefore, this paper presents a non-intrusive speech quality evaluation method for hearing aid, which combines the audiogram and weighted frequency information. The proposed model mainly includes an audiogram information extraction network, a frequency information extraction network, and a quality score mapping network. The audiogram is the input of the audiogram information extraction network, which helps the system capture the information related to hearing loss. In addition, the low-frequency bands of speech contain loudness information and the medium and high-frequency components contribute to semantic comprehension. The information of two frequency bands is input to the frequency information extraction network to obtain time-frequency information. When obtaining the high-level features of different frequency bands and audiograms, they are fused into two groups of tensors that distinguish the information of different frequency bands and used as the input of the attention layer to calculate the corresponding weight distribution. Finally, a dense layer is employed to predict the score of speech quality. The experimental results show that it is reasonable to combine the audiogram and the weight of the information from two frequency bands, which can effectively realize the evaluation of the speech quality of the hearing aid.},
keywords={},
doi={10.1587/transfun.2022EAL2040},
ISSN={1745-1337},
month={January},}

Copy

TY - JOUR
TI - A Non-Intrusive Speech Quality Evaluation Method Based on the Audiogram and Weighted Frequency Information for Hearing Aid
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 64
EP - 68
AU - Ruxue GUO
AU - Pengxu JIANG
AU - Ruiyu LIANG
AU - Yue XIE
AU - Cairong ZOU
PY - 2023
DO - 10.1587/transfun.2022EAL2040
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E106-A
IS - 1
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - January 2023
AB - For a long time, the compensation effect of hearing aid is mainly evaluated subjectively, and there are fewer studies of objective evaluation. Furthermore, a pure speech signal is generally required as a reference in the existing objective evaluation methods, which restricts the practicality in a real-world environment. Therefore, this paper presents a non-intrusive speech quality evaluation method for hearing aid, which combines the audiogram and weighted frequency information. The proposed model mainly includes an audiogram information extraction network, a frequency information extraction network, and a quality score mapping network. The audiogram is the input of the audiogram information extraction network, which helps the system capture the information related to hearing loss. In addition, the low-frequency bands of speech contain loudness information and the medium and high-frequency components contribute to semantic comprehension. The information of two frequency bands is input to the frequency information extraction network to obtain time-frequency information. When obtaining the high-level features of different frequency bands and audiograms, they are fused into two groups of tensors that distinguish the information of different frequency bands and used as the input of the attention layer to calculate the corresponding weight distribution. Finally, a dense layer is employed to predict the score of speech quality. The experimental results show that it is reasonable to combine the audiogram and the weight of the information from two frequency bands, which can effectively realize the evaluation of the speech quality of the hearing aid.
ER -

IEICE TRANSACTIONS on Fundamentals