We propose a deep learning-based model for classifying pathological voices using a convolutional neural network and a feedforward neural network. The model uses combinations of heterogeneous parameters, including mel-frequency cepstral coefficients, linear predictive cepstral coefficients and higher-order statistics. We validate the accuracy of this model using the Massachusetts Eye and Ear Infirmary (MEEI) voice disorder database and the Saarbruecken Voice Database (SVD). Our model achieved an accuracy of 99.3% for MEEI and 75.18% for SVD. This model achieved an accuracy that is 7.18% higher than that of competitive models in previous studies.
JiYeoun LEE
Jungwon University
Hee-Jin CHOI
KAIST
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
JiYeoun LEE, Hee-Jin CHOI, "Deep Learning Approaches for Pathological Voice Detection Using Heterogeneous Parameters" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 8, pp. 1920-1923, August 2020, doi: 10.1587/transinf.2020EDL8031.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8031/_p
@ARTICLE{e103-d_8_1920,
author={JiYeoun LEE and Hee-Jin CHOI},
journal={IEICE TRANSACTIONS on Information},
title={Deep Learning Approaches for Pathological Voice Detection Using Heterogeneous Parameters},
year={2020},
volume={E103-D},
number={8},
pages={1920-1923},
abstract={We propose a deep learning-based model for classifying pathological voices using a convolutional neural network and a feedforward neural network. The model uses combinations of heterogeneous parameters, including mel-frequency cepstral coefficients, linear predictive cepstral coefficients and higher-order statistics. We validate the accuracy of this model using the Massachusetts Eye and Ear Infirmary (MEEI) voice disorder database and the Saarbruecken Voice Database (SVD). Our model achieved an accuracy of 99.3% for MEEI and 75.18% for SVD. This model achieved an accuracy that is 7.18% higher than that of competitive models in previous studies.},
keywords={},
doi={10.1587/transinf.2020EDL8031},
ISSN={1745-1361},
month={August},
}
TY - JOUR
TI - Deep Learning Approaches for Pathological Voice Detection Using Heterogeneous Parameters
T2 - IEICE TRANSACTIONS on Information
SP - 1920
EP - 1923
AU - JiYeoun LEE
AU - Hee-Jin CHOI
PY - 2020
DO - 10.1587/transinf.2020EDL8031
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - 2020/08//
AB - We propose a deep learning-based model for classifying pathological voices using a convolutional neural network and a feedforward neural network. The model uses combinations of heterogeneous parameters, including mel-frequency cepstral coefficients, linear predictive cepstral coefficients and higher-order statistics. We validate the accuracy of this model using the Massachusetts Eye and Ear Infirmary (MEEI) voice disorder database and the Saarbruecken Voice Database (SVD). Our model achieved an accuracy of 99.3% for MEEI and 75.18% for SVD. This model achieved an accuracy that is 7.18% higher than that of competitive models in previous studies.
ER -