We propose a non-invasive and cost-effective method to automatically detect dementia using only speech audio data. We extract paralinguistic features from short speech segments and use a Gated Convolutional Neural Network (GCNN) to classify each segment as dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method achieves an accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. On the PROMPT Database, it achieves an accuracy of 74.7% using 4 seconds of speech data, which improves to 80.8% when we use all of a patient's speech data. Furthermore, we evaluate our method on a three-class classification problem that includes the Mild Cognitive Impairment (MCI) class and achieve an accuracy of 60.6% with 40 seconds of speech data.
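To make the gating idea in the abstract concrete, the following is a minimal NumPy sketch of a generic gated convolution (a linear convolution multiplied element-wise by a sigmoid gate) applied to a sequence of per-frame feature vectors. The abstract does not specify the layer sizes, the feature set, or how segment-level outputs are combined per speaker, so the function names, the shapes, and the probability-averaging step in `aggregate_segments` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv1d(x, w, b):
    """Valid 1-D convolution (cross-correlation) along the time axis.

    x: (T, C_in) feature sequence
    w: (K, C_in, C_out) kernel
    b: (C_out,) bias
    returns: (T - K + 1, C_out)
    """
    k = w.shape[0]
    t_out = x.shape[0] - k + 1
    out = np.empty((t_out, w.shape[2]))
    for t in range(t_out):
        # Contract the (K, C_in) window against the kernel.
        out[t] = np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) + b
    return out


def gated_conv1d(x, w, b, v, c):
    """Generic gated convolution: (x * w + b) multiplied by sigmoid(x * v + c)."""
    return conv1d(x, w, b) * sigmoid(conv1d(x, v, c))


def aggregate_segments(segment_probs):
    """Assumed aggregation: average per-segment class probabilities
    and pick the most probable class for the speaker."""
    mean_probs = np.mean(np.asarray(segment_probs), axis=0)
    return int(np.argmax(mean_probs)), mean_probs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((40, 8))              # 40 frames, 8 hypothetical features
    w = rng.standard_normal((3, 8, 16)) * 0.1     # linear path kernel
    v = rng.standard_normal((3, 8, 16)) * 0.1     # gate path kernel
    h = gated_conv1d(x, w, np.zeros(16), v, np.zeros(16))
    print(h.shape)
    print(aggregate_segments([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])[0])
```

The sigmoid gate lets the layer scale each output channel at each time step between 0 and 1, which is the property that distinguishes a gated convolution from a plain convolution followed by a fixed nonlinearity.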
Mariana RODRIGUES MAKIUCHI
Tokyo Institute of Technology
Tifani WARNITA
Tokyo Institute of Technology
Nakamasa INOUE
Tokyo Institute of Technology
Koichi SHINODA
Tokyo Institute of Technology
Michitaka YOSHIMURA
Keio University School of Medicine
Momoko KITAZAWA
Keio University School of Medicine
Kei FUNAKI
Keio University School of Medicine
Yoko EGUCHI
Keio University School of Medicine
Taishiro KISHIMOTO
Keio University School of Medicine
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Mariana RODRIGUES MAKIUCHI, Tifani WARNITA, Nakamasa INOUE, Koichi SHINODA, Michitaka YOSHIMURA, Momoko KITAZAWA, Kei FUNAKI, Yoko EGUCHI, Taishiro KISHIMOTO, "Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network" in IEICE TRANSACTIONS on Information,
vol. E104-D, no. 11, pp. 1930-1940, November 2021, doi: 10.1587/transinf.2020EDP7196.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDP7196/_p
@ARTICLE{e104-d_11_1930,
author={Mariana RODRIGUES MAKIUCHI and Tifani WARNITA and Nakamasa INOUE and Koichi SHINODA and Michitaka YOSHIMURA and Momoko KITAZAWA and Kei FUNAKI and Yoko EGUCHI and Taishiro KISHIMOTO},
journal={IEICE TRANSACTIONS on Information},
title={Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network},
year={2021},
volume={E104-D},
number={11},
pages={1930--1940},
abstract={We propose a non-invasive and cost-effective method to automatically detect dementia by utilizing solely speech audio data. We extract paralinguistic features for a short speech segment and use Gated Convolutional Neural Networks (GCNN) to classify it into dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method yields the accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. In the PROMPT Database, our method yields the accuracy of 74.7% using 4 seconds of speech data and it improves to 80.8% when we use all the patient's speech data. Furthermore, we evaluate our method on a three-class classification problem in which we included the Mild Cognitive Impairment (MCI) class and achieved the accuracy of 60.6% with 40 seconds of speech data.},
keywords={},
doi={10.1587/transinf.2020EDP7196},
ISSN={1745-1361},
month={November},
}
TY  - JOUR
TI  - Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network
T2  - IEICE TRANSACTIONS on Information
SP  - 1930
EP  - 1940
AU  - Mariana RODRIGUES MAKIUCHI
AU  - Tifani WARNITA
AU  - Nakamasa INOUE
AU  - Koichi SHINODA
AU  - Michitaka YOSHIMURA
AU  - Momoko KITAZAWA
AU  - Kei FUNAKI
AU  - Yoko EGUCHI
AU  - Taishiro KISHIMOTO
PY  - 2021
DO  - 10.1587/transinf.2020EDP7196
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E104-D
IS  - 11
JA  - IEICE TRANSACTIONS on Information
Y1  - November 2021
AB  - We propose a non-invasive and cost-effective method to automatically detect dementia by utilizing solely speech audio data. We extract paralinguistic features for a short speech segment and use Gated Convolutional Neural Networks (GCNN) to classify it into dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method yields the accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. In the PROMPT Database, our method yields the accuracy of 74.7% using 4 seconds of speech data and it improves to 80.8% when we use all the patient's speech data. Furthermore, we evaluate our method on a three-class classification problem in which we included the Mild Cognitive Impairment (MCI) class and achieved the accuracy of 60.6% with 40 seconds of speech data.
ER  - 