To examine whether rotational invariance plays the main role in spectrogram features, new spectral features based on local normalized center moments, denoted LNCMSF, are proposed. First, LNCMSF uses 2nd order normalized center moments to describe the local energy distribution of the logarithmic energy spectrum, which yields the normalized center moment spectrograms NC1 and NC2. Second, the DCT (Discrete Cosine Transform) is applied to decorrelate NC1 and NC2, which yields the high order cepstral coefficients TNC1 and TNC2. Finally, LNCMSF is formed by combining NC1, NC2, TNC1 and TNC2. The rotational invariance test shows that rotational invariance is not a necessary property for some spectrogram features. The recognition experiment shows that the maximum UA (Unweighted Average of Class-Wise Recall Rate) of LNCMSF is at least 10.7% and 1.2% higher than that of MFCC (Mel Frequency Cepstrum Coefficient) and HuWSF (Weighted Spectral Features Based on Local Hu Moments), respectively.
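The pipeline described in the abstract can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the authors' implementation. The abstract does not say which 2nd order moments define NC1 and NC2, how large the local window is, or which DCT coefficients are kept, so the choices here (NC1 = η20, NC2 = η02, a 3×3 window, all DCT-II coefficients retained) and all function names are assumptions. The sketch also assumes the spectrogram values in each window are non-negative with a positive sum, so that they can be treated as a mass distribution.

```python
import math


def normalized_central_moment(patch, p, q):
    """Normalized central moment eta_pq of a 2-D patch (a list of rows).

    Patch values are treated as a mass distribution; p is the order along
    the row index and q the order along the column index.
    """
    m00 = sum(sum(row) for row in patch)
    if m00 <= 0:
        raise ValueError("patch must have positive total mass")
    # Centroid of the patch.
    r_bar = sum(r * v for r, row in enumerate(patch) for v in row) / m00
    c_bar = sum(c * v for row in patch for c, v in enumerate(row)) / m00
    # Central moment mu_pq about the centroid.
    mu = sum(
        (r - r_bar) ** p * (c - c_bar) ** q * v
        for r, row in enumerate(patch)
        for c, v in enumerate(row)
    )
    # Standard scale normalization: eta_pq = mu_pq / m00^(1 + (p+q)/2).
    return mu / m00 ** (1 + (p + q) / 2)


def dct2(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def local_moment_features(spec, size=3):
    """Per-frame feature vectors from a spectrogram.

    spec is a list of frames, each a list of non-negative band energies.
    Assumed mapping (not given in the abstract): NC1 = eta_20, NC2 = eta_02,
    and TNC1/TNC2 are the DCT-II of NC1/NC2 along the frequency axis.
    Frames and bands without a full window around them are skipped.
    """
    half = size // 2
    feats = []
    for t in range(half, len(spec) - half):
        nc1, nc2 = [], []
        for f in range(half, len(spec[t]) - half):
            patch = [row[f - half:f + half + 1]
                     for row in spec[t - half:t + half + 1]]
            nc1.append(normalized_central_moment(patch, 2, 0))
            nc2.append(normalized_central_moment(patch, 0, 2))
        feats.append(nc1 + nc2 + dct2(nc1) + dct2(nc2))
    return feats
```

One property of this sketch relates to the rotational invariance question: transposing a patch swaps η20 and η02, so the two moments taken individually are not rotation invariant, whereas their sum (the first Hu moment) is.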
Huawei TAO
Southeast University
Ruiyu LIANG
Southeast University, Nanjing Institute of Technology
Xinran ZHANG
Southeast University
Li ZHAO
Southeast University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Huawei TAO, Ruiyu LIANG, Xinran ZHANG, Li ZHAO, "Spectral Features Based on Local Normalized Center Moments for Speech Emotion Recognition" in IEICE TRANSACTIONS on Fundamentals,
vol. E99-A, no. 10, pp. 1863-1866, October 2016, doi: 10.1587/transfun.E99.A.1863.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E99.A.1863/_p
@ARTICLE{e99-a_10_1863,
author={Huawei TAO and Ruiyu LIANG and Xinran ZHANG and Li ZHAO},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Spectral Features Based on Local Normalized Center Moments for Speech Emotion Recognition},
year={2016},
volume={E99-A},
number={10},
pages={1863-1866},
keywords={},
doi={10.1587/transfun.E99.A.1863},
ISSN={1745-1337},
month={October},}
TY - JOUR
TI - Spectral Features Based on Local Normalized Center Moments for Speech Emotion Recognition
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1863
EP - 1866
AU - Huawei TAO
AU - Ruiyu LIANG
AU - Xinran ZHANG
AU - Li ZHAO
PY - 2016
DO - 10.1587/transfun.E99.A.1863
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E99-A
IS - 10
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - October 2016
ER -