In this paper, we propose a robust speech/music classification algorithm that improves the classification performance of the selectable mode vocoder (SMV) of 3GPP2 using deep belief networks (DBNs), which are powerful hierarchical generative models for feature extraction and can determine the underlying discriminative characteristics of the extracted features. The six feature vectors selected from the relevant parameters of the SMV are applied to the visible layer in the proposed DBN-based method. The performance of the proposed algorithm is evaluated using the detection accuracy and error probability of speech and music for various music genres. The proposed algorithm yields better results than the original SMV method and a support vector machine (SVM)-based method.
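The abstract describes feeding six SMV-derived feature vectors into the visible layer of a DBN for speech/music classification. As a rough illustration only (not the authors' implementation), a DBN-style classifier can be sketched as stacked RBMs pretrained layer-wise with a logistic output stage; all layer sizes, hyperparameters, and data below are hypothetical placeholders, not values from the paper.

```python
# Illustrative DBN-style sketch: stacked RBMs (greedy layer-wise pretraining)
# followed by a logistic output layer. Hyperparameters and data are placeholders.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)

# Stand-in data: N frames, each with 6 SMV-derived features scaled to [0, 1]
# (BernoulliRBM expects inputs in [0, 1]); labels 0 = speech, 1 = music.
X = rng.rand(200, 6)
y = (X.mean(axis=1) > 0.5).astype(int)  # synthetic labels for illustration

dbn = Pipeline([
    ("rbm1", BernoulliRBM(n_components=16, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=8, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn.fit(X, y)
acc = dbn.score(X, y)
print(f"training accuracy: {acc:.2f}")
```

In a real system the SVM baseline mentioned in the abstract would be trained on the same six features, so the comparison isolates the effect of the DBN's learned hidden representation.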
Ji-Hyun SONG
Inha University
Hong-Sub AN
Inha University
Sangmin LEE
Inha University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ji-Hyun SONG, Hong-Sub AN, Sangmin LEE, "Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Deep Belief Networks" in IEICE TRANSACTIONS on Fundamentals,
vol. E97-A, no. 2, pp. 661-664, February 2014, doi: 10.1587/transfun.E97.A.661.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E97.A.661/_p
@ARTICLE{e97-a_2_661,
author={Ji-Hyun SONG and Hong-Sub AN and Sangmin LEE},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Deep Belief Networks},
year={2014},
volume={E97-A},
number={2},
pages={661--664},
doi={10.1587/transfun.E97.A.661},
ISSN={1745-1337},
month={February},}
TY - JOUR
TI - Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Deep Belief Networks
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 661
EP - 664
AU - Ji-Hyun SONG
AU - Hong-Sub AN
AU - Sangmin LEE
PY - 2014
DO - 10.1587/transfun.E97.A.661
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E97-A
IS - 2
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - 2014/02//
ER -