This paper proposes a novel method for pitch estimation and voicing classification that uses a spectrum reconstructed from Mel-frequency cepstral coefficients (MFCC). The proposed algorithm reconstructs the spectrum from the MFCC by applying the Moore-Penrose pseudo-inverse of the Mel-scale weighting functions. The reconstructed spectrum is then compressed and filtered in the log-frequency domain. Pitch is estimated by modeling the joint density of the pitch frequency and the filtered spectrum with a Gaussian mixture model (GMM). Voicing classification also uses a GMM-based model, and test results show that over 99% of frames are classified correctly. The pitch estimation results show that the proposed GMM-based estimator is accurate, with a relative error of 6.68% on the TIMIT database.
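The reconstruction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the filterbank shape, the DCT convention, or the frame parameters, so everything below is an assumption chosen for illustration: triangular Mel filters, an orthonormal DCT-II, and the hypothetical helper names `mel_filterbank`, `dct_matrix`, `mfcc`, and `reconstruct_spectrum`. It is not the authors' implementation, and it omits the later compression, log-frequency filtering, and GMM stages.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular Mel-scale weighting functions, shape (n_filters, n_fft // 2 + 1).
    Assumed filter shape; the source does not state it."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    W = np.zeros((n_filters, freqs.size))
    for m in range(n_filters):
        lo, mid, hi = hz_edges[m:m + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        W[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return W

def dct_matrix(n_ceps, n_filters):
    """First n_ceps rows of the orthonormal DCT-II matrix (assumed convention)."""
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    D = np.sqrt(2.0 / n_filters) * np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    D[0] /= np.sqrt(2.0)
    return D

def mfcc(power_spectrum, W, D):
    """Forward path: Mel weighting, log, DCT."""
    return D @ np.log(W @ power_spectrum + 1e-10)

def reconstruct_spectrum(c, W, D, nonneg=True):
    """Invert the DCT, undo the log, then map Mel energies back to
    linear-frequency bins with the Moore-Penrose pseudo-inverse of W."""
    log_mel = D.T @ c              # inverse DCT; truncated coefficients act as zeros
    mel_energies = np.exp(log_mel)
    spectrum = np.linalg.pinv(W) @ mel_energies   # minimum-norm solution
    # The pseudo-inverse can produce small negative values; clipping is an assumption.
    return np.maximum(spectrum, 0.0) if nonneg else spectrum

# Usage with synthetic data (no real speech involved).
rng = np.random.default_rng(0)
W = mel_filterbank(n_filters=8, n_fft=256, sample_rate=8000)
D = dct_matrix(n_ceps=8, n_filters=8)
spectrum = rng.random(129) + 0.1
coeffs = mfcc(spectrum, W, D)
reconstructed = reconstruct_spectrum(coeffs, W, D)
```

Because the Mel weighting matrix has far fewer rows than columns, the pseudo-inverse returns only the minimum-norm spectrum consistent with the Mel energies. Fine harmonic detail is recovered only to the extent that the filterbank resolution preserves it.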
JianFeng WU
Hangzhou Dianzi University
HuiBin QIN
Hangzhou Dianzi University
YongZhu HUA
Hangzhou Dianzi University
LingYan FAN
Hangzhou Dianzi University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
JianFeng WU, HuiBin QIN, YongZhu HUA, LingYan FAN, "Pitch Estimation and Voicing Classification Using Reconstructed Spectrum from MFCC," in IEICE TRANSACTIONS on Information, vol. E101-D, no. 2, pp. 556-559, February 2018, doi: 10.1587/transinf.2017EDL8162.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2017EDL8162/_p
@ARTICLE{e101-d_2_556,
author={JianFeng WU and HuiBin QIN and YongZhu HUA and LingYan FAN},
journal={IEICE TRANSACTIONS on Information},
title={Pitch Estimation and Voicing Classification Using Reconstructed Spectrum from MFCC},
year={2018},
volume={E101-D},
number={2},
pages={556-559},
keywords={},
doi={10.1587/transinf.2017EDL8162},
ISSN={1745-1361},
month={February},}
TY - JOUR
TI - Pitch Estimation and Voicing Classification Using Reconstructed Spectrum from MFCC
T2 - IEICE TRANSACTIONS on Information
SP - 556
EP - 559
AU - JianFeng WU
AU - HuiBin QIN
AU - YongZhu HUA
AU - LingYan FAN
PY - 2018
DO - 10.1587/transinf.2017EDL8162
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2018
ER -