IEICE TRANSACTIONS on Fundamentals

Speech Reconstruction from MFCC Based on Nonnegative and Sparse Priors

Gang MIN, Xiong wei ZHANG, Ji bin YANG, Xia ZOU, Zhi song PAN

Summary:

This letter presents high-quality approaches to reconstructing speech from Mel-frequency cepstral coefficients (MFCC). Taking into account the nonnegative and sparse properties of the speech power spectrum, nonnegative l2 norm (NL2) and weighted nonnegative l2 norm (NWL2) minimization approaches, both based on the alternating direction method of multipliers (ADMM), are proposed to cope with the under-determined nature of the reconstruction problem. The phase spectrum is recovered by the well-known LSE-ISTFTM algorithm. Experimental results demonstrate that the NL2 and NWL2 approaches achieve substantially better reconstructed speech quality than the conventional l2 norm minimization approach. With high-resolution MFCC, the reconstructed speech sounds very close to the original, and the PESQ score reaches 4.0.
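The summary does not give the exact objective, the NWL2 weights, or the filterbank, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes the problem can be written as: find a nonnegative power spectrum z with the smallest l2 norm whose filterbank energies match the observed ones (A z = y, with fewer filters than frequency bins). The names `triangular_filterbank` and `nonneg_min_l2_admm`, the linearly spaced triangular filters (a stand-in for a real Mel filterbank), and the parameter `rho` are all illustrative assumptions.

```python
import numpy as np


def triangular_filterbank(n_filters, n_bins):
    """Hypothetical stand-in for a Mel filterbank: overlapping
    triangular filters with linearly spaced centers."""
    centers = np.linspace(0.0, n_bins - 1.0, n_filters + 2)
    k = np.arange(n_bins)
    A = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, mid, right = centers[i], centers[i + 1], centers[i + 2]
        rising = (k - left) / (mid - left)
        falling = (right - k) / (right - mid)
        A[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return A


def nonneg_min_l2_admm(A, y, rho=1.0, n_iter=5000, tol=1e-10):
    """Sketch of ADMM for: minimize 0.5*||x||^2  s.t.  A x = y, x >= 0.

    Splitting (an assumption, not taken from the letter):
      x carries the quadratic cost and the equality constraint,
      z carries the nonnegativity constraint, and x = z is enforced
      through the scaled dual variable u.
    """
    A_pinv = np.linalg.pinv(A)
    n = A.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)
    for _ in range(n_iter):
        # x-update: unconstrained minimizer of the quadratic terms,
        # then Euclidean projection onto the affine set {x : A x = y}.
        x0 = rho * (z - u) / (1.0 + rho)
        x = x0 + A_pinv @ (y - A @ x0)
        # z-update: projection onto the nonnegative orthant.
        z_new = np.maximum(x + u, 0.0)
        # Scaled dual update.
        u = u + x - z_new
        converged = (np.linalg.norm(x - z_new) < tol
                     and np.linalg.norm(z_new - z) < tol)
        z = z_new
        if converged:
            break
    return z


if __name__ == "__main__":
    A = triangular_filterbank(8, 40)          # 8 energies, 40 unknown bins
    x_true = np.zeros(40)
    x_true[[5, 12, 13, 27]] = [3.0, 1.5, 2.0, 0.7]   # sparse, nonnegative
    y = A @ x_true
    z = nonneg_min_l2_admm(A, y)
    print("relative residual:", np.linalg.norm(A @ z - y) / np.linalg.norm(y))
    print("min entry:", z.min())
```

The plain minimum-norm solution `np.linalg.pinv(A) @ y` can contain negative entries, which a power spectrum cannot have. The nonnegativity projection in the z-update is what removes them in this sketch.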

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E98-A No.7 pp.1540-1543
Publication Date
2015/07/01
Online ISSN
1745-1337
DOI
10.1587/transfun.E98.A.1540
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Gang MIN
  PLA University of Science and Technology
Xiong wei ZHANG
  PLA University of Science and Technology
Ji bin YANG
  PLA University of Science and Technology
Xia ZOU
  PLA University of Science and Technology
Zhi song PAN
  PLA University of Science and Technology

Keyword