IEICE global.ieice.org Site

Author Search Result

[Author] JianFeng WU(3hit)

1-3hit

Vector Quantization of High-Dimensional Speech Spectra Using Deep Neural Network
JianFeng WU HuiBin QIN YongZhu HUA LiHuan SHAO Ji HU ShengYing YANG

LETTER-Artificial Intelligence, Data Mining

Publicized: 2019/07/02
Vol: E102-D No: 10
Page(s): 2047-2050
This paper proposes a deep neural network (DNN) based framework to address the problem of vector quantization (VQ) for high-dimensional data. The main challenge in applying a DNN to VQ is reducing the binary coding error of the auto-encoder when the distribution of the coding units is far from binary. To address this problem, three fine-tuning methods are adopted: 1) adding Gaussian noise to the input of the coding layer, 2) forcing the output of the coding layer to be binary, and 3) adding a non-binary penalty term to the loss function. These fine-tuning methods were extensively evaluated on quantizing speech magnitude spectra. The results demonstrate that each method improves coding performance. When used to quantize 968-dimensional speech spectra with only 18 bits, the DNN-based VQ framework achieved an average PESQ of about 2.09, which is far beyond the capability of conventional VQ methods.
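
The second and third fine-tuning methods can be illustrated with a small sketch. The abstract does not give the form of the non-binary penalty, so the `h * (1 - h)` term below is an assumption, chosen only because it vanishes when a unit outputs exactly 0 or 1. The function names and the 0.5 threshold are likewise hypothetical.

```python
def non_binary_penalty(h):
    """Mean of h*(1-h) over coding units (assumed form, not from the paper).

    It is 0 when every unit is exactly 0 or 1 and reaches 0.25 at h = 0.5.
    """
    return sum(x * (1.0 - x) for x in h) / len(h)


def binarize(h, threshold=0.5):
    """Force coding-layer outputs to binary values by thresholding."""
    return [1 if x >= threshold else 0 for x in h]


def bits_to_index(bits):
    """Pack a list of bits (most significant first) into a codeword index."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


coding_output = [0.02, 0.97, 0.55, 0.10]  # hypothetical sigmoid outputs
bits = binarize(coding_output)            # [0, 1, 1, 0]
index = bits_to_index(bits)               # 6
codebook_size = 2 ** 18                   # 18 bits address 262144 codewords
```

Under this reading, an 18-bit code selects one of 262144 implicit codewords, and the decoder half of the auto-encoder maps the bit pattern back to a spectrum.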
Pitch Estimation and Voicing Classification Using Reconstructed Spectrum from MFCC
JianFeng WU HuiBin QIN YongZhu HUA LingYan FAN

LETTER-Speech and Hearing

Publicized: 2017/11/15
Vol: E101-D No: 2
Page(s): 556-559
In this paper, a novel method for pitch estimation and voicing classification is proposed that uses a spectrum reconstructed from Mel-frequency cepstral coefficients (MFCC). The proposed algorithm reconstructs the spectrum from MFCC using the Moore-Penrose pseudo-inverse of the Mel-scale weighting functions. The reconstructed spectrum is compressed and filtered in log-frequency. Pitch estimation is achieved by modeling the joint density of the pitch frequency and the filtered spectrum with a Gaussian Mixture Model (GMM). Voicing classification is also achieved with a GMM-based model, and the test results show that over 99% of frames are correctly classified. The pitch estimation results demonstrate that the proposed GMM-based pitch estimator has high accuracy, with a relative error of 6.68% on the TIMIT database.
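
The reconstruction step can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the triangular filterbank, the orthonormal DCT-II, the natural-log compression, and all function names and sizes are assumptions, since the abstract does not specify them.

```python
import numpy as np


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular Mel-scale weighting functions, shape (n_filters, n_bins)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # Filter edges equally spaced on the Mel scale from 0 Hz to Nyquist.
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bin_hz = np.linspace(0.0, sample_rate / 2.0, n_bins)
    W = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        W[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return W


def dct_matrix(n):
    """Orthonormal DCT-II matrix, so its inverse is its transpose."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    C[0] /= np.sqrt(2.0)
    return C


def reconstruct_spectrum(mfcc, W):
    """Invert MFCC -> log Mel energies -> Mel energies -> linear spectrum."""
    n_filters = W.shape[0]
    padded = np.zeros(n_filters)
    padded[: len(mfcc)] = mfcc                 # truncated coefficients become 0
    log_mel = dct_matrix(n_filters).T @ padded # inverse DCT
    mel_energy = np.exp(log_mel)
    # Minimum-norm solution of W @ spectrum = mel_energy; bins may be negative.
    return np.linalg.pinv(W) @ mel_energy
```

Because there are fewer Mel filters than frequency bins, the pseudo-inverse returns only the minimum-norm spectrum consistent with the Mel energies, so fine harmonic detail is recovered approximately at best.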
SDChannelNets: Extremely Small and Efficient Convolutional Neural Networks
JianNan ZHANG JiJun ZHOU JianFeng WU ShengYing YANG

LETTER-Biocybernetics, Neurocomputing

Publicized: 2019/09/10
Vol: E102-D No: 12
Page(s): 2646-2650
Convolutional neural networks (CNNs) have a strong ability to understand and judge images. However, the enormous number of parameters and the computational cost of CNNs have limited their application in resource-limited devices. In this letter, we use parameter sharing and dense connections to compress the parameters along the channel direction of the convolution kernel, thus greatly reducing the number of model parameters. On this basis, we designed Shared and Dense Channel-wise Convolutional Networks (SDChannelNets), mainly composed of Depth-wise Separable SD-Channel-wise Convolution layers. The advantage of SDChannelNets is that the number of model parameters is greatly reduced with little or no loss of accuracy. We also introduce a hyperparameter that can effectively balance the number of parameters and the accuracy of a model. We evaluated the proposed model on two popular image recognition tasks (CIFAR-10 and CIFAR-100). The results show that SDChannelNets achieve accuracy similar to that of other CNNs with far fewer parameters.
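
For context, the parameter saving of an ordinary depth-wise separable convolution can be worked out directly. The sketch below uses the standard counting formulas (bias terms ignored) with hypothetical layer sizes. It does not model the SD-Channel-wise sharing itself, whose details are not given in the abstract.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depth-wise conv followed by a 1 x 1 point-wise conv."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)         # 73728
separable = depthwise_separable_params(3, 64, 128)  # 8768
```

In the separable layer most of the remaining weights sit in the 1 x 1 point-wise part (8192 of 8768 here), which is the channel-direction cost that the letter says it compresses.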
