IEICE TRANSACTIONS on Information

Applying Sparse KPCA for Feature Extraction in Speech Recognition

Amaro LIMA, Heiga ZEN, Yoshihiko NANKAKU, Keiichi TOKUDA, Tadashi KITAMURA, Fernando G. RESENDE

Summary:

This paper analyzes the applicability of Sparse Kernel Principal Component Analysis (SKPCA) to feature extraction in speech recognition, and proposes an approach that makes the SKPCA technique realizable for a large amount of training data, which is the usual situation in speech recognition systems. Although Kernel Principal Component Analysis (KPCA) has proved to be an effective technique for speech recognition, it has the disadvantage of requiring the training data to be reduced when the amount of data is excessively large. This reduction is needed to avoid computational infeasibility and/or an extremely high computational burden in the feature representation step when the training and test data are evaluated. The standard approach to this reduction is to choose frames at random from the original data set, which does not necessarily provide a good statistical representation of that data set. To solve this problem, a likelihood-related re-estimation procedure was applied to the KPCA framework, thus creating SKPCA, which nevertheless is not realizable for large training databases. The proposed approach consists of clustering the training data and applying an SKPCA-like data reduction technique to these clusters, generating reduced data clusters. These reduced clusters are merged and reduced again in a recursive procedure until just one cluster remains, making the SKPCA approach realizable for a large amount of training data. The experimental results show that the SKPCA technique with the proposed approach is more effective than both KPCA with the standard sparse solution using randomly chosen frames and the standard feature extraction techniques.
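
The recursive cluster, reduce, and merge procedure described in the summary can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the names `chunk`, `reduce_cluster`, `recursive_reduce`, and `random_baseline`, and the parameters `cluster_size` and `keep`, are hypothetical. Clustering is approximated by splitting the frames into consecutive groups, and the reduction step is a greedy farthest-point selection standing in for the likelihood-related SKPCA re-estimation, which the summary does not specify in detail.

```python
import random


def chunk(frames, size):
    """Split frames into consecutive groups (a stand-in for real clustering)."""
    return [frames[i:i + size] for i in range(0, len(frames), size)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_cluster(cluster, keep):
    """Placeholder reduction: greedy farthest-point selection of `keep` frames.

    ASSUMPTION: this is NOT the SKPCA likelihood-based re-estimation; it only
    marks the place where that reduction would be applied to each cluster.
    """
    if len(cluster) <= keep:
        return list(cluster)
    chosen = [cluster[0]]
    # Distance from every frame to its nearest already-chosen frame.
    nearest = [sq_dist(f, chosen[0]) for f in cluster]
    while len(chosen) < keep:
        idx = max(range(len(cluster)), key=nearest.__getitem__)
        chosen.append(cluster[idx])
        nearest = [min(d, sq_dist(f, cluster[idx]))
                   for d, f in zip(nearest, cluster)]
    return chosen


def recursive_reduce(frames, cluster_size, keep):
    """Cluster, reduce each cluster, merge, and repeat until one cluster remains."""
    if not 0 < keep < cluster_size:
        raise ValueError("need 0 < keep < cluster_size so each round shrinks the data")
    if not frames:
        return []
    clusters = chunk(list(frames), cluster_size)
    while len(clusters) > 1:
        reduced = [reduce_cluster(c, keep) for c in clusters]
        merged = [f for c in reduced for f in c]   # merge the reduced clusters
        clusters = chunk(merged, cluster_size)     # re-cluster the merged set
    return reduce_cluster(clusters[0], keep)


def random_baseline(frames, keep, seed=0):
    """The standard approach named in the summary: pick frames at random."""
    return random.Random(seed).sample(list(frames), min(keep, len(frames)))


if __name__ == "__main__":
    # Synthetic 2-D "frames"; real inputs would be acoustic feature vectors.
    data = [(float(i), float(i % 7)) for i in range(1000)]
    print(len(recursive_reduce(data, cluster_size=50, keep=10)))
    print(len(random_baseline(data, keep=10)))
```

Because no single reduction call ever sees more than `cluster_size` frames, the per-step cost stays bounded regardless of the total amount of training data, which is the property the summary attributes to the proposed approach.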

Publication
IEICE TRANSACTIONS on Information Vol.E88-D No.3 pp.401-409
Publication Date
2005/03/01
DOI
10.1093/ietisy/e88-d.3.401
Type of Manuscript
Special Section PAPER (Special Section on Corpus-Based Speech Technologies)
Category
Feature Extraction and Acoustic Modelings
