Applying Sparse KPCA for Feature Extraction in Speech Recognition

Amaro LIMA; Heiga ZEN; Yoshihiko NANKAKU; Keiichi TOKUDA; Tadashi KITAMURA; Fernando G. RESENDE

doi:10.1093/ietisy/e88-d.3.401

IEICE TRANSACTIONS on Information


Summary :

This paper analyzes the applicability of Sparse Kernel Principal Component Analysis (SKPCA) to feature extraction in speech recognition and proposes an approach that makes SKPCA realizable for large amounts of training data, a common situation in speech recognition systems. Although Kernel Principal Component Analysis (KPCA) has proved to be an effective technique for speech recognition, it requires the training data to be reduced when the data set is excessively large. This reduction is needed to avoid computational infeasibility or an extremely high computational burden in the feature representation step for both the training and the test data. The standard approach is to choose frames at random from the original data set, which does not necessarily give a good statistical representation of that data set. To solve this problem, a likelihood-related re-estimation procedure was applied to the KPCA framework, creating SKPCA, which nevertheless is not realizable for large training databases. The proposed approach clusters the training data and applies an SKPCA-like data reduction technique to each cluster, generating reduced data clusters. These reduced clusters are merged and reduced again in a recursive procedure until just one cluster remains, making the SKPCA approach realizable for a large amount of training data. The experimental results show that SKPCA with the proposed approach is more effective than both KPCA with the standard sparse solution using randomly chosen frames and the standard feature extraction techniques.
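
The recursive cluster, reduce, and merge procedure described in the summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not specify the clustering method or the details of the SKPCA-like reduction, so `split_into_clusters` (fixed-size chunking) and `farthest_point_reduce` (greedy farthest-point selection) are hypothetical stand-ins, as are the parameter names `cluster_size` and `keep`.

```python
import math


def farthest_point_reduce(frames, keep):
    """Hypothetical stand-in for the SKPCA-like reduction step.

    Keeps `keep` frames by greedy farthest-point selection. The paper's
    likelihood-related re-estimation is NOT implemented here.
    """
    if len(frames) <= keep:
        return list(frames)
    chosen = [frames[0]]
    # Distance from every frame to its nearest already-chosen frame.
    dist = [math.dist(f, chosen[0]) for f in frames]
    while len(chosen) < keep:
        idx = max(range(len(frames)), key=dist.__getitem__)
        chosen.append(frames[idx])
        dist = [min(d, math.dist(f, frames[idx])) for d, f in zip(dist, frames)]
    return chosen


def split_into_clusters(frames, cluster_size):
    """Hypothetical stand-in for clustering: fixed-size consecutive chunks."""
    return [frames[i:i + cluster_size] for i in range(0, len(frames), cluster_size)]


def recursive_reduce(frames, cluster_size, keep, reduce_fn=farthest_point_reduce):
    """Reduce each cluster, merge the results, and repeat until one cluster remains."""
    if not 0 < keep < cluster_size:
        raise ValueError("keep must satisfy 0 < keep < cluster_size")
    clusters = split_into_clusters(list(frames), cluster_size)
    while True:
        reduced = [reduce_fn(cluster, keep) for cluster in clusters]
        if len(reduced) == 1:
            return reduced[0]
        # Merge the reduced clusters and re-cluster the (smaller) merged set.
        merged = [frame for cluster in reduced for frame in cluster]
        clusters = split_into_clusters(merged, cluster_size)
```

Because every cluster is reduced independently, the reduction step only ever sees at most `cluster_size` frames at a time, which is the property that would keep a kernel-based reducer tractable on a large training set.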

Publication: IEICE TRANSACTIONS on Information Vol.E88-D No.3 pp.401-409

Publication Date: 2005/03/01

DOI: 10.1093/ietisy/e88-d.3.401

Type of Manuscript: Special Section PAPER (Special Section on Corpus-Based Speech Technologies)

Category: Feature Extraction and Acoustic Modelings

Cite this

Amaro LIMA, Heiga ZEN, Yoshihiko NANKAKU, Keiichi TOKUDA, Tadashi KITAMURA, Fernando G. RESENDE, "Applying Sparse KPCA for Feature Extraction in Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E88-D, no. 3, pp. 401-409, March 2005, doi: 10.1093/ietisy/e88-d.3.401.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.3.401/_p

@ARTICLE{e88-d_3_401,
author={Amaro LIMA and Heiga ZEN and Yoshihiko NANKAKU and Keiichi TOKUDA and Tadashi KITAMURA and Fernando G. RESENDE},
journal={IEICE TRANSACTIONS on Information},
title={Applying Sparse KPCA for Feature Extraction in Speech Recognition},
year={2005},
volume={E88-D},
number={3},
pages={401-409},
keywords={},
doi={10.1093/ietisy/e88-d.3.401},
ISSN={},
month={March},}

TY - JOUR
TI - Applying Sparse KPCA for Feature Extraction in Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 401
EP - 409
AU - Amaro LIMA
AU - Heiga ZEN
AU - Yoshihiko NANKAKU
AU - Keiichi TOKUDA
AU - Tadashi KITAMURA
AU - Fernando G. RESENDE
PY - 2005
DO - 10.1093/ietisy/e88-d.3.401
JO - IEICE TRANSACTIONS on Information
SN -
VL - E88-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2005
ER -
