Mauricio KUGLER Susumu KUROYANAGI Anto Satriyo NUGROHO Akira IWATA
Several research fields have to deal with very large classification problems, e.g. handwritten character recognition and speech recognition. Many works have proposed methods to address problems with a large number of samples, but few have addressed problems with a large number of classes. CombNET-II was one of the first methods proposed for this kind of task. It consists of a sequential clustering VQ based gating network (stem network) and several Multilayer Perceptron (MLP) based expert classifiers (branch networks). With the objectives of increasing the classification accuracy and providing a more flexible model, this paper proposes a new model based on the CombNET-II structure, CombNET-III. The new model, intended for, but not limited to, problems with a large number of classes, replaces the MLP branch networks with multiclass Support Vector Machines (SVM). It also introduces a new probabilistic framework that outputs posterior class probabilities, enabling the model to be applied in different scenarios (e.g. together with Hidden Markov Models). These changes permit the use of a larger number of smaller clusters, which reduces the complexity of the final classifiers. Moreover, the use of binary SVMs with probabilistic outputs and a probabilistic decoding scheme permits the use of a pairwise output encoding on the branch networks, which reduces the computational complexity of the training stage. The experimental results show that the proposed model outperforms both the previous model, CombNET-II, and a single multiclass SVM, while presenting considerably smaller complexity than the latter. It is also confirmed that the classification accuracy of CombNET-III scales better with an increasing number of clusters than that of CombNET-II.
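As a rough illustration of how a gating network and probabilistic branch classifiers can be combined into posterior class probabilities, the sketch below weights each branch's class posteriors by a soft cluster membership. It is not the formulation from the paper, which the abstract does not give. The softmax over negative squared distances, the `temperature` parameter, the function names, and the toy centroids and branch outputs are all assumptions made for this example.

```python
import numpy as np

def gate_probabilities(x, centroids, temperature=1.0):
    """Soft cluster memberships P(cluster j | x).

    Assumption: a softmax over negative squared distances to the stem
    network's centroids stands in for the gating output.
    """
    d2 = np.sum((centroids - x) ** 2, axis=1)
    logits = -d2 / temperature
    logits -= logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def combine_posteriors(gate, branch_posteriors, branch_classes, n_classes):
    """P(class | x) = sum_j P(cluster j | x) * P(class | x, cluster j).

    Each branch only knows the classes listed in branch_classes[j];
    classes may appear in more than one branch.
    """
    posterior = np.zeros(n_classes)
    for g, probs, classes in zip(gate, branch_posteriors, branch_classes):
        posterior[np.asarray(classes)] += g * np.asarray(probs)
    return posterior

# Toy setup (hypothetical values): two clusters, three classes,
# class 1 is shared by both branches.
centroids = np.array([[0.0, 0.0], [4.0, 0.0]])
branch_classes = [[0, 1], [1, 2]]
# Stand-ins for the probabilistic outputs of each branch classifier.
branch_posteriors = [[0.9, 0.1], [0.3, 0.7]]

x = np.array([1.0, 0.0])
gate = gate_probabilities(x, centroids)
posterior = combine_posteriors(gate, branch_posteriors, branch_classes, 3)
predicted_class = int(np.argmax(posterior))
```

Because each branch's outputs sum to one and the gate weights sum to one, the combined vector is itself a probability distribution, which is what lets such a model feed a downstream probabilistic component.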
Yuji WAIZUMI Nei KATO Kazuki SARUTA Yoshiaki NEMOTO
We propose a rough classification system using Hierarchical Learning Vector Quantization (HLVQ) for large scale classification problems which involve many categories. The HLVQ of the proposed system divides categories hierarchically in the feature space, builds a tree, and multiplies the nodes down the hierarchy. The feature space is divided by a few codebook vectors in each layer. The adjacent feature spaces overlap at the borders. HLVQ classification is both fast and accurate due to the hierarchical architecture and the overlapping technique. In a classification experiment using ETL9B, the largest database of handwritten characters in Japan (it contains a total of 607,200 samples from 3036 categories), the speed and accuracy of classification by HLVQ were found to be higher than those of the Self-Organizing feature Map (SOM) and Learning Vector Quantization methods. We demonstrate that the proposed system, which uses multiple codebook vectors for each category under HLVQ, achieves higher speed and accuracy than systems which use average vectors.
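The idea of hierarchical search with overlapping regions can be sketched with a two-level example: a coarse codebook selects a node, and only the fine codebook vectors assigned to that node are compared. This is a minimal sketch under stated assumptions, not the authors' HLVQ. It assumes the codebooks are already trained, uses only two layers, and the names `rough_classify` and `node_members` and all toy values are hypothetical.

```python
import numpy as np

def nearest(x, vectors):
    """Index of the vector closest to x in squared Euclidean distance."""
    return int(np.argmin(np.sum((vectors - x) ** 2, axis=1)))

def rough_classify(x, coarse_codebook, node_members, fine_codebook, fine_labels):
    """Two-level lookup: pick a coarse node, then search only its members.

    node_members[k] lists the indices of fine codebook vectors reachable
    from coarse node k. Listing a vector under several nodes models the
    overlap of adjacent regions at their borders.
    """
    node = nearest(x, coarse_codebook)
    candidates = np.asarray(node_members[node])
    best = int(candidates[nearest(x, fine_codebook[candidates])])
    return fine_labels[best]

# Toy data (hypothetical): four categories on a line, one vector each.
fine_codebook = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
fine_labels = ["a", "b", "c", "d"]
coarse_codebook = np.array([[0.5, 0.0], [2.5, 0.0]])
# Vectors 1 and 2 lie near the border, so both nodes include them.
node_members = [[0, 1, 2], [1, 2, 3]]

label = rough_classify(np.array([1.6, 0.0]), coarse_codebook,
                       node_members, fine_codebook, fine_labels)
```

Each query here compares against two coarse vectors and three fine vectors instead of all four fine vectors. With thousands of categories the saving from searching only one subtree becomes substantial, while the overlap keeps border samples from being routed away from their true category.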