Tatsuhiko KAGEHIRO Hiroto NAGAYOSHI Hiroshi SAKO
This paper describes a method for the classification of bank-notes. The algorithm has three stages and classifies bank-notes with very low error rates and at high speed. To achieve the very low error rates, the classification result is checked in the final stage using features different from those used in the first two. High-speed processing is achieved mainly by the hierarchical structure, which leads to low computational cost. In an evaluation on 32,850 samples of US bank-notes, with the same number used for training, the algorithm classified all samples correctly, with no errors. We statistically estimate the worst-case classification error rate to be 3.1E-9.
Jose GARCIA RODRIGUEZ Anastassia ANGELOPOULOU Alexandra PSARROU
We present a new method for automatically building statistical shape models from a set of training examples, in particular from a class of hands. In this study, we utilise a novel approach to automatically recover the shape of hand outlines from a series of 2D training images. Automated landmark extraction is accomplished with a self-organising model, the growing neural gas (GNG) network, which is able to learn and preserve the topological relations of a given set of input patterns without requiring a priori knowledge of the structure of the input space. The GNG is compared to other self-organising networks such as Kohonen and Neural Gas (NG) maps, and results are given for the training set of hand outlines, showing that the proposed method preserves accurate models.
A multi-stage approach to face detection that is fast, robust, and easy to train is proposed. Motivated by the work of Viola and Jones [1], this approach uses a cascade of classifiers to yield a coarse-to-fine strategy that significantly reduces detection time while maintaining a high detection rate. However, it is distinguished from previous work by two features. First, a new stage has been added to detect face candidate regions more quickly by using a larger window size and a larger moving step size. Second, support vector machine (SVM) classifiers are used instead of AdaBoost classifiers in the last stage, and the Haar wavelet features selected by the previous stage are reused by the SVM classifiers for robustness and efficiency. By combining AdaBoost and SVM classifiers, the final system achieves both fast and robust detection, because most non-face patterns are rejected quickly in the earlier layers, while only a small number of promising face patterns are classified robustly in the later layers. The proposed multi-stage system has been shown to run faster than the original AdaBoost-based system while maintaining comparable accuracy.
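The early-rejection idea behind such a cascade can be illustrated with a minimal sketch. It is not the system described in the abstract: the two stage functions below (`cheap_stage`, `costly_stage`) and their thresholds are hypothetical stand-ins for the AdaBoost and SVM stages, chosen only to show how most candidates are discarded before the more expensive stage runs.

```python
from typing import Callable, List, Sequence, Tuple

# A stage maps a candidate window (here a plain list of floats) to keep/reject.
Stage = Callable[[Sequence[float]], bool]

def cheap_stage(window: Sequence[float]) -> bool:
    """Hypothetical fast test: keep windows whose mean value exceeds 0.3."""
    return sum(window) / len(window) > 0.3

def costly_stage(window: Sequence[float]) -> bool:
    """Hypothetical slower test: keep windows with enough contrast."""
    return max(window) - min(window) > 0.5

def run_cascade(windows: Sequence[Sequence[float]],
                stages: Sequence[Stage]) -> Tuple[List[Sequence[float]], int]:
    """Return the accepted windows and the total number of stage evaluations."""
    accepted = []
    calls = 0
    for window in windows:
        passed = True
        for stage in stages:
            calls += 1
            if not stage(window):
                passed = False  # reject immediately; later stages never run
                break
        if passed:
            accepted.append(window)
    return accepted, calls

windows = [[0.1, 0.1], [0.9, 0.1], [0.9, 0.9]]
accepted, calls = run_cascade(windows, [cheap_stage, costly_stage])
```

In this toy run the first window is rejected by the cheap stage alone, so only five stage evaluations are needed instead of six; with realistic inputs, where most windows are non-faces, the saving is correspondingly larger.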
Kazuya HARAGUCHI Toshihide IBARAKI
We consider the classification problem to construct a classifier c: {0,1}^n
Jeong-Hoon CHO Dae-Geun JANG Chan-Sik HWANG
Shadow detection and removal are important when dealing with traffic image sequences. The cast shadow of a vehicle may lead to inaccurate object feature extraction and erroneous scene analysis. Furthermore, separate vehicles can be connected through shadows. Both can confuse object recognition systems. In this paper, a robust method is proposed for detecting and removing active cast shadows in monocular color image sequences. A background subtraction method is used to extract moving blobs in the color and gradient dimensions, and the YCrCb color space is adopted for detecting and removing the cast shadow. Even when shadows link different vehicles, the method can detect each vehicle figure using a mask modified by a shadow bar. Experimental results from town scenes show that the proposed method is effective and that the classification accuracy is sufficient for general vehicle type classification.
Bo-Yeong KANG Sung-Hyon MYAENG
Since sentences are the basic propositional units of text, knowing their themes should help in completing various tasks, such as automatic summarization, that require knowledge of the semantic content of text. Despite the importance of determining the theme of a sentence, however, few studies have investigated the problem of automatically assigning a theme to a sentence. In this paper, we examine the notion of sentence theme and propose an automatic scheme in which head-driven patterns are used for theme assignment. We tested our scheme with sentences in encyclopedia articles and obtained a promising result of 98.96% in F-score for training data and 88.57% for testing data, which outperforms the baseline using all but the head-driven patterns.
Mutsumi KIMURA Yuji HARA Hiroyuki HARA Tomoyuki OKUYAMA Satoshi INOUE Tatsuya SHIMODA
Driving methods for TFT-OLEDs are explained with their features and classified from the viewpoints of grayscale methods and uniformizing methods. This classification leads us to a novel proposal using time ratio grayscale and current uniformization. This driving method maintains current uniformity and simultaneously overcomes charging shortage of the pixel circuit for low grayscale levels and current variation due to the shift of operating points. Tolerance toward degraded characteristics, linearity of grayscale and luminance uniformity against degraded characteristics are confirmed using circuit simulation.
This paper describes a pattern classifier for detecting frontal-view faces via learning a decision boundary. The proposed classifier consists of two major parts for improving classification accuracy: the implicit modeling of both the face and the near-face classes, resulting in an extended discriminative feature set, and the subsequent composite Support Vector Machines (SVMs) for speeding up the classification. For the extended discriminative feature set, Principal Component Analysis (PCA) or Independent Component Analysis (ICA) is performed for the face and near-face classes separately. The projections and distances to the two different subspaces are complementary, which significantly enhances the classification accuracy of the SVM. Multiple nonlinear SVMs are trained for the local facial feature spaces, considering the general multi-modal characteristic of the face space. Each component SVM has a simpler boundary than that of a single SVM for the whole face space. The most appropriate component SVM is selected by a gating mechanism based on clustering. Classification using one of the multiple SVMs guarantees good generalization performance and speeds up face detection. The proposed classifier is finally implemented to work in real time by cascading a boosting-based face detector.
Yousun KANG Ken'ichi MOROOKA Hiroshi NAGAHASHI
As a representative linear discriminant analysis technique, the Fisher method is the most widely used in practice and is very effective in two-class classification. However, when it is extended to multi-class classification, its discrimination precision may deteriorate. A main reason is the occurrence of overlapping distributions in the discriminant space built by the Fisher criterion. To take such overlaps among classes into consideration, our approach builds a new discriminant space by hierarchically classifying the overlapped classes. In this paper, we propose a new hierarchical discriminant analysis for texture classification. We divide the discriminant space into subspaces by recursively grouping the overlapped classes. In the experiment, texture images from many classes are classified using the proposed method. The results are markedly better than those of the conventional Fisher method.
ChenGuang ZHOU Kui MENG ZuLian QIU
This paper presents three characteristic functions that express luminance distribution characteristics much better. Based on these functions, a region classification algorithm is presented. The algorithm offers more information on the similarity of regions and greatly improves the efficiency and performance of match seeking in fractal coding. It can be widely applied to many kinds of fractal coding algorithms. Analysis and experimental results show that it offers more information on the luminance distribution characteristics among regions and greatly improves the decoding quality and compression ratio while maintaining the running speed.
Mitsuharu MATSUMOTO Shuji HASHIMOTO
This paper introduces a multiple signal classification (MUSIC) method that utilizes the transfer characteristics of microphones located at the same place, namely aggregated microphones. A conventional microphone array realizes a sound localization system based on the differences in arrival time, phase shift, and sound level among the microphones. Therefore, it is difficult to miniaturize the microphone array. The objective of our research is to build a reliable miniaturized sound localization system using aggregated microphones. In this paper, we describe a sound system with N microphones. We then show that the microphone array system and the proposed aggregated microphone system can be described in the same framework. We apply multiple signal classification to the method that utilizes the transfer characteristics of microphones placed at the same location, and compare the proposed method with the microphone array. In the proposed method, all microphones are placed at the same place. Hence, it is easy to miniaturize the system. This feature is considered to be useful for practical applications. Experimental results obtained in an ordinary room are shown to verify the validity of the measurement.
Sang-Bum KIM Hae-Chang RIM Jin-Dong KIM
The multinomial naive Bayes model has been widely used for probabilistic text classification. However, the parameter estimation for this model sometimes generates inappropriate probabilities. In this paper, we propose a topic document model for the multinomial naive Bayes text classification, where the parameters are estimated from normalized term frequencies of each training document. Experiments are conducted on Reuters 21578 and 20 Newsgroup collections, and our proposed approach obtained a significant improvement in performance compared to the traditional multinomial naive Bayes.
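As a rough illustration of estimating multinomial naive Bayes parameters from per-document normalized term frequencies, consider the sketch below. The abstract does not state the exact normalization or smoothing used, so dividing each term count by the document length and applying add-`alpha` smoothing are assumptions made here, and the function names `train` and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(docs, labels, alpha=1.0):
    """Estimate log priors and log term probabilities per class.

    Assumption: each document contributes term frequencies normalized by its
    own length, so long documents do not dominate the class estimate.
    """
    vocab = sorted({term for tokens in docs for term in tokens})
    class_tf = defaultdict(Counter)  # class -> summed normalized term frequency
    class_docs = Counter()
    for tokens, label in zip(docs, labels):
        if not tokens:
            continue  # an empty document carries no term evidence
        class_docs[label] += 1
        for term, count in Counter(tokens).items():
            class_tf[label][term] += count / len(tokens)
    n_docs = sum(class_docs.values())
    model = {}
    for label, n in class_docs.items():
        denom = sum(class_tf[label].values()) + alpha * len(vocab)
        log_like = {t: math.log((class_tf[label][t] + alpha) / denom) for t in vocab}
        model[label] = (math.log(n / n_docs), log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log posterior; unseen terms are ignored."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, log_like) in model.items():
        score = log_prior + sum(log_like[t] for t in tokens if t in log_like)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [["oil", "price", "oil"], ["oil", "export"],
        ["game", "team"], ["team", "win", "game", "game"]]
labels = ["econ", "econ", "sport", "sport"]
model = train(docs, labels)
```

With this toy corpus, `predict(model, ["oil", "price"])` selects `"econ"` and `predict(model, ["team", "game"])` selects `"sport"`. In the traditional model the raw counts would be summed over the class instead, so a single long document could outweigh many short ones.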
Hiroaki HONDA Hideki TODE Koso MURAKAMI
In next-generation networks, ultra high-speed data transmission will become necessary to support a variety of advanced point-to-point and multipoint multimedia services with stringent quality-of-service (QoS) constraints. Such requirements call for the realization of optical WDM networks. Research on multicast in optical WDM networks has become active for the purpose of using wavelength resources efficiently. Since multiple channels are more likely to share the same links in WDM multicast, effective routing and wavelength assignment (RWA) technology becomes very important. The introduction of wavelength conversion technology leads to more efficient use of wavelength resources. This technology, however, has problems yet to be solved, and the number of wavelength converters in the network will be restricted. In this paper, we propose an effective WDM multicast design method for the case in which the wavelength converters on each switching node are restricted. The method consists of three separate steps: routing, wavelength converter allocation, and wavelength assignment. In our proposal, the preferentially available waveband is classified according to the scale of the multicast group. Assuming that the number of wavelength converters on each switching node is limited, we evaluate the performance of the method in terms of the call blocking probability.
Pi-Chung WANG Hung-Yi CHANG Chia-Tai CHAN Shuo-Cheng HU
Packet classification is important in fulfilling the requirements of differentiated services in next-generation networks. One interesting hardware solution proposed for the packet classification problem is the bit vector algorithm. Unlike other hardware solutions such as ternary CAM, it uses memory efficiently to achieve excellent performance with medium-size policy databases; however, it exhibits poor worst-case performance with a potentially large number of policies. In this paper, we propose an improved bit-vector algorithm named Condensate Bit Vector, which can be adapted to large policy databases in the backbone network. Experiments show that the proposed algorithm drastically improves on the original algorithm in both storage requirements and search speed.
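For readers unfamiliar with the baseline, the following is a minimal software sketch of the basic bit vector scheme as it is commonly described: one lookup per field yields a bit vector of the policies matching that field, and the vectors are ANDed to find the highest-priority match. It does not implement the Condensate Bit Vector improvement, and the class name, rule layout (inclusive integer ranges, listed in priority order), and example rules are assumptions made for illustration.

```python
from bisect import bisect_right

class BitVectorClassifier:
    """Per-field interval tables whose entries are bit vectors over the rules."""

    def __init__(self, rules):
        # rules[i] is a tuple of inclusive (lo, hi) ranges, one per field;
        # a lower index means a higher priority.
        self.n_rules = len(rules)
        self.tables = []
        for field in range(len(rules[0])):
            # Start points of the elementary intervals: the set of matching
            # rules can only change at some lo or just past some hi.
            points = sorted({r[field][0] for r in rules} |
                            {r[field][1] + 1 for r in rules})
            vectors = []
            for p in points:
                bits = 0
                for i, rule in enumerate(rules):
                    lo, hi = rule[field]
                    if lo <= p <= hi:
                        bits |= 1 << i  # bit i set: rule i matches this interval
                vectors.append(bits)
            self.tables.append((points, vectors))

    def classify(self, packet):
        """Return the index of the highest-priority matching rule, or None."""
        result = (1 << self.n_rules) - 1
        for (points, vectors), value in zip(self.tables, packet):
            idx = bisect_right(points, value) - 1
            if idx < 0:
                return None  # value lies below every rule's range
            result &= vectors[idx]
        if result == 0:
            return None
        return (result & -result).bit_length() - 1  # position of lowest set bit

# Hypothetical two-field policy table: (address range, port range).
rules = [((10, 20), (80, 80)),
         ((0, 100), (0, 1023)),
         ((0, 255), (0, 65535))]
classifier = BitVectorClassifier(rules)
```

Each stored vector is as wide as the number of policies, which is why memory use and the cost of the AND step grow with the size of the policy database, the weakness the abstract refers to.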
Hiroyoshi YAMAMOTO Yoshihiko NANKAKU Chiyomi MIYAJIMA Keiichi TOKUDA Tadashi KITAMURA
This paper investigates the parameter tying structures of a mixture of factor analyzers (MFA) and discriminative training of MFA for speaker identification. The parameters of factor loading matrices or diagonal matrices are shared among different mixtures of the MFA. Then, minimum classification error (MCE) training is applied to the MFA parameters to enhance the discrimination ability. The result of a text-independent speaker identification experiment shows that MFA outperforms the conventional Gaussian mixture model (GMM) with diagonal or full covariance matrices and achieves the best performance when sharing the diagonal matrices, resulting in a relative gain of 26% over the GMM with diagonal covariance matrices. The improvement is especially significant under sparse training data conditions. The recognition performance is further improved by MCE training, with an additional gain of 3% in error reduction.
Cheng-Chin CHIANG Chi-Lun HUANG
This paper presents the design of an automatic surveillance system that monitors dangerous non-frontal gazes of a car driver. To track the driver's eyes, we propose a novel filter to locate the "between-eye", which is the middle point between the two eyes, to help locate the eyes quickly. We also propose a specially designed criterion function, named the mean ratio function, to accurately locate the positions of the eyes. To analyze the gazes of the driver, a multilayer perceptron neural network is trained to determine whether or not the driver is losing the proper gaze. By combining the neural network output with some well-designed alarm-issuing rules, the system performs the monitoring task for both a single dedicated driver and multiple different drivers with satisfactory performance in our experiments.
Jinqing QI Dongju LI Tsuyoshi ISSHIKI Hiroaki KUNIEDA
A new and fast fingerprint classification method based on direction patterns is presented in this paper. This method is developed to be applicable to today's embedded fingerprint authentication systems, in which small-area sensors are widely used. Direction patterns are handled in the direction map at the block level, where each block consists of 8×8 pixels. It is demonstrated that searching for direction patterns in a specific area, generally called the pattern area, can classify fingerprints clearly and quickly. With our algorithm, a classification accuracy of 89% is achieved over 4000 images in the NIST-4 database, slightly lower than that of conventional approaches. However, the classification speed is improved tremendously, to about 10 times that of conventional singular point approaches.
In this letter, we suggest sets of features for classifying the genres of web documents. Web documents differ from textual documents in that they contain URLs and HTML tags within the pages. We introduce features specific to web documents, which are extracted from URLs and HTML tags. Experimental results enable us to evaluate their characteristics and performance. On the basis of the experimental results, we implement a user interface for a web search engine that presents documents grouped by genre.
This paper proposes the use of the ratio of wavelet extrema numbers taken from the horizontal and vertical counts, respectively, as a texture feature, which is called the aspect ratio of extrema number (AREN). We formulate the classification problem for natural and synthesized texture images as an optimization problem and develop a coevolving approach to select both scalar wavelet and multiwavelet feature spaces of greater discriminatory power. Sequential searches and genetic algorithms (GAs) are comparatively investigated. Experiments using wavelet packet decompositions with the innovative packet-tree selection scheme show that the classification accuracy of coevolutionary genetic algorithms (CGAs) is acceptable.
Hanxi ZHU Ikuo YOSHIHARA Kunihito YAMAMORI Moritoshi YASUNAGA
We have developed Multi-modal Neural Networks (MNN) to improve the accuracy of symbolic sequence pattern classification. The basic structure of the MNN is composed of several sub-classifiers using neural networks and a decision unit. Two types of MNN are proposed: a primary MNN and a twofold MNN. In the primary MNN, each sub-classifier is a conventional three-layer neural network, and the decision unit uses a majority decision to produce the final decisions from the outputs of the sub-classifiers. In the twofold MNN, each sub-classifier is a primary MNN for partial classification, and the decision unit uses a three-layer neural network to produce the final decisions. Since the structure of the primary MNN is folded into the sub-classifier, the basic structure of the MNN is used twice, which is why we call the method the twofold MNN. The MNN is validated with two benchmark tests: EPR (English Pronunciation Reasoning) and prediction of protein secondary structure. The reasoning accuracy of EPR is improved from 85.4% with a three-layer neural network to 87.7% with the primary MNN. In the prediction of protein secondary structure, the average accuracy is improved from 69.1% with a three-layer neural network to 74.6% with the primary MNN and 75.6% with the twofold MNN. The prediction test is based on a database of 126 non-homologous protein sequences.
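The majority decision used by the decision unit of the primary MNN can be sketched as follows. The abstract does not say how ties are resolved, so breaking ties in favour of the earliest sub-classifier is an assumption here, as are the function name `majority_decision` and the example labels.

```python
from collections import Counter

def majority_decision(votes):
    """Return the label predicted by the most sub-classifiers.

    Assumption: on a tie, the label voted for by the earliest
    sub-classifier among the tied labels wins.
    """
    if not votes:
        raise ValueError("at least one sub-classifier output is required")
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:  # scan in sub-classifier order to break ties
        if counts[label] == top:
            return label

# Hypothetical outputs of three sub-classifiers for one sequence position.
decision = majority_decision(["H", "E", "H"])
```

In the twofold MNN described above, this fixed voting rule is replaced by a trained three-layer network that takes the sub-classifier outputs as its input.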