
IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • Article Influence

    0.1

  • CiteScore

    1.4


Volume E92-D No.9 (Publication Date: 2009/09/01)

    Regular Section
  • Approximate Nearest Neighbor Search for a Dataset of Normalized Vectors

    Kengo TERASAWA  Yuzuru TANAKA  

     
    PAPER-Algorithm Theory

      Page(s):
    1609-1619

    This paper describes a novel algorithm for approximate nearest neighbor searching. For this problem, especially in high-dimensional spaces, one of the best-known algorithms is Locality-Sensitive Hashing (LSH). This paper presents a variant of the LSH algorithm that outperforms previously proposed methods when the dataset consists of vectors normalized to unit length, which is often the case in pattern recognition. The LSH scheme is based on a family of hash functions that preserves the locality of points. This paper points out that, for this special case, we can design efficient hash functions that map a point on the hypersphere to the closest vertex of a randomly rotated regular polytope. The computational analysis confirmed that the proposed method improves the exponent ρ, the main indicator of the performance of the LSH algorithm. Practical experiments also supported the efficiency of our algorithm both in time and in space.
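
    As a rough illustration of the kind of hash family the abstract describes (a random rotation followed by snapping a unit vector to the nearest vertex of a regular polytope), the sketch below uses the cross-polytope, whose vertices are the signed coordinate axes. The code and parameter values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the authors' code): hash a unit vector by applying a
# random rotation and snapping to the nearest vertex of the cross-polytope,
# i.e. the nearest signed coordinate axis.
import numpy as np

def random_rotation(dim, rng):
    # QR decomposition of a Gaussian matrix gives a random orthogonal matrix
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q * np.sign(np.diag(r))        # sign fix for a uniform distribution

def polytope_hash(x, rotation):
    y = rotation @ x                      # rotate the unit vector
    axis = int(np.argmax(np.abs(y)))      # closest vertex = largest |coordinate|
    return axis, int(np.sign(y[axis]))    # hash value: (axis index, sign)

rng = np.random.default_rng(0)
R = random_rotation(128, rng)
v = rng.standard_normal(128)
v /= np.linalg.norm(v)                    # normalize to unit length
print(polytope_hash(v, R))
```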

  • The Online Graph Exploration Problem on Restricted Graphs

    Shuichi MIYAZAKI  Naoyuki MORIMOTO  Yasuo OKABE  

     
    PAPER-Algorithm Theory

      Page(s):
    1620-1627

    The purpose of the online graph exploration problem is to visit all the nodes of a given graph and come back to the starting node with the minimum total traverse cost. However, unlike the classical Traveling Salesperson Problem, information about the graph is given online. When an online algorithm (called a searcher) visits a node v, it learns information about the nodes and edges adjacent to v. The searcher must decide which node to visit next depending on the partial and incomplete information about the graph that it has gained during its search. The goodness of the algorithm is evaluated by competitive analysis. If the input graphs to be explored are restricted to trees, depth-first search always returns an optimal tour. However, if graphs have cycles, the problem is non-trivial. In this paper we consider two simple cases. First, we treat the problem on simple cycles. Recently, Asahiro et al. proved that there is a 1.5-competitive online algorithm, while no online algorithm can be (1.25-ε)-competitive for any positive constant ε. In this paper, we give an optimal online algorithm for this problem; namely, we give a 1.366-competitive algorithm and prove that there is no (1.366-ε)-competitive algorithm for any positive constant ε. Furthermore, we consider the problem on unweighted graphs. We also give an optimal result; namely, we give a 2-competitive algorithm and prove that there is no (2-ε)-competitive online algorithm for any positive constant ε.
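
    The abstract notes that depth-first search is optimal when the input is a tree; the toy sketch below illustrates that baseline only (a DFS tour traversing every edge twice), using an invented adjacency-list representation. It is not the paper's algorithm for cycles or unweighted graphs.

```python
# Sketch of the tree baseline mentioned in the abstract: a depth-first tour
# that visits every node and returns to the start, traversing each edge twice.
def dfs_tour(tree, start):
    tour, cost = [start], 0
    def visit(u, parent):
        nonlocal cost
        for v, w in tree[u]:
            if v != parent:
                cost += w          # walk down the edge
                tour.append(v)
                visit(v, u)
                cost += w          # walk back up
                tour.append(u)
    visit(start, None)
    return tour, cost              # cost = 2 * (sum of edge weights)

# toy tree as adjacency lists of (neighbor, weight)
tree = {'a': [('b', 1), ('c', 2)], 'b': [('a', 1)],
        'c': [('a', 2), ('d', 3)], 'd': [('c', 3)]}
print(dfs_tour(tree, 'a'))   # cost 12 = 2 * (1 + 2 + 3)
```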

  • Efficient MRC-Based Residue to Binary Converters for the New Moduli Sets {2^(2n), 2^n - 1, 2^(n+1) - 1} and {2^(2n), 2^n - 1, 2^(n-1) - 1}

    Amir Sabbagh MOLAHOSSEINI  Chitra DADKHAH  Keivan NAVI  Mohammad ESHGHI  

     
    PAPER-Computer Systems

      Page(s):
    1628-1638

    In this paper, the new residue number system (RNS) moduli sets {2^(2n), 2^n - 1, 2^(n+1) - 1} and {2^(2n), 2^n - 1, 2^(n-1) - 1} are introduced. These moduli sets have a 4n-bit dynamic range and well-formed moduli, which can result in high-performance residue to binary converters as well as efficient RNS arithmetic units. Next, efficient residue to binary converters for the proposed moduli sets based on the mixed-radix conversion (MRC) algorithm are presented. The converters are ROM-free and are realized using carry-save adders and modulo adders. Comparison with other residue to binary converters for 4n-bit dynamic range moduli sets shows that the presented designs based on the new moduli sets improve the conversion delay and result in hardware savings. Also, the proposed moduli sets can lead to efficient binary to residue converters, and they can speed up internal RNS arithmetic processing compared with the other 4n-bit dynamic range moduli sets.
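
    For readers unfamiliar with mixed-radix conversion, the following is a textbook MRC for a generic three-modulus RNS, shown with the first proposed moduli set for a small n. It is a software sketch only, not the authors' carry-save/modulo-adder hardware design.

```python
# Generic mixed-radix conversion (MRC) for a 3-modulus RNS.
def mrc_to_binary(residues, moduli):
    r1, r2, r3 = residues
    m1, m2, m3 = moduli
    a1 = r1 % m1
    a2 = ((r2 - a1) * pow(m1, -1, m2)) % m2
    a3 = ((r3 - a1 - a2 * m1) * pow(m1 * m2, -1, m3)) % m3
    return a1 + a2 * m1 + a3 * m1 * m2          # weighted (binary) value

# Example with the first proposed moduli set {2^(2n), 2^n - 1, 2^(n+1) - 1} for n = 4.
n = 4
moduli = (2 ** (2 * n), 2 ** n - 1, 2 ** (n + 1) - 1)
x = 12345
residues = tuple(x % m for m in moduli)
assert mrc_to_binary(residues, moduli) == x
```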

  • Effects of Data Scrubbing on Reliability in Storage Systems

    Junkil RYU  Chanik PARK  

     
    PAPER-Computer Systems

      Page(s):
    1639-1649

    Silent data corruptions, which are induced by latent sector errors, phantom writes, DMA parity errors and so on, can be detected by explicitly issuing a read command to a disk controller and comparing the corresponding data with their checksums. Because some of the data stored in a storage system may not be accessed for a long time, there is a high chance of silent data corruption occurring undetected, resulting in data loss. Therefore, periodically checking all the data in a storage system, known as data scrubbing, is essential to detect such silent data corruptions in time. The errors detected by data scrubbing are recovered from the replicas or the redundant information maintained to protect against permanent data loss. The longer the period between data scrubbings, the higher the probability of permanent data loss. This paper proposes a Markov failure and repair model to conservatively analyze the effect of data scrubbing on the reliability of a storage system. Using the proposed model, we show the relationship between the data scrubbing period and the number of data replicas required to manage the reliability of a storage system.
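
    As a toy illustration of the trade-off the abstract analyzes (scrub period versus replica count versus the chance of permanent loss), here is a small Monte Carlo sketch. It is not the paper's Markov model; the daily corruption rate, horizon and trial count are invented for demonstration.

```python
# Toy Monte Carlo sketch (not the paper's Markov model): probability that all
# replicas of a block are silently corrupted before the next scrub repairs them.
import random

def loss_probability(replicas, scrub_period_days, corruption_rate_per_day,
                     horizon_days=365, trials=5000, seed=0):
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        corrupted = [False] * replicas
        for day in range(1, horizon_days + 1):
            for i in range(replicas):
                if rng.random() < corruption_rate_per_day:
                    corrupted[i] = True
            if all(corrupted):                        # every copy bad -> permanent loss
                losses += 1
                break
            if day % scrub_period_days == 0:
                corrupted = [False] * replicas        # scrub detects and repairs
    return losses / trials

for period in (7, 30, 90):
    print(period, loss_probability(replicas=2, scrub_period_days=period,
                                   corruption_rate_per_day=1e-3))
```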

  • Code Compression with Split Echo Instructions

    Iver STUBDAL  Arda KARADUMAN  Hideharu AMANO  

     
    PAPER-Fundamentals of Software and Theory of Programs

      Page(s):
    1650-1656

    Code density is often a critical issue in embedded computers, since the memory size of embedded systems is strictly limited. Echo instructions have been proposed as a method for reducing code size. This paper presents a new type of echo instruction, split echo, and evaluates an implementation of both split echo and traditional echo instructions on a MIPS R3000 based processor. Evaluation results show that the memory requirement is reduced by 12% on average with a small additional hardware cost.

  • Content-Based Retrieval of Motion Capture Data Using Short-Term Feature Extraction

    Jianfeng XU  Haruhisa KATO  Akio YONEYAMA  

     
    PAPER-Contents Technology and Web Information Systems

      Page(s):
    1657-1667

    This paper presents a content-based retrieval algorithm for motion capture data, which is required for re-using a large-scale database that has many variations within the same category of motions. The most challenging problem is that logically similar motions may not be numerically similar due to the motion variations within a category. Our algorithm can effectively retrieve motions logically similar to a query, where a distance metric between our novel short-term features is properly defined as a fundamental component of our system. We extract the features based on short-term analysis of joint velocities after dividing an entire motion capture sequence into many small overlapping clips. In each clip, we select not only the magnitude but also the dynamic pattern of the joint velocities as our features, which discards the motion variations while keeping the significant motion information of a category. Simultaneously, the amount of data is reduced, which lowers the computational cost. Using the extracted features, we define a novel distance metric between two motion clips. By dynamic time warping, a motion dissimilarity measure is calculated between two motion capture sequences. Then, given a query, we rank all the motions in our dataset according to their motion dissimilarity measures. Our experiments, which are performed on a test dataset consisting of more than 190 motions, demonstrate that our algorithm greatly improves the performance compared to two conventional methods according to a popular evaluation measure, P(NR).
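
    Since the abstract relies on dynamic time warping between sequences of clip-level features, here is a minimal generic DTW sketch. The clip distance and the random feature matrices are placeholders; the paper defines its own clip-level metric.

```python
# Minimal dynamic time warping sketch between two sequences of per-clip
# feature vectors (illustrative only).
import numpy as np

def dtw_distance(seq_a, seq_b, clip_dist=lambda a, b: np.linalg.norm(a - b)):
    n, m = len(seq_a), len(seq_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = clip_dist(seq_a[i - 1], seq_b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

rng = np.random.default_rng(1)
motion_a = rng.standard_normal((40, 8))   # 40 clips x 8-dim features (toy data)
motion_b = rng.standard_normal((55, 8))
print(dtw_distance(motion_a, motion_b))
```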

  • A New Clustering Validity Index for Cluster Analysis Based on a Two-Level SOM

    Shu-Ling SHIEH  I-En LIAO  

     
    PAPER-Data Mining

      Page(s):
    1668-1674

    The Self-Organizing Map (SOM) is a powerful tool for exploratory cluster analysis. Clustering is the most important task in unsupervised learning, and clustering validity is a major issue in cluster analysis. In this paper, a new clustering validity index is proposed for generating the clustering result of a two-level SOM. It is computed from the inter-cluster separation rate, the inter-cluster relative density, and the intra-cluster cohesion rate. The clustering validity index is used to find the optimal number of clusters and to determine which two neighboring clusters can be merged in a hierarchical clustering of a two-level SOM. Experiments show that the proposed algorithm, which is based on a two-level SOM, clusters data more accurately than classical clustering algorithms and is better able to find an optimal number of clusters by maximizing the clustering validity index.

  • Multipath Routing with Reliable Nodes in Large-Scale Mobile Ad-Hoc Networks

    Yun GE  Guojun WANG  Qing ZHANG  Minyi GUO  

     
    PAPER-Networks

      Page(s):
    1675-1682

    We propose a Multiple Zones-based (M-Zone) routing protocol to discover node-disjoint multipath routes efficiently and effectively in large-scale MANETs. Compared with single-path routing, multipath routing can improve the robustness, load balancing and throughput of a network. However, it is very difficult to achieve node-disjoint multipath routing in large-scale MANETs. To ensure finding node-disjoint multiple paths, the M-Zone protocol divides the region between a source and a destination into multiple zones based on geographical location, and each path is mapped to a distinct zone. Performance analysis shows that M-Zone has good stability, and that the control complexity and storage complexity of M-Zone are lower than those of the well-known AODVM protocol. Simulation studies show that the average end-to-end delay of M-Zone is lower than that of AODVM and the routing overhead of M-Zone is less than that of AODVM.

  • A Comparison of Pressure and Tilt Input Techniques for Cursor Control

    Xiaolei ZHOU  Xiangshi REN  

     
    PAPER-Human-computer Interaction

      Page(s):
    1683-1691

    Three experiments were conducted in this study to investigate the human ability to control pen pressure and pen tilt input by coupling this control with cursor position, angle and scale. Comparisons between pen pressure input and pen tilt input were made in the three experiments. Experimental results show that decreasing pressure input resulted in very poor performance and was not a good input technique for any of the three experiments. In "Experiment 1-Coupling to Cursor Position", the tilt input technique performed relatively better than the increasing pressure input technique in terms of time, even though the tilt technique had a slightly higher error rate. In "Experiment 2-Coupling to Cursor Angle", tilt input performed a little better than increasing pressure input in terms of time, but the gap between them was not as apparent as in Experiment 1. In "Experiment 3-Coupling to Cursor Scale", tilt input performed a little better than increasing pressure input in terms of adjustment time. Based on the results of our experiments, we have inferred several design implications and guidelines.

  • Revision of Using Eigenvalues of Covariance Matrices in Boundary-Based Corner Detection

    Wen-Bing HORNG  Chun-Wen CHEN  

     
    PAPER-Pattern Recognition

      Page(s):
    1692-1701

    In this paper, we present a revision of the use of eigenvalues of covariance matrices, proposed by Tsai et al., as a measure of significance (i.e., curvature) for boundary-based corner detection. We first show the pitfall of Tsai et al.'s approach. We then further investigate the properties of the eigenvalues of covariance matrices for three different types of curves and point out a mistake in Tsai et al.'s method. Finally, we propose a modification of the use of eigenvalues as a measure of significance for corner detection to remedy this defect. The experimental results show that, under the same test conditions, our modified measure of significance correctly detects all true corners while eliminating the spurious corners detected by Tsai et al.'s method.
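
    To make the underlying idea concrete, the sketch below computes the covariance matrix of the boundary points inside a sliding window and uses its smaller eigenvalue as a curvature score (near zero on straight segments, larger at corners). This is a generic illustration, not Tsai et al.'s measure nor the authors' revised one.

```python
# Generic sketch: eigenvalues of the covariance matrix of boundary points in
# a sliding window indicate how "bent" the boundary is there.
import numpy as np

def corner_significance(boundary, half_window=5):
    pts = np.asarray(boundary, dtype=float)        # (N, 2) closed boundary
    n = len(pts)
    scores = np.zeros(n)
    for i in range(n):
        idx = [(i + k) % n for k in range(-half_window, half_window + 1)]
        cov = np.cov(pts[idx].T)                   # 2x2 covariance matrix
        eigvals = np.sort(np.linalg.eigvalsh(cov)) # ascending
        scores[i] = eigvals[0]                     # smaller eigenvalue as curvature score
    return scores

# closed square boundary, 10 points per side; corners at indices 0, 10, 20, 30
side = list(range(10))
boundary = ([(x, 0) for x in side] + [(10, y) for y in side] +
            [(10 - x, 10) for x in side] + [(0, 10 - y) for y in side])
scores = corner_significance(boundary, half_window=4)
print(scores[10], scores[15])   # corner score vs. mid-edge score (near zero)
```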

  • Spectral Fluctuation Method: A Texture-Based Method to Extract Text Regions in General Scene Images

    Yoichiro BABA  Akira HIROSE  

     
    PAPER-Pattern Recognition

      Page(s):
    1702-1715

    To obtain the text information included in a scene image, we first need to extract the text regions from the image before recognizing the text. In this paper, we examine human vision and propose a novel method to extract text regions by evaluating textural variation. Human beings are often attracted by textural variation in scenes, which causes foveation. We hypothesize that text has a similar property that distinguishes it from the natural background. In our method, we calculate the spatial variation of texture to obtain the distribution of the likelihood of a text region. Here we evaluate changes in the local spatial spectrum as the textural variation. We investigate two options for evaluating the spectrum, based on one- and two-dimensional Fourier transforms. In particular, in this paper, we put emphasis on the one-dimensional transform, which functions like a Gabor filter. The proposed method can be applied to a wide range of characters mainly because it employs neither templates nor heuristics concerning character size, aspect ratio, specific direction, alignment, and so on. We demonstrate that the method effectively extracts text regions contained in various general scene images. We present a quantitative evaluation of the method using publicly available databases.
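
    The sketch below gives a rough, one-dimensional illustration of measuring "spectral fluctuation": slide a window along an image row, take its Fourier magnitude spectrum, and score how much the local spectrum changes between neighboring windows. Window size, step and the synthetic signals are assumptions for demonstration, not the paper's pipeline.

```python
# Rough illustration of the 1-D variant: local spectra along a row and their
# window-to-window change ("spectral fluctuation").
import numpy as np

def spectral_fluctuation_row(row, win=32, step=8):
    windows = [row[i:i + win] for i in range(0, len(row) - win + 1, step)]
    spectra = [np.abs(np.fft.rfft(w * np.hanning(win))) for w in windows]
    # fluctuation = spectral distance between adjacent windows
    return np.array([np.linalg.norm(spectra[k + 1] - spectra[k])
                     for k in range(len(spectra) - 1)])

rng = np.random.default_rng(2)
smooth = np.linspace(0, 1, 256)                      # background-like ramp
textured = smooth + 0.5 * (rng.random(256) > 0.5)    # text-like high-frequency stretch
print(spectral_fluctuation_row(smooth).max(),
      spectral_fluctuation_row(textured).max())      # textured row fluctuates more
```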

  • An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns

    Changliang LIU  Fuping PAN  Fengpei GE  Bin DONG  Hongbin SUO  Yonghong YAN  

     
    PAPER-Speech and Hearing

      Page(s):
    1716-1724

    This paper describes a reading miscue detection system based on the conventional Large Vocabulary Continuous Speech Recognition (LVCSR) framework [1]. In order to incorporate knowledge of the reference (what the reader ought to read) and some error patterns into the decoding process, two methods are proposed: Dynamic Multiple Pronunciation Incorporation (DMPI) and Dynamic Interpolation of Language Model (DILM). DMPI dynamically adds pronunciation variations into the search space to predict reading substitutions and insertions. To resolve the conflict between the coverage of error predictions and the perplexity of the search space, only the pronunciation variants related to the reference are added. DILM dynamically interpolates the general language model based on an analysis of the reference and so keeps the active decoding paths relatively close to the reference. This makes the recognition more accurate, which further improves the detection performance. At the final stage of detection, an improved dynamic programming (DP) algorithm is used to align the confusion network (CN) from speech recognition with the reference to generate the detection result. The experimental results show that the two proposed methods decrease the Equal Error Rate (EER) by 14% relative, from 46.4% to 39.8%.
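
    As background for the final alignment stage, here is a plain edit-distance DP alignment between a recognized word sequence and the reference, labeling substitutions, insertions and deletions as candidate miscues. It aligns single word strings, not confusion networks, so it is only a simplified stand-in for the paper's improved DP.

```python
# Minimal sketch: align hypothesis words with the reference by edit distance
# and report substitutions / insertions / deletions as candidate miscues.
def align(reference, hypothesis):
    n, m = len(reference), len(hypothesis)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1): dp[i][0] = i
    for j in range(m + 1): dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion (word skipped)
                           dp[i][j - 1] + 1,         # insertion (extra word read)
                           dp[i - 1][j - 1] + cost)  # match / substitution
    miscues, i, j = [], n, m
    while i > 0 or j > 0:                            # backtrace
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])):
            if reference[i - 1] != hypothesis[j - 1]:
                miscues.append(('sub', reference[i - 1], hypothesis[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            miscues.append(('del', reference[i - 1], None)); i -= 1
        else:
            miscues.append(('ins', None, hypothesis[j - 1])); j -= 1
    return list(reversed(miscues))

print(align("the cat sat on the mat".split(), "the cat sit on mat".split()))
```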

  • Range and Size Estimation Based on a Coordinate Transformation Model for Driving Assistance Systems

    Bing-Fei WU  Chuan-Tsai LIN  Yen-Lin CHEN  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1725-1735

    This paper presents new approaches for estimating the range between a preceding vehicle and the experimental vehicle, estimating the vehicle's size and its projective size, and dynamic camera calibration. First, our approaches adopt a camera model to transform coordinates from the ground plane onto the image plane to estimate the relative position between the detected vehicle and the camera. Then, to estimate the actual and projective size of the preceding vehicle, we propose a new estimation method. This method estimates the range from a preceding vehicle to the camera based on the contact points between its tires and the ground and then estimates the actual size of the vehicle according to the positions of its vertices in the image. Because the projective size of a vehicle varies with its distance to the camera, we also present a simple and rapid method of estimating a vehicle's projective height, which reduces the computational time for size estimation in real-time systems. Errors caused by the application of different camera parameters are also estimated and analyzed in this study. The estimation results are used to determine suitable parameters during camera installation to suppress estimation errors. Finally, to guarantee the robustness of the detection system, a new efficient approach to dynamic calibration is presented to obtain accurate camera parameters, even when they are changed by camera vibration due to on-road driving. Experimental results demonstrate that our approaches provide accurate and robust estimates of the range and size of target vehicles.
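
    The core idea of ranging from tire-to-ground contact points can be illustrated with a simple flat-ground pinhole model: the image row of the contact point, together with the camera height, tilt and focal length, determines the range along the ground. The parameter values below are hypothetical, not the paper's calibration.

```python
# Simple flat-ground pinhole model (illustrative parameters only): estimate
# the range to the point where a tire touches the road from its image row.
import math

def ground_range(v_row, v0, focal_px, cam_height_m, tilt_rad):
    # angle below the optical axis of the ray through image row v_row
    phi = math.atan((v_row - v0) / focal_px)
    angle_below_horizon = tilt_rad + phi
    if angle_below_horizon <= 0:
        raise ValueError("ray does not intersect the ground plane")
    return cam_height_m / math.tan(angle_below_horizon)

# hypothetical setup: 1.3 m camera height, 3 degrees downward tilt,
# 800 px focal length, principal point at row 240
for row in (300, 350, 420):
    print(row, round(ground_range(row, v0=240, focal_px=800,
                                  cam_height_m=1.3, tilt_rad=math.radians(3)), 1))
```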

  • A New Approach to Rotation Invariant Texture Analysis Based on Radon Transform

    Mehdi CHEHEL AMIRANI  Ali A. BEHESHTI SHIRAZI  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1736-1744

    In this paper, we propose a new approach to rotation-invariant texture analysis. This method uses the Radon transform with some considerations for estimating the direction of textural images. Furthermore, it utilizes the information obtained from the number of peaks in the variance array of the Radon transform as a feature. The textural features are then generated after rotating the texture along its principal direction. Also, a simple technique is presented to eliminate the error introduced by rotating the texture. Experimental results on a set of images from the Brodatz album show the good performance achieved by the proposed method in comparison with some recent texture analysis methods.

  • Local Image Descriptors Using Supervised Kernel ICA

    Masaki YAMAZAKI  Sidney FELS  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1745-1751

    PCA-SIFT is an extension of SIFT which aims to reduce SIFT's high dimensionality (128 dimensions) by applying PCA to the gradient image patches. However, PCA is not a discriminative representation for recognition, due to its global feature nature and unsupervised algorithm. In addition, linear methods such as PCA and ICA can fail in the case of non-linearity. In this paper, we propose a new discriminative method called Supervised Kernel ICA (SKICA) that uses a non-linear kernel approach combined with supervised ICA-based local image descriptors. Our approach blends the advantages of supervised learning with the non-linear properties of kernels. Using five different test data sets, we show that the SKICA descriptors produce better object recognition performance than other related approaches with the same dimensionality. The SKICA-based representation has local sensitivity, non-linear independence and high class separability, providing an effective method for local image descriptors.

  • Development of an Interactive Augmented Environment and Its Application to Autonomous Learning for Quadruped Robots

    Hayato KOBAYASHI  Tsugutoyo OSAKI  Tetsuro OKUYAMA  Joshua GRAMM  Akira ISHINO  Ayumi SHINOHARA  

     
    PAPER-Multimedia Pattern Processing

      Page(s):
    1752-1761

    This paper describes an interactive experimental environment for autonomous soccer robots, which is a soccer field augmented by utilizing camera input and projector output. This environment, in a sense, plays an intermediate role between simulated environments and real environments. We can simulate some parts of real environments, e.g., real objects such as robots or a ball, and reflect simulated data into the real environments, e.g., to visualize the positions on the field, so as to create a situation that allows easy debugging of robot programs. The significant point compared with analogous work is that virtual objects are touchable in this system owing to projectors. We also show the portable version of our system that does not require ceiling cameras. As an application in the augmented environment, we address the learning of goalie strategies on real quadruped robots in penalty kicks. We make our robots utilize virtual balls in order to perform only quadruped locomotion in real environments, which is quite difficult to simulate accurately. Our robots autonomously learn and acquire more beneficial strategies without human intervention in our augmented environment than those in a fully simulated environment.

  • Imposing Constraints from the Source Tree on ITG Constraints for SMT

    Hirofumi YAMAMOTO  Hideo OKUMA  Eiichiro SUMITA  

     
    PAPER-Natural Language Processing

      Page(s):
    1762-1770

    In current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To resolve this problem, many word-reordering constraint techniques have been proposed. Inversion transduction grammar (ITG) is one of these constraints. Under ITG constraints, the target-side word order is obtained by rotating nodes of the source-side binary tree. In these node rotations, the source binary tree instance is not considered. Therefore, stronger constraints for word reordering can be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. However, when the source binary tree instance ((a b) (c d)) is given, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight word orderings. The reduction in the number of word-order permutations by our proposed stronger constraints efficiently suppresses erroneous word orderings. In our experiments with IST-ITG using the NIST MT08 English-to-Chinese translation track's data, the proposed method resulted in a 1.8-point improvement in character BLEU-4 (from 35.2 to 37.0) and a 6.2-point reduction in CER (from 74.1% to 67.9%) compared with our baseline condition.
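
    The eight-ordering example from the abstract can be reproduced with a small enumeration: fixing the source binary tree ((a b) (c d)) and optionally swapping the children of each of its three internal nodes yields 2^3 = 8 target orders. The code is a minimal sketch of that counting argument, not the decoder-side constraint implementation.

```python
# Enumerate target word orders reachable from a fixed source binary tree by
# swapping (or not) the children of each internal node, as in IST-ITG.
from itertools import product

def orderings(tree):
    if isinstance(tree, str):              # leaf: a single word
        return [[tree]]
    left, right = tree
    results = []
    for l, r in product(orderings(left), orderings(right)):
        results.append(l + r)              # keep the original order
        results.append(r + l)              # rotate (swap children) at this node
    seen, unique = set(), []               # deduplicate, preserving order
    for order in results:
        key = tuple(order)
        if key not in seen:
            seen.add(key)
            unique.append(order)
    return unique

tree = (('a', 'b'), ('c', 'd'))
perms = orderings(tree)
print(len(perms))                          # 8
print([' '.join(p) for p in perms])
```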

  • Ranking Multiple Dialogue States by Corpus Statistics to Improve Discourse Understanding in Spoken Dialogue Systems

    Ryuichiro HIGASHINAKA  Mikio NAKANO  

     
    PAPER-Natural Language Processing

      Page(s):
    1771-1782

    This paper discusses the discourse understanding process in spoken dialogue systems. This process enables a system to understand user utterances from the context of a dialogue. Ambiguity in user utterances caused by multiple speech recognition hypotheses and parsing results sometimes makes it difficult for a system to decide on a single interpretation of a user intention. As a solution, the idea of retaining possible interpretations as multiple dialogue states and resolving the ambiguity using succeeding user utterances has been proposed. Although this approach has proven to improve discourse understanding accuracy, carefully created hand-crafted rules are necessary in order to accurately rank the dialogue states. This paper proposes automatically ranking multiple dialogue states using statistical information obtained from dialogue corpora. The experimental results in the train ticket reservation and weather information service domains show that the statistical information can significantly improve the ranking accuracy of dialogue states as well as the slot accuracy and the concept error rate of the top-ranked dialogue states.

  • Study on Entropy and Similarity Measure for Fuzzy Set

    Sang-Hyuk LEE  Keun Ho RYU  Gyoyong SOHN  

     
    LETTER-Computation and Computational Models

      Page(s):
    1783-1786

    In this study, we investigated the relationship between similarity measures and entropy for fuzzy sets. First, we developed fuzzy entropy by using the distance measure for fuzzy sets. We pointed out that the distance between the fuzzy set and the corresponding crisp set equals fuzzy entropy. We also found that the sum of the similarity measure and the entropy between the fuzzy set and the corresponding crisp set constitutes the total information in the fuzzy set. Finally, we derived a similarity measure from entropy and showed by a simple example that the maximum similarity measure can be obtained using a minimum entropy formulation.
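
    The following toy sketch illustrates the kind of relationship the abstract states, using one simple formulation chosen for demonstration (our own, not necessarily the authors'): fuzzy entropy as the normalized distance to the nearest crisp set, similarity to that crisp set as its complement, so the two always sum to the total information (here normalized to 1).

```python
# Toy numeric sketch of "similarity + entropy = total information" for a
# fuzzy set, using a simple normalized-distance formulation.
import numpy as np

def nearest_crisp(mu):
    return (np.asarray(mu) >= 0.5).astype(float)

def fuzzy_entropy(mu):
    mu = np.asarray(mu, dtype=float)
    # normalized distance to the nearest crisp set, scaled so that the
    # maximally fuzzy set (all memberships 0.5) has entropy 1
    return 2.0 * np.mean(np.abs(mu - nearest_crisp(mu)))

def similarity_to_crisp(mu):
    return 1.0 - fuzzy_entropy(mu)

mu = [0.1, 0.4, 0.5, 0.8, 1.0]
print(fuzzy_entropy(mu), similarity_to_crisp(mu),
      fuzzy_entropy(mu) + similarity_to_crisp(mu))   # sums to 1.0
```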

  • Efficient Predicate Matching over Continuous Data Streams

    Hyeon-Gyu KIM  Woo-Lam KANG  Yoon-Joon LEE  Myoung-Ho KIM  

     
    LETTER-Database

      Page(s):
    1787-1790

    In this paper, we propose a predicate indexing method which handles equality and inequality tests separately. Our method uses a hash table for the equality tests and a balanced binary search tree for the inequality tests. Such a separate structure reduces the height of the search tree and the number of comparisons per tree node, as well as the cost of tree rebalancing. We compared our method with the IBS-tree, which is one of the popular indexing methods suitable for data stream processing. Our experimental results show that the proposed method provides better insertion and search performance than the IBS-tree.
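
    A minimal sketch of the separate-structure idea for one attribute: a hash table (dict) answers equality predicates, while a sorted list queried with binary search answers "less than" predicates. The paper uses a balanced binary search tree for the inequality side; the sorted list here is a stand-in, and all names are illustrative.

```python
# Separate structures for equality and inequality predicates over one attribute.
import bisect
from collections import defaultdict

class PredicateIndex:
    def __init__(self):
        self.eq = defaultdict(list)   # value -> ids of "attr == value" predicates
        self.lt_keys = []             # sorted bounds of "attr < bound" predicates
        self.lt_preds = []            # predicate ids, parallel to lt_keys

    def add_equals(self, value, pred_id):
        self.eq[value].append(pred_id)

    def add_less_than(self, bound, pred_id):
        pos = bisect.bisect_left(self.lt_keys, bound)
        self.lt_keys.insert(pos, bound)
        self.lt_preds.insert(pos, pred_id)

    def match(self, value):
        matched = list(self.eq.get(value, []))
        # every "attr < bound" predicate with bound strictly greater than value matches
        pos = bisect.bisect_right(self.lt_keys, value)
        matched.extend(self.lt_preds[pos:])
        return matched

idx = PredicateIndex()
idx.add_equals(42, 'p1')
idx.add_less_than(100, 'p2')
idx.add_less_than(10, 'p3')
print(idx.match(42))   # ['p1', 'p2']
```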

  • Implementation of Both High-Speed Transmission and Quality of System for Internet Protocol Multicasting Services

    Byounghee SON  Youngchoong PARK  Euiseok NAHM  

     
    LETTER-Networks

      Page(s):
    1791-1793

    This paper introduces a system providing both high-speed transmission and quality of system for offering Internet services on an HFC (Hybrid Fiber Coaxial) network. It does so by modulating the phase and the amplitude of the IPMS (Internet Protocol Multicasting Service) signal. An IP-cable transmitter, an IP-cable modem, and IP-cable management servers that support 30-Mbps IPMS on the HFC network were developed. The system provides a 21-Mbps HDTV transport stream on a cable TV network and can sustain a clear picture for a long time.

  • Robust Relative Transfer Function Estimation for Dual Microphone-Based Generalized Sidelobe Canceller

    Kihyeon KIM  Hanseok KO  

     
    LETTER-Speech and Hearing

      Page(s):
    1794-1797

    In this Letter, a robust system identification method is proposed for the generalized sidelobe canceller using dual microphones. The conventional transfer-function generalized sidelobe canceller employs the non-stationarity characteristics of the speech signal to estimate the relative transfer function and thus is difficult to apply when the noise is also non-stationary. Under the assumption of W-disjoint orthogonality between the speech and the non-stationary noise, the proposed algorithm finds the speech-dominant time-frequency bins of the input signal by inspecting the system output and the inter-microphone time delay. Only these bins are used to estimate the relative transfer function, so reliable estimates can be obtained under non-stationary noise conditions. The experimental results show that the proposed algorithm significantly improves the performance of the transfer-function generalized sidelobe canceller, while only sustaining a modest estimation error in adverse non-stationary noise environments.

  • Approximate Decision Function and Optimization for GMM-UBM Based Speaker Verification

    Xiang XIAO  Xiang ZHANG  Haipeng WANG  Hongbin SUO  Qingwei ZHAO  Yonghong YAN  

     
    LETTER-Speech and Hearing

      Page(s):
    1798-1802

    The GMM-UBM framework has proven to be one of the most effective approaches to the automatic speaker verification (ASV) task in recent years. In this letter, we first propose an approximate decision function for the traditional GMM-UBM, from which it is shown that each Gaussian component contributes equally to classification. However, research in speaker perception shows that different speech sound units, as defined by the Gaussian components, make different contributions to speaker verification. This motivates us to emphasize the sound units which discriminate between speakers while de-emphasizing the sound units which contain little information for speaker verification. Experiments on the 2006 NIST SRE core task show that the proposed approach outperforms the traditional GMM-UBM approach in classification accuracy.
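
    To make the scoring setup concrete, here is a sketch of a frame-level log-likelihood ratio between a target GMM and the UBM with optional per-component emphasis weights. The way the weights are applied (re-weighting mixture weights) is an assumption for illustration, not the letter's formulation, and the toy models are random.

```python
# Illustrative GMM-UBM log-likelihood-ratio scoring with per-component
# emphasis weights (diagonal covariances).
import numpy as np
from scipy.special import logsumexp

def gmm_loglik(frames, weights, means, variances):
    # frames: (T, D); weights: (M,); means, variances: (M, D)
    diff = frames[:, None, :] - means[None, :, :]                    # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    return logsumexp(log_comp + np.log(weights), axis=1)             # (T,)

def llr_score(frames, target, ubm, emphasis=None):
    w_t, w_u = target['w'], ubm['w']
    if emphasis is not None:                  # boost/attenuate individual components
        w_t = w_t * emphasis / np.sum(w_t * emphasis)
        w_u = w_u * emphasis / np.sum(w_u * emphasis)
    ll_t = gmm_loglik(frames, w_t, target['mu'], target['var'])
    ll_u = gmm_loglik(frames, w_u, ubm['mu'], ubm['var'])
    return float(np.mean(ll_t - ll_u))

rng = np.random.default_rng(3)
ubm = {'w': np.full(4, 0.25), 'mu': rng.standard_normal((4, 10)),
       'var': np.ones((4, 10))}
target = {'w': ubm['w'], 'mu': ubm['mu'] + 0.1 * rng.standard_normal((4, 10)),
          'var': ubm['var']}
frames = rng.standard_normal((200, 10))
print(llr_score(frames, target, ubm, emphasis=np.array([2.0, 1.0, 1.0, 0.5])))
```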

  • Color Image Retrieval Based on Distance-Weighted Boundary Predictive Vector Quantization Index Histograms

    Zhen SUN  Zhe-Ming LU  Hao LUO  

     
    LETTER-Image Processing and Video Processing

      Page(s):
    1803-1806

    This Letter proposes a new kind of feature for color image retrieval based on Distance-weighted Boundary Predictive Vector Quantization (DWBPVQ) index histograms. For each color image in the database, six histograms (two for each color component) are calculated from the six corresponding DWBPVQ index sequences. The retrieval simulation results show that, compared with the traditional spatial-domain color-histogram-based (SCH) features and the DCTVQ index histogram-based (DCTVQIH) features, the proposed DWBPVQIH features can greatly improve the recall and precision performance.

  • Threshold Selection Based on Interval-Valued Fuzzy Sets

    Chang Sik SON  Suk Tae SEO  In Keun LEE  Hye Cheun JEONG  Soon Hak KWON  

     
    LETTER-Image Recognition, Computer Vision

      Page(s):
    1807-1810

    We propose a thresholding method based on interval-valued fuzzy sets which are used to define the grade of a gray level belonging to one of the two classes, an object and the background of an image. The effectiveness of the proposed method is demonstrated by comparing our classification results on eight test images to results from the conventional methods.

  • Natural Scene Classification Based on Integrated Topic Simplex

    Tang YINGJUN  Xu DE  Yang XU  Liu QIFANG  

     
    LETTER-Image Recognition, Computer Vision

      Page(s):
    1811-1814

    We present a novel model named the Integrated Latent Topic Model (ILTM) to learn and recognize natural scene categories. Unlike previous work, which considered the discrepancies and common properties among all categories separately, our approach combines universal topics from all categories with specific topics from each category. As a result, the model produces a few specific topics and more generic topics among categories, and each category is represented in a different topic simplex, which correlates well with human scene understanding. We investigate the classification performance on various scene category tasks. The experiments show that our model outperforms latent-space methods while requiring less training data.

  • Fusion of Multiple Facial Features for Age Estimation

    Li LU  Pengfei SHI  

     
    LETTER-Image Recognition, Computer Vision

      Page(s):
    1815-1818

    A novel age estimation method is presented which improves performance by fusing complementary information acquired from global and local facial features. Two-directional two-dimensional principal component analysis ((2D)^2 PCA) is used for dimensionality reduction and the construction of individual feature spaces. Each feature space contributes a confidence value which is calculated by support vector machines (SVMs). The confidence values of all the facial features are then fused for the final age estimation. Experimental results demonstrate that fusing multiple facial features achieves significant accuracy gains over any single feature. Finally, we propose a fusion method that further improves accuracy.
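
    As a toy illustration of confidence fusion across feature spaces, the sketch below trains one SVM per hypothetical facial-feature space on random data, averages the per-feature class probabilities over coarse age groups, and takes the final decision from the fused scores. The feature names, dimensions and averaging rule are assumptions, not the paper's (2D)^2 PCA pipeline or fusion method.

```python
# Toy sketch of confidence fusion: one SVM per feature space, fused by
# averaging the per-feature class probabilities (random demo data).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
n, ages = 300, np.array([0, 1, 2])                 # three coarse age groups
labels = rng.choice(ages, size=n)
# hypothetical global and local feature spaces (e.g. whole face, eyes, mouth)
feature_spaces = {name: rng.standard_normal((n, dim)) + labels[:, None] * 0.5
                  for name, dim in [('global', 20), ('eyes', 10), ('mouth', 10)]}

models = {name: SVC(probability=True, random_state=0).fit(X[:250], labels[:250])
          for name, X in feature_spaces.items()}

def fused_prediction(index):
    probs = np.mean([models[name].predict_proba(X[index:index + 1])[0]
                     for name, X in feature_spaces.items()], axis=0)
    return ages[int(np.argmax(probs))]

print([fused_prediction(i) for i in range(250, 260)], labels[250:260])
```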