
Author Search Result

[Author] Hua XIAO (9 hits)

  • Learning Convolutional Domain-Robust Representations for Cross-View Face Recognition

    Xue CHEN  Chunheng WANG  Baihua XIAO  Song GAO  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2014/09/08
    Vol: E97-D No:12
    Page(s): 3239-3243

    This paper proposes to obtain high-level, domain-robust representations for cross-view face recognition. Specifically, we introduce Convolutional Deep Belief Networks (CDBN) as the feature learning model, and a CDBN-based interpolating path between the source and target views is built to model the correlation of cross-view data. Experimental results are promising and outperform other state-of-the-art methods.

  • Nonlinear Metric Learning with Deep Independent Subspace Analysis Network for Face Verification

    Xinyuan CAI  Chunheng WANG  Baihua XIAO  Yunxue SHAO  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E96-D No:12
    Page(s): 2830-2838

    Face verification is the task of determining whether two given face images represent the same person or not. It is a very challenging task, as face images captured in uncontrolled environments may have large variations in illumination, expression, pose, background, etc. The crucial problem is how to compute the similarity of two face images. Metric learning has provided a viable solution to this problem. Until now, many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation. In this paper, we propose a nonlinear metric learning method, which learns an explicit mapping from the original space to an optimal subspace using a deep Independent Subspace Analysis (ISA) network. Compared to linear or kernel-based metric learning methods, the proposed deep ISA network is a deep and local learning architecture, and therefore exhibits a more powerful ability to capture the nature of highly variable datasets. We evaluate our method on the Labeled Faces in the Wild dataset, and the results show superior performance over several state-of-the-art methods.
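
    As a rough, hypothetical sketch of the verification pipeline described above (not the authors' learned deep ISA network or its training procedure), the following code applies an ISA-style nonlinear mapping to each face descriptor and decides "same person" by thresholding the distance between the two embeddings; the filter matrices W1 and W2, the subspace size, and the threshold are all placeholder assumptions.

```python
import numpy as np

def isa_layer(x, W, subspace_size=2):
    """One Independent Subspace Analysis (ISA) unit: squared linear filter
    responses are pooled within small subspaces and then square-rooted."""
    s = (W @ x) ** 2                               # squared filter responses
    s = s.reshape(-1, subspace_size).sum(axis=1)   # pool within each subspace
    return np.sqrt(s + 1e-12)

def embed(x, W1, W2, subspace_size=2):
    """Two stacked ISA layers standing in for a learned deep nonlinear mapping.
    W1 (shape (2m, d)) and W2 (shape (2k, m)) are hypothetical filter matrices."""
    return isa_layer(isa_layer(x, W1, subspace_size), W2, subspace_size)

def same_person(x1, x2, W1, W2, threshold=1.0):
    """Verify a face pair by thresholding the distance between the embeddings."""
    d = np.linalg.norm(embed(x1, W1, W2) - embed(x2, W1, W2))
    return d < threshold
```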

  • Learning Co-occurrence of Local Spatial Strokes for Robust Character Recognition

    Song GAO  Chunheng WANG  Baihua XIAO  Cunzhao SHI  Wen ZHOU  Zhong ZHANG  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E97-D No:7
    Page(s): 1937-1941

    In this paper, we propose a representation method based on local spatial strokes for scene character recognition. High-level semantic information, namely the co-occurrence of several strokes, is incorporated by learning a sparse dictionary, which further suppresses the noise introduced by single-stroke detectors. The encouraging results outperform state-of-the-art algorithms.
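
    To make the coding step concrete, here is a minimal sketch of generic sparse coding over a hypothetical learned stroke dictionary using ISTA; it illustrates how a stroke-response vector could be encoded sparsely, not the letter's actual dictionary learning or stroke detectors.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam=0.1, n_iter=200):
    """Encode a stroke-response vector x over dictionary D (columns are atoms)
    by approximately solving min_c 0.5*||x - D c||^2 + lam*||c||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the smooth part
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - x)       # gradient of the least-squares term
        c = soft_threshold(c - grad / L, lam / L)
    return c                           # sparse coefficients over stroke atoms
```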

  • Scene Text Character Recognition Using Spatiality Embedded Dictionary

    Song GAO  Chunheng WANG  Baihua XIAO  Cunzhao SHI  Wen ZHOU  Zhong ZHANG  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E97-D No:7
    Page(s): 1942-1946

    This paper models spatial layout beyond the traditional spatial pyramid (SP) in the coding/pooling scheme for scene text character recognition. Specifically, we propose a novel method to build a spatiality embedded dictionary (SED), in which each codeword represents a particular character stroke and is associated with a local response region. The promising results outperform other state-of-the-art algorithms.

  • Point-Manifold Discriminant Analysis for Still-to-Video Face Recognition

    Xue CHEN  Chunheng WANG  Baihua XIAO  Yunxue SHAO  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E97-D No:10
    Page(s): 2780-2789

    In Still-to-Video (S2V) face recognition, only a few high-resolution images are registered for each subject, while the probes are video clips with complex variations. As faces present distinct characteristics under different scenarios, recognition in the original space is inefficient. Thus, in this paper, we propose a novel discriminant analysis method that learns separate mappings for the different scenario patterns (still, video) and further pursues a common discriminant space based on these mappings. Concretely, by modeling each video as a manifold and each image as point data, we formulate the scenario-oriented mapping learning as a Point-Manifold Discriminant Analysis (PMDA) framework. The learning objective incorporates intra-class compactness and inter-class separability for good discrimination. Experiments on the COX-S2V dataset demonstrate the effectiveness of the proposed method.
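
    For intuition only, the sketch below implements the classical Fisher-style criterion (intra-class compactness versus inter-class separability) on plain feature vectors; it is not the point-to-manifold formulation of PMDA, and the variable names are illustrative.

```python
import numpy as np

def fisher_projection(X, y, out_dim):
    """Classical Fisher discriminant projection: maximize between-class scatter
    while minimizing within-class scatter.
    X: samples as rows, shape (n, d); y: integer class labels, shape (n,)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))              # within-class (compactness) scatter
    Sb = np.zeros((d, d))              # between-class (separability) scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_all).reshape(-1, 1)
        Sb += Xc.shape[0] * (diff @ diff.T)
    # generalized eigenproblem Sb w = lambda Sw w, solved via the pseudo-inverse
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-eigvals.real)
    return eigvecs[:, order[:out_dim]].real   # projection matrix, shape (d, out_dim)
```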

  • Learning Discriminative Features for Ground-Based Cloud Classification via Mutual Information Maximization

    Shuang LIU  Zhong ZHANG  Baihua XIAO  Xiaozhong CAO  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2015/03/24
    Vol: E98-D No:7
    Page(s): 1422-1425

    Texture feature descriptors such as local binary patterns (LBP) have proven effective for ground-based cloud classification. Traditionally, these texture feature descriptors are predefined in a handcrafted way. In this paper, we propose a novel method that automatically learns discriminative features from labeled samples for ground-based cloud classification. Our key idea is to learn these features through mutual information maximization, which learns a transformation matrix for the local difference vectors of LBP. The experimental results show that our learned features greatly improve the performance of ground-based cloud classification compared to other state-of-the-art methods.
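
    For reference, a minimal implementation of the standard handcrafted LBP descriptor that the learned features build on is sketched below; the learned transformation matrix and the mutual-information objective of the letter are not shown.

```python
import numpy as np

def lbp_codes(gray):
    """Basic 3x3 local binary pattern: each interior pixel is encoded by
    comparing its 8 neighbours to the centre and packing the bits into a code."""
    g = np.asarray(gray, dtype=np.float64)
    c = g[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(c.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes += (nb >= c).astype(np.int64) << bit
    return codes

def lbp_histogram(gray):
    """Normalized 256-bin histogram of LBP codes, used as a texture descriptor."""
    hist = np.bincount(lbp_codes(gray).ravel(), minlength=256)
    return hist / max(hist.sum(), 1)
```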

  • Construction of Frequency-Hopping/Time-Spreading Two-Dimensional Optical Codes Using Quadratic and Cubic Congruence Code

    Chongfu ZHANG  Kun QIU  Yu XIANG  Hua XIAO  

     
    PAPER-Fundamental Theories for Communications

    Vol: E94-B No:7
    Page(s): 1883-1891

    Quadratic congruence code (QCC)-based frequency-hopping and time-spreading (FH/TS) optical orthogonal codes (OOCs), and the corresponding expanded cardinality, were recently studied to improve data throughput and code capacity. In this paper, we propose a new FH/TS two-dimensional (2-D) code using the QCC and the cubic congruence code (CCC), named the QCC/CCC 2-D code. Additionally, the expanded CCC-based 2-D codes are also considered. In contrast to the conventional QCC-based 1-D and QCC-based FH/TS 2-D optical codes, our analysis indicates that the code capacity of the CCC-based 1-D and CCC-based FH/TS 2-D codes can be improved for the same code weight and length, respectively.
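
    As a small illustration of the congruence operators involved (the exact placement operators and the 2-D code construction in the paper may differ), the sketch below generates quadratic- and cubic-congruence hopping sequences over a prime p and pairs them with time chips to form a frequency-hopping pattern; the prime, degree, and multiplier are example values.

```python
def congruence_sequence(a, p, degree):
    """Congruence placement sequence y_a(k) = a * k**degree (mod p), k = 0..p-1.
    degree = 2 gives a quadratic-congruence (QCC-style) sequence,
    degree = 3 gives a cubic-congruence (CCC-style) sequence."""
    return [(a * pow(k, degree, p)) % p for k in range(p)]

def fh_pattern(a, p, degree):
    """Frequency-hopping pattern: time chip k uses wavelength index y_a(k)."""
    return list(enumerate(congruence_sequence(a, p, degree)))

if __name__ == "__main__":
    p = 7   # prime number of wavelengths / time chips (example value)
    for a in range(1, p):
        print(f"a = {a}, cubic sequence: {congruence_sequence(a, p, 3)}")
```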

  • Modeling Interactions between Low-Level and High-Level Features for Human Action Recognition

    Wen ZHOU  Chunheng WANG  Baihua XIAO  Zhong ZHANG  Yunxue SHAO  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E96-D No:12
    Page(s): 2896-2899

    Recognizing human action in complex scenes is a challenging problem in computer vision. Action-unrelated concepts, such as camera position, can significantly affect the appearance of local spatio-temporal features, and the performance of methods based on low-level features therefore degrades. In this letter, we treat the action-unrelated concept of camera position as a high-level feature. We observe that it can serve as a prior for local spatio-temporal features in human action recognition. We encode this prior by modeling interactions between spatio-temporal features and camera position features, and we infer the camera position features from local spatio-temporal features via these interactions. The parameters of this model are estimated by a new max-margin algorithm. We evaluate the proposed method on the KTH, IXMAS and YouTube action datasets. Experimental results show the effectiveness of the proposed method.

  • A Robust Sound Source Localization Approach for Microphone Array with Model Errors

    Hua XIAO  Huai-Zong SHAO  Qi-Cong PENG  

     
    PAPER-Speech and Hearing

    Vol: E91-A No:8
    Page(s): 2062-2067

    In this paper, a robust sound source localization approach is proposed. The approach retains good performance even when model errors exist. Compared with previous work in this field, the contributions of this paper are as follows. First, an improved broadband, near-field array model is proposed. It takes array gain and phase perturbations into account, is based on the actual positions of the array elements, and can be used with arrays of arbitrary planar geometry. Second, a subspace-based model-error estimation algorithm and a Weighted 2-Dimensional Multiple Signal Classification (W2D-MUSIC) algorithm are proposed. The model-error estimation algorithm estimates the unknown parameters of the array model, i.e., the gain and phase perturbations and the positions of the elements, with high accuracy, and its performance improves as the SNR or the number of snapshots increases. The W2D-MUSIC algorithm, based on the improved array model, is then used to locate sound sources. Together, these two algorithms constitute the robust sound source localization approach. The more accurate steering vectors can also be provided for further processing such as adaptive beamforming. Numerical examples confirm the effectiveness of the proposed approach.
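
    To illustrate the subspace principle underlying W2D-MUSIC, the following sketch implements plain narrowband, far-field MUSIC for a linear array; the broadband near-field model, the weighting, and the model-error estimation of the paper are not reproduced, and all parameter names are placeholders.

```python
import numpy as np

def music_spectrum(X, element_x, wavelength, angles_deg, num_sources):
    """Narrowband far-field MUSIC pseudospectrum.
    X: complex snapshot matrix, shape (num_elements, num_snapshots).
    element_x: element positions along the array axis in metres."""
    R = X @ X.conj().T / X.shape[1]             # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
    En = eigvecs[:, :-num_sources]              # noise-subspace eigenvectors
    spectrum = []
    for theta in np.deg2rad(angles_deg):
        # ideal far-field steering vector for direction theta
        a = np.exp(-2j * np.pi * element_x * np.sin(theta) / wavelength)
        denom = np.linalg.norm(En.conj().T @ a) ** 2
        spectrum.append(1.0 / max(denom, 1e-12))
    return np.array(spectrum)                   # peaks indicate source directions
```

    Peaks of the returned pseudospectrum over angles_deg correspond to estimated source directions; the paper's W2D-MUSIC instead scans a weighted 2-D near-field grid using the calibrated steering vectors obtained from the model-error estimation step.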