Author Search Result

[Author] Zheng MA (8 hits)

1-8 of 8 hits
  • Pre-Processing for Fine-Grained Image Classification

    Hao GE  Feng YANG  Xiaoguang TU  Mei XIE  Zheng MA  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2017/05/12  Vol: E100-D No:8  Page(s): 1938-1942

    Recently, numerous methods have been proposed to tackle the problem of fine-grained image classification, but few of them focus on the pre-processing step of image alignment. In this paper, we propose a new pre-processing method that aims to reduce the variance among objects of the same class, so that the variance between objects of different classes becomes more significant. The proposed approach consists of four procedures: the “parts” of the objects are first located; the rotation angle and the bounding box are then obtained from the spatial relationship of the “parts”; finally, all the images are resized to similar sizes. After being processed by the proposed method, the objects in the images are invariant to translation, scale, and rotation. Experiments on the CUB-200-2011 and CUB-200-2010 datasets demonstrate that the proposed method can boost recognition performance when serving as a pre-processing step for several popular classification algorithms.
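
    A minimal sketch of this kind of part-based alignment, assuming the locations of two "parts" (e.g. keypoints supplied by a part detector, which is not shown) are already known: the rotation angle comes from the line joining the parts, the bounding box from their extent, and the crop is resized to a common size. Function names, the margin, and the output size are illustrative, not the authors' implementation.

        import cv2
        import numpy as np

        def align_by_parts(image, part_a, part_b, out_size=(224, 224), margin=0.2):
            """Rotate so the part_a -> part_b axis is horizontal, crop around the
            parts with a relative margin, and resize to a fixed size (sketch only)."""
            (xa, ya), (xb, yb) = part_a, part_b
            angle = np.degrees(np.arctan2(yb - ya, xb - xa))   # rotation from part geometry
            center = ((xa + xb) / 2.0, (ya + yb) / 2.0)
            rot = cv2.getRotationMatrix2D(center, angle, 1.0)
            h, w = image.shape[:2]
            rotated = cv2.warpAffine(image, rot, (w, h))
            # bounding box around the rotated parts, padded by a relative margin
            pts = cv2.transform(np.array([[part_a, part_b]], dtype=np.float32), rot)[0]
            x0, y0 = pts.min(axis=0)
            x1, y1 = pts.max(axis=0)
            pad = margin * max(x1 - x0, y1 - y0, 1.0)
            x0, y0 = int(max(x0 - pad, 0)), int(max(y0 - pad, 0))
            x1, y1 = int(min(x1 + pad, w)), int(min(y1 + pad, h))
            crop = rotated[y0:y1, x0:x1]
            return cv2.resize(crop, out_size)                  # common scale for all images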

  • Visual Recognition Method Based on Hybrid KPCA Network

    Feng YANG  Zheng MA  Mei XIE  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2020/05/28  Vol: E103-D No:9  Page(s): 2015-2018

    In this paper, we propose a deep visual recognition model based on a hybrid KPCA network (H-KPCANet), which combines a one-stage KPCANet and a two-stage KPCANet. The proposed model consists of four types of basic components: the input layer, the one-stage KPCANet, the two-stage KPCANet, and the fusion layer. The one-stage KPCANet computes KPCA filters for its convolution layer, while the two-stage KPCANet learns PCA filters in its first stage and KPCA filters in its second stage. After binary quantization mapping and block-wise histogram encoding, the features from the two types of KPCANets are fused in the fusion layer, and the final feature of the input image is obtained by a weighted serial combination of the two feature types. The proposed algorithm is tested on digit recognition and object classification, and the experimental results on the MNIST and CIFAR-10 visual recognition benchmarks validate the performance of H-KPCANet.
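
    A toy sketch of the fusion-layer idea only: two feature vectors, standing in for the one-stage and two-stage KPCANet outputs after binary quantization and block-wise histogramming (not reproduced here), are combined by a weighted serial rule, i.e. scaled and concatenated. The weight value is an assumed illustration, not the paper's setting.

        import numpy as np

        def weighted_serial_fusion(feat_one_stage, feat_two_stage, alpha=0.5):
            """Weighted serial combination: scale each branch and concatenate.
            alpha balances the one-stage and two-stage KPCANet features."""
            f1 = np.asarray(feat_one_stage, dtype=np.float64)
            f2 = np.asarray(feat_two_stage, dtype=np.float64)
            return np.concatenate([alpha * f1, (1.0 - alpha) * f2])

        # toy usage with random stand-ins for the block-wise histogram features
        rng = np.random.default_rng(0)
        fused = weighted_serial_fusion(rng.random(256), rng.random(512), alpha=0.6)
        print(fused.shape)   # (768,)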

  • Illumination Normalization for Face Recognition Using Energy Minimization Framework

    Xiaoguang TU  Feng YANG  Mei XIE  Zheng MA  

     
    LETTER-Artificial Intelligence, Data Mining
    Publicized: 2017/03/10  Vol: E100-D No:6  Page(s): 1376-1379

    Numerous methods have been developed to handle lighting variations in the pre-processing step of face recognition. However, most of them use only high-frequency information (edges, lines, corners, etc.) for recognition, since pixels in these areas have higher local variance and are thus insensitive to illumination variations. In this case, low-frequency information may be discarded and some features that are helpful for recognition may be ignored. In this paper, we present a new and efficient method for illumination normalization using an energy minimization framework. The proposed method aims to remove the illumination field of the observed face images while simultaneously preserving the intrinsic facial features. The normalized face image and the illumination field are obtained by a reciprocal iteration scheme. Experiments on the CMU-PIE and Extended Yale B databases show that the proposed method preserves good visual quality even on images with deep shadows and high-brightness regions, and yields promising illumination normalization results that improve face recognition performance.
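
    A highly simplified sketch of the reciprocal-iteration idea, not the paper's energy functional: in the log domain the image is split into a smooth illumination field and a detail-preserving normalized component, and the two estimates are refined in turn. The Gaussian smoothing operator and all parameters are assumptions for illustration.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def normalize_illumination(image, iters=10, sigma=15.0):
            """Alternately re-estimate a smooth illumination field L and the
            normalized face R, with log(I) ~ R + L (toy sketch)."""
            log_i = np.log1p(image.astype(np.float64))
            reflect = np.zeros_like(log_i)
            for _ in range(iters):
                illum = gaussian_filter(log_i - reflect, sigma)  # smooth illumination estimate
                reflect = log_i - illum                          # facial detail given illumination
            reflect -= reflect.min()                             # rescale to [0, 1] for recognition
            return reflect / (reflect.max() + 1e-12)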

  • Codebook Learning for Image Recognition Based on Parallel Key SIFT Analysis

    Feng YANG  Zheng MA  Mei XIE  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2017/01/10  Vol: E100-D No:4  Page(s): 927-930

    The quality of the codebook is very important in visual image classification. To boost classification performance, this paper presents a codebook generation scheme for scene image recognition based on parallel key SIFT analysis (PKSA). The method iteratively applies the classical k-means clustering algorithm and similarity analysis to extract key SIFT descriptors (KSDs) from the input images, and then generates the codebook from the set of KSDs with a relaxed k-means algorithm. To evaluate the performance of the PKSA scheme, the image feature vector is computed by sparse coding with spatial pyramid matching (ScSPM) after the codebook is constructed. The PKSA-based ScSPM method is tested and compared on three public scene image datasets. The experimental results show that the proposed PKSA scheme can significantly reduce computation time and improve the categorization rate.
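
    A minimal sketch of the codebook-building step only, using plain k-means from scikit-learn on a stand-in descriptor matrix; the key-descriptor selection and the relaxed k-means of PKSA are not reproduced, and the codebook size is illustrative.

        import numpy as np
        from sklearn.cluster import KMeans

        def build_codebook(descriptors, num_words=1024, seed=0):
            """Cluster local descriptors (e.g. 128-D SIFT) into visual words;
            the cluster centres form the codebook."""
            km = KMeans(n_clusters=num_words, n_init=10, random_state=seed)
            km.fit(descriptors)
            return km.cluster_centers_                  # shape (num_words, 128)

        # toy usage: random vectors stand in for SIFT descriptors from training images
        rng = np.random.default_rng(0)
        codebook = build_codebook(rng.random((5000, 128)), num_words=64)
        print(codebook.shape)                           # (64, 128)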

  • Action Recognition Using Weighted Locality-Constrained Linear Coding

    Jiangfeng YANG  Zheng MA  

     
    LETTER-Image Processing and Video Processing
    Publicized: 2014/10/31  Vol: E98-D No:2  Page(s): 462-466

    Recently, locality-constrained linear coding (LLC) has attracted much attention as a coding strategy, owing to its better reconstruction than sparse coding and vector quantization (VQ). However, LLC ignores the weight information of codewords during the coding stage and assumes that every selected basis has the same credibility, even if their weights differ. To further improve the discriminative power of the LLC code, we propose a weighted LLC (WLLC) algorithm that takes codeword weight information into account. Experiments on the KTH and UCF datasets show that the recognition system based on WLLC achieves better performance than those based on classical LLC and VQ, and outperforms recent representative systems.
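
    A compact sketch of LLC-style coding with an optional per-codeword weight folded into the regularizer, to show where weight information could enter; this is an illustration of the general idea, not the authors' WLLC formulation, and the neighbourhood size and regularization values are assumptions.

        import numpy as np

        def llc_code(x, codebook, weights=None, k=5, lam=1e-4):
            """LLC-style coding of descriptor x over its k nearest codewords via a
            small constrained least-squares problem (codes sum to 1); `weights`
            optionally strengthens the penalty on less credible codewords."""
            d2 = np.sum((codebook - x) ** 2, axis=1)
            idx = np.argsort(d2)[:k]                 # k nearest visual words
            B = codebook[idx]                        # local base, shape (k, D)
            z = B - x                                # shift codewords to the descriptor
            C = z @ z.T                              # local covariance, shape (k, k)
            if weights is None:
                penalty = np.eye(k)
            else:                                    # larger penalty for low-weight codewords
                penalty = np.diag(1.0 / np.maximum(weights[idx], 1e-8))
            c = np.linalg.solve(C + lam * np.trace(C) * penalty, np.ones(k))
            c /= c.sum()                             # shift-invariance constraint
            code = np.zeros(len(codebook))
            code[idx] = c
            return code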

  • Mixture Hyperplanes Approximation for Global Tracking

    Song GU  Zheng MA  Mei XIE  

     
    LETTER-Pattern Recognition
    Publicized: 2015/08/13  Vol: E98-D No:11  Page(s): 2008-2012

    Template tracking has been extensively studied in computer vision and has a wide range of applications. A general framework is to construct a parametric model that predicts movement and tracks the target. The intensity difference between the pixels of the current region and the pixels of the selected target allows a straightforward prediction of the region position in the current image. Traditional methods track the object under the assumption that the relationship between the intensity difference and the region position is either linear or non-linear, which leads to poor tracking performance when only one model is adopted. This paper proposes a method called Mixture Hyperplanes Approximation, based on a finite mixture of generalized linear regression models, to perform robust tracking. Moreover, a fast learning strategy is discussed, which improves the robustness against noise. Experiments demonstrate the performance and stability of Mixture Hyperplanes Approximation.
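
    A minimal sketch of the underlying hyperplane-approximation idea: a linear predictor A that maps the intensity-difference vector to a motion-parameter update is fitted by ridge-regularized least squares over synthetic template perturbations; the paper's mixture model would fit several such predictors and select among them, which is not reproduced here. All names and the random stand-in data are illustrative.

        import numpy as np

        def learn_linear_predictor(intensity_diffs, param_deltas, ridge=1e-3):
            """Fit A such that param_delta ~= A @ intensity_diff, by ridge-regularized
            least squares over many sampled perturbations of the template."""
            D = np.asarray(intensity_diffs)          # (n_samples, n_pixels)
            P = np.asarray(param_deltas)             # (n_samples, n_params)
            G = D.T @ D + ridge * np.eye(D.shape[1])
            return np.linalg.solve(G, D.T @ P).T     # (n_params, n_pixels)

        # toy usage: random data stands in for sampled template perturbations
        rng = np.random.default_rng(0)
        A = learn_linear_predictor(rng.standard_normal((500, 100)),
                                   rng.standard_normal((500, 4)))
        print(A.shape)                               # (4, 100): update = A @ intensity_diff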

  • Spatio-Temporal Self-Attention Weighted VLAD Neural Network for Action Recognition

    Shilei CHENG  Mei XIE  Zheng MA  Siqi LI  Song GU  Feng YANG  

     
    LETTER-Biocybernetics, Neurocomputing
    Publicized: 2020/10/01  Vol: E104-D No:1  Page(s): 220-224

    Characterizing videos simultaneously from spatial and temporal cues has been shown to be crucial for video processing. Because its soft assignment lacks temporal information, the vector of locally aggregated descriptors (VLAD) is a suboptimal framework for learning spatio-temporal video representations. Motivated by the development of attention mechanisms in natural language processing, we present a novel model in which VLAD follows spatio-temporal self-attention operations, named spatio-temporal self-attention weighted VLAD (ST-SAWVLAD). In particular, sequential convolutional feature maps extracted from two modalities, i.e., RGB and optical flow, are fed into the self-attention module to learn soft spatio-temporal assignment parameters, which enables aggregating not only detailed spatial information but also fine motion information from successive video frames. In experiments, we evaluate ST-SAWVLAD on the competitive action recognition datasets UCF101 and HMDB51; the results show outstanding performance. The source code is available at: https://github.com/badstones/st-sawvlad.
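
    A toy numpy sketch of attention-weighted VLAD aggregation over a set of local features: each feature's soft assignment to the cluster centres is modulated by an attention score before the residuals are accumulated. The self-attention module that would produce these scores from RGB and flow feature maps is replaced here by a given weight vector; all names and parameters are illustrative.

        import numpy as np

        def attention_weighted_vlad(features, centers, attn, alpha=10.0):
            """Aggregate features (N, D) against centers (K, D) into a VLAD vector,
            weighting each feature's soft assignment by an attention score attn (N,)."""
            d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (N, K)
            a = np.exp(-alpha * (d2 - d2.min(axis=1, keepdims=True)))          # soft assignment
            a /= a.sum(axis=1, keepdims=True)
            a *= attn[:, None]                                  # attention re-weighting
            resid = features[:, None, :] - centers[None, :, :]  # (N, K, D) residuals
            vlad = (a[:, :, None] * resid).sum(axis=0)          # (K, D)
            vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))        # power normalization
            return (vlad / (np.linalg.norm(vlad) + 1e-12)).ravel()

        # toy usage: 30 local features of dimension 64, 8 cluster centres
        rng = np.random.default_rng(0)
        v = attention_weighted_vlad(rng.random((30, 64)), rng.random((8, 64)), rng.random(30))
        print(v.shape)                                          # (512,)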

  • A Generalized Diagonal Loading Robust Wideband Beam Pattern Synthesis Method

    ChangZheng MA  BoonPoh NG  

     
    LETTER-Digital Signal Processing
    Vol: E88-A No:2  Page(s): 590-592

    Optimum wideband beam pattern synthesis methods are usually sensitive to antenna element gain, phase, and position errors. In this letter, these errors are taken into account in a constrained optimization process, and a generalized diagonal loading algorithm is obtained. Computer simulations demonstrate the robustness of the new method.
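
    A minimal numpy sketch of ordinary diagonal loading in a narrowband MVDR-style weight computation, just to show where the loading term enters the constrained optimization; the letter's generalized loading replaces the scaled identity with a matrix derived from the gain, phase, and position error statistics, which is not reproduced here. The array geometry and loading level are assumptions.

        import numpy as np

        def loaded_mvdr_weights(R, steering, loading=1e-2):
            """Robust beamformer weights with diagonal loading:
            w = (R + gamma*I)^{-1} a / (a^H (R + gamma*I)^{-1} a)."""
            gamma = loading * np.trace(R).real / len(R)      # loading relative to average power
            Rl = R + gamma * np.eye(len(R))
            w_un = np.linalg.solve(Rl, steering)
            return w_un / (steering.conj() @ w_un)           # distortionless toward `steering`

        # toy usage: 8-element half-wavelength array, sample covariance from random snapshots
        rng = np.random.default_rng(0)
        snaps = rng.standard_normal((8, 200)) + 1j * rng.standard_normal((8, 200))
        R = snaps @ snaps.conj().T / 200
        a = np.exp(1j * np.pi * np.arange(8) * np.sin(np.deg2rad(20)))
        w = loaded_mvdr_weights(R, a)
        print(np.abs(w.conj() @ a))                          # unit response toward 20 degrees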