Author Search Result

[Author] Zheng MA (8 hits)

1-8 of 8 hits
  • Pre-Processing for Fine-Grained Image Classification

    Hao GE  Feng YANG  Xiaoguang TU  Mei XIE  Zheng MA  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2017/05/12  Vol: E100-D No:8  Page(s): 1938-1942

    Recently, numerous methods have been proposed to tackle the problem of fine-grained image classification, but few of them focus on the pre-processing step of image alignment. In this paper, we propose a new pre-processing method that aims to reduce the variance among objects of the same class, so that the variance between objects of different classes becomes more significant. The proposed approach consists of four procedures: the “parts” of the objects are first located; the rotation angle and the bounding box are then obtained from the spatial relationship of the “parts”; finally, all the images are resized to similar sizes. After being processed by the proposed method, the objects in the images are invariant to translation, scale, and rotation. Experiments on the CUB-200-2011 and CUB-200-2010 datasets demonstrate that the proposed method can boost recognition performance when serving as a pre-processing step for several popular classification algorithms.
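
    A minimal sketch of this kind of part-based alignment, assuming the locations of two "parts" (e.g. keypoints supplied by a part detector, which is not shown) are already known: the rotation angle comes from the line joining the parts, the bounding box from their extent, and the crop is resized to a common size. Function names, the margin, and the output size are illustrative, not the authors' implementation.

        import cv2
        import numpy as np

        def align_by_parts(image, part_a, part_b, out_size=(224, 224), margin=0.2):
            """Rotate so the part_a -> part_b axis is horizontal, crop around the
            parts with a relative margin, and resize to a fixed size (sketch only)."""
            (xa, ya), (xb, yb) = part_a, part_b
            angle = np.degrees(np.arctan2(yb - ya, xb - xa))   # rotation from part geometry
            center = ((xa + xb) / 2.0, (ya + yb) / 2.0)
            rot = cv2.getRotationMatrix2D(center, angle, 1.0)
            h, w = image.shape[:2]
            rotated = cv2.warpAffine(image, rot, (w, h))
            # bounding box around the rotated parts, padded by a relative margin
            pts = cv2.transform(np.array([[part_a, part_b]], dtype=np.float32), rot)[0]
            x0, y0 = pts.min(axis=0)
            x1, y1 = pts.max(axis=0)
            pad = margin * max(x1 - x0, y1 - y0, 1.0)
            x0, y0 = int(max(x0 - pad, 0)), int(max(y0 - pad, 0))
            x1, y1 = int(min(x1 + pad, w)), int(min(y1 + pad, h))
            crop = rotated[y0:y1, x0:x1]
            return cv2.resize(crop, out_size)                  # common scale for all images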

  • Visual Recognition Method Based on Hybrid KPCA Network

    Feng YANG  Zheng MA  Mei XIE  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2020/05/28  Vol: E103-D No:9  Page(s): 2015-2018

    In this paper, we propose a deep visual recognition model based on a hybrid KPCA network (H-KPCANet), which combines a one-stage KPCANet and a two-stage KPCANet. The proposed model consists of four types of basic components: the input layer, the one-stage KPCANet, the two-stage KPCANet, and the fusion layer. The one-stage KPCANet computes KPCA filters for its convolution layer, while the two-stage KPCANet learns PCA filters in its first stage and KPCA filters in its second stage. After binary quantization mapping and block-wise histogram encoding, the features from the two types of KPCANets are fused in the fusion layer, and the final feature of the input image is obtained by a weighted serial combination of the two feature types. The proposed algorithm is tested on digit recognition and object classification, and the experimental results on the MNIST and CIFAR-10 visual recognition benchmarks validate the performance of H-KPCANet.
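
    A toy sketch of the fusion-layer idea only: two feature vectors, standing in for the one-stage and two-stage KPCANet outputs after binary quantization and block-wise histogramming (not reproduced here), are combined by a weighted serial rule, i.e. scaled and concatenated. The weight value is an assumed illustration, not the paper's setting.

        import numpy as np

        def weighted_serial_fusion(feat_one_stage, feat_two_stage, alpha=0.5):
            """Weighted serial combination: scale each branch and concatenate.
            alpha balances the one-stage and two-stage KPCANet features."""
            f1 = np.asarray(feat_one_stage, dtype=np.float64)
            f2 = np.asarray(feat_two_stage, dtype=np.float64)
            return np.concatenate([alpha * f1, (1.0 - alpha) * f2])

        # toy usage with random stand-ins for the block-wise histogram features
        rng = np.random.default_rng(0)
        fused = weighted_serial_fusion(rng.random(256), rng.random(512), alpha=0.6)
        print(fused.shape)   # (768,)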

  • Illumination Normalization for Face Recognition Using Energy Minimization Framework

    Xiaoguang TU  Feng YANG  Mei XIE  Zheng MA  

     
    LETTER-Artificial Intelligence, Data Mining
    Publicized: 2017/03/10  Vol: E100-D No:6  Page(s): 1376-1379

    Numerous methods have been developed to handle lighting variations in the pre-processing step of face recognition. However, most of them use only high-frequency information (edges, lines, corners, etc.) for recognition, since pixels in these areas have higher local variance and are thus insensitive to illumination variations. In this case, low-frequency information may be discarded and some features that are helpful for recognition may be ignored. In this paper, we present a new and efficient method for illumination normalization using an energy minimization framework. The proposed method aims to remove the illumination field of the observed face images while simultaneously preserving the intrinsic facial features. The normalized face image and the illumination field are obtained by a reciprocal iteration scheme. Experiments on the CMU-PIE and Extended Yale B databases show that the proposed method preserves good visual quality even on images with deep shadows and high-brightness regions, and yields promising illumination normalization results that improve face recognition performance.
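
    A highly simplified sketch of the reciprocal-iteration idea, not the paper's energy functional: in the log domain the image is split into a smooth illumination field and a detail-preserving normalized component, and the two estimates are refined in turn. The Gaussian smoothing operator and all parameters are assumptions for illustration.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def normalize_illumination(image, iters=10, sigma=15.0):
            """Alternately re-estimate a smooth illumination field L and the
            normalized face R, with log(I) ~ R + L (toy sketch)."""
            log_i = np.log1p(image.astype(np.float64))
            reflect = np.zeros_like(log_i)
            for _ in range(iters):
                illum = gaussian_filter(log_i - reflect, sigma)  # smooth illumination estimate
                reflect = log_i - illum                          # facial detail given illumination
            reflect -= reflect.min()                             # rescale to [0, 1] for recognition
            return reflect / (reflect.max() + 1e-12)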

  • Codebook Learning for Image Recognition Based on Parallel Key SIFT Analysis

    Feng YANG  Zheng MA  Mei XIE  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2017/01/10  Vol: E100-D No:4  Page(s): 927-930

    The quality of the codebook is very important in visual image classification. To boost classification performance, this paper presents a codebook generation scheme for scene image recognition based on parallel key SIFT analysis (PKSA). The method iteratively applies the classical k-means clustering algorithm and similarity analysis to extract key SIFT descriptors (KSDs) from the input images, and then generates the codebook from the set of KSDs with a relaxed k-means algorithm. To evaluate the performance of the PKSA scheme, the image feature vector is computed by sparse coding with spatial pyramid matching (ScSPM) after the codebook is constructed. The PKSA-based ScSPM method is tested and compared on three public scene image datasets. The experimental results show that the proposed PKSA scheme can significantly reduce computation time and improve the categorization rate.
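
    A minimal sketch of the codebook-building step only, using plain k-means from scikit-learn on a stand-in descriptor matrix; the key-descriptor selection and the relaxed k-means of PKSA are not reproduced, and the codebook size is illustrative.

        import numpy as np
        from sklearn.cluster import KMeans

        def build_codebook(descriptors, num_words=1024, seed=0):
            """Cluster local descriptors (e.g. 128-D SIFT) into visual words;
            the cluster centres form the codebook."""
            km = KMeans(n_clusters=num_words, n_init=10, random_state=seed)
            km.fit(descriptors)
            return km.cluster_centers_                  # shape (num_words, 128)

        # toy usage: random vectors stand in for SIFT descriptors from training images
        rng = np.random.default_rng(0)
        codebook = build_codebook(rng.random((5000, 128)), num_words=64)
        print(codebook.shape)                           # (64, 128)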

  • Action Recognition Using Weighted Locality-Constrained Linear Coding

    Jiangfeng YANG  Zheng MA  

     
    LETTER-Image Processing and Video Processing
    Publicized: 2014/10/31  Vol: E98-D No:2  Page(s): 462-466

    Recently, locality-constrained linear coding (LLC) has attracted much attention as a coding strategy, owing to its better reconstruction than sparse coding and vector quantization (VQ). However, LLC ignores the weight information of codewords during the coding stage and assumes that every selected basis has the same credibility, even if their weights differ. To further improve the discriminative power of the LLC code, we propose a weighted LLC (WLLC) algorithm that takes codeword weight information into account. Experiments on the KTH and UCF datasets show that the recognition system based on WLLC achieves better performance than those based on classical LLC and VQ, and outperforms recent representative systems.
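
    A compact sketch of LLC-style coding with an optional per-codeword weight folded into the regularizer, to show where weight information could enter; this is an illustration of the general idea, not the authors' WLLC formulation, and the neighbourhood size and regularization values are assumptions.

        import numpy as np

        def llc_code(x, codebook, weights=None, k=5, lam=1e-4):
            """LLC-style coding of descriptor x over its k nearest codewords via a
            small constrained least-squares problem (codes sum to 1); `weights`
            optionally strengthens the penalty on less credible codewords."""
            d2 = np.sum((codebook - x) ** 2, axis=1)
            idx = np.argsort(d2)[:k]                 # k nearest visual words
            B = codebook[idx]                        # local base, shape (k, D)
            z = B - x                                # shift codewords to the descriptor
            C = z @ z.T                              # local covariance, shape (k, k)
            if weights is None:
                penalty = np.eye(k)
            else:                                    # larger penalty for low-weight codewords
                penalty = np.diag(1.0 / np.maximum(weights[idx], 1e-8))
            c = np.linalg.solve(C + lam * np.trace(C) * penalty, np.ones(k))
            c /= c.sum()                             # shift-invariance constraint
            code = np.zeros(len(codebook))
            code[idx] = c
            return code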

  • Mixture Hyperplanes Approximation for Global Tracking

    Song GU  Zheng MA  Mei XIE  

     
    LETTER-Pattern Recognition
    Publicized: 2015/08/13  Vol: E98-D No:11  Page(s): 2008-2012

    Template tracking has been extensively studied in computer vision and has a wide range of applications. A general framework is to construct a parametric model that predicts movement and tracks the target. The intensity difference between the pixels of the current region and the pixels of the selected target allows a straightforward prediction of the region position in the current image. Traditional methods track the object under the assumption that the relationship between the intensity difference and the region position is either linear or non-linear, which leads to poor tracking performance when only one model is adopted. This paper proposes a method called Mixture Hyperplanes Approximation, based on a finite mixture of generalized linear regression models, to perform robust tracking. Moreover, a fast learning strategy is discussed, which improves the robustness against noise. Experiments demonstrate the performance and stability of Mixture Hyperplanes Approximation.
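
    A minimal sketch of the underlying hyperplane-approximation idea: a linear predictor A that maps the intensity-difference vector to a motion-parameter update is fitted by ridge-regularized least squares over synthetic template perturbations; the paper's mixture model would fit several such predictors and select among them, which is not reproduced here. All names and the random stand-in data are illustrative.

        import numpy as np

        def learn_linear_predictor(intensity_diffs, param_deltas, ridge=1e-3):
            """Fit A such that param_delta ~= A @ intensity_diff, by ridge-regularized
            least squares over many sampled perturbations of the template."""
            D = np.asarray(intensity_diffs)          # (n_samples, n_pixels)
            P = np.asarray(param_deltas)             # (n_samples, n_params)
            G = D.T @ D + ridge * np.eye(D.shape[1])
            return np.linalg.solve(G, D.T @ P).T     # (n_params, n_pixels)

        # toy usage: random data stands in for sampled template perturbations
        rng = np.random.default_rng(0)
        A = learn_linear_predictor(rng.standard_normal((500, 100)),
                                   rng.standard_normal((500, 4)))
        print(A.shape)                               # (4, 100): update = A @ intensity_diff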

  • Spatio-Temporal Self-Attention Weighted VLAD Neural Network for Action Recognition

    Shilei CHENG  Mei XIE  Zheng MA  Siqi LI  Song GU  Feng YANG  

     
    LETTER-Biocybernetics, Neurocomputing
    Publicized: 2020/10/01  Vol: E104-D No:1  Page(s): 220-224

    Characterizing videos simultaneously from spatial and temporal cues has been shown to be crucial for video processing. Because its soft assignment lacks temporal information, the vector of locally aggregated descriptors (VLAD) is a suboptimal framework for learning spatio-temporal video representations. Motivated by the development of attention mechanisms in natural language processing, we present a novel model in which VLAD follows spatio-temporal self-attention operations, named spatio-temporal self-attention weighted VLAD (ST-SAWVLAD). In particular, sequential convolutional feature maps extracted from two modalities, i.e., RGB and optical flow, are fed into the self-attention module to learn soft spatio-temporal assignment parameters, which enables aggregating not only detailed spatial information but also fine motion information from successive video frames. In experiments, we evaluate ST-SAWVLAD on the competitive action recognition datasets UCF101 and HMDB51; the results show outstanding performance. The source code is available at: https://github.com/badstones/st-sawvlad.
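
    A toy numpy sketch of attention-weighted VLAD aggregation over a set of local features: each feature's soft assignment to the cluster centres is modulated by an attention score before the residuals are accumulated. The self-attention module that would produce these scores from RGB and flow feature maps is replaced here by a given weight vector; all names and parameters are illustrative.

        import numpy as np

        def attention_weighted_vlad(features, centers, attn, alpha=10.0):
            """Aggregate features (N, D) against centers (K, D) into a VLAD vector,
            weighting each feature's soft assignment by an attention score attn (N,)."""
            d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (N, K)
            a = np.exp(-alpha * (d2 - d2.min(axis=1, keepdims=True)))          # soft assignment
            a /= a.sum(axis=1, keepdims=True)
            a *= attn[:, None]                                  # attention re-weighting
            resid = features[:, None, :] - centers[None, :, :]  # (N, K, D) residuals
            vlad = (a[:, :, None] * resid).sum(axis=0)          # (K, D)
            vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))        # power normalization
            return (vlad / (np.linalg.norm(vlad) + 1e-12)).ravel()

        # toy usage: 30 local features of dimension 64, 8 cluster centres
        rng = np.random.default_rng(0)
        v = attention_weighted_vlad(rng.random((30, 64)), rng.random((8, 64)), rng.random(30))
        print(v.shape)                                          # (512,)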

  • A Generalized Diagonal Loading Robust Wideband Beam Pattern Synthesis Method

    ChangZheng MA  BoonPoh NG  

     
    LETTER-Digital Signal Processing
    Vol: E88-A No:2  Page(s): 590-592

    Optimum wideband beam pattern synthesis methods are usually sensitive to antenna element gain, phase, and position errors. In this letter, these errors are taken into account in a constrained optimization process, and a generalized diagonal loading algorithm is obtained. Computer simulations demonstrate the robustness of the new method.
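
    A minimal numpy sketch of ordinary diagonal loading in a narrowband MVDR-style weight computation, just to show where the loading term enters the constrained optimization; the letter's generalized loading replaces the scaled identity with a matrix derived from the gain, phase, and position error statistics, which is not reproduced here. The array geometry and loading level are assumptions.

        import numpy as np

        def loaded_mvdr_weights(R, steering, loading=1e-2):
            """Robust beamformer weights with diagonal loading:
            w = (R + gamma*I)^{-1} a / (a^H (R + gamma*I)^{-1} a)."""
            gamma = loading * np.trace(R).real / len(R)      # loading relative to average power
            Rl = R + gamma * np.eye(len(R))
            w_un = np.linalg.solve(Rl, steering)
            return w_un / (steering.conj() @ w_un)           # distortionless toward `steering`

        # toy usage: 8-element half-wavelength array, sample covariance from random snapshots
        rng = np.random.default_rng(0)
        snaps = rng.standard_normal((8, 200)) + 1j * rng.standard_normal((8, 200))
        R = snaps @ snaps.conj().T / 200
        a = np.exp(1j * np.pi * np.arange(8) * np.sin(np.deg2rad(20)))
        w = loaded_mvdr_weights(R, a)
        print(np.abs(w.conj() @ a))                          # unit response toward 20 degrees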