IEICE global.ieice.org Site

Author Search Result

[Author] Gang LI(45hit)

1-20hit(45hit)

Self-Clustering Symmetry Detection
Bei HE Guijin WANG Chenbo SHI Xuanwu YIN Bo LIU Xinggang LIN

LETTER-Image Recognition, Computer Vision

Vol: E95-D No:9
Page(s): 2359-2362
This paper presents a self-clustering algorithm to detect symmetry in images. We combine correlations of orientations, scales and descriptors as a triple feature vector to evaluate each feature pair while low confidence pairs are regarded as outliers and removed. Additionally, all confident pairs are preserved to extract potential symmetries since one feature point may be shared by different pairs. Further, each feature pair forms one cluster and is merged and split iteratively based on the continuity in the Cartesian and concentration in the polar coordinates. Pseudo symmetric axes and outlier midpoints are eliminated during the process. Experiments demonstrate the robustness and accuracy of our algorithm visually and quantitatively.
DSP-Based Parallel Implementation of Speeded-Up Robust Features
Chao LIAO Guijin WANG Quan MIAO Zhiguo WANG Chenbo SHI Xinggang LIN

LETTER-Image Recognition, Computer Vision

Vol: E94-D No:4
Page(s): 930-933
Robust local image features have become crucial components of many state-of-the-art computer vision algorithms. Due to limited hardware resources, computing local features on embedded systems is not an easy task. In this paper, we propose an efficient parallel computing framework for speeded-up robust features (SURF) oriented towards multi-DSP based embedded systems. We optimize modules in SURF to better utilize the capability of DSP chips. We also design a compact data layout to adapt to the limited memory resource and to increase data access bandwidth. A data-driven barrier and workload balance schemes are presented to synchronize parallel working chips and reduce overall cost. The experiment shows that our implementation achieves competitive time efficiency compared with related works.
Measuring Particles in Joint Feature-Spatial Space
Liang SHA Guijin WANG Anbang YAO Xinggang LIN

LETTER-Vision

Vol: E92-A No:7
Page(s): 1737-1742
The particle filter has attracted increasing attention from object tracking researchers due to its ability to handle nonlinear and non-Gaussian systems. In this paper, we mainly explore the problem of precisely estimating observation likelihoods of particles in the joint feature-spatial space. For this purpose, a similarity measure based on a mixture Gaussian kernel function is presented to evaluate the discrepancy between the target region and the particle region. Such a similarity can be interpreted as the expectation of the spatially weighted feature distribution over the target region. To adapt to abrupt object motion, we also present a method to appropriately adjust the state transition model by utilizing priors on motion speed and object size. In comparison with the standard particle filter tracker, our tracking algorithm shows better performance on challenging video sequences.
Stereo Matching Using Local Plane Fitting in Confidence-Based Support Window
Chenbo SHI Guijin WANG Xiaokang PEI Bei HE Xinggang LIN

LETTER-Image Recognition, Computer Vision

Vol: E95-D No:2
Page(s): 699-702
This paper addresses stereo matching in scenes with smooth regions and obviously slanted planes. We explore the flexible handling of color disparity, spatial relation and the reliability of matching pixels in support windows. Building upon these key ingredients, a robust stereo matching algorithm using local plane fitting by Confidence-based Support Window (CSW) is presented. For each CSW, only those pixels with high confidence are employed to estimate the optimal disparity plane. Considering that RANSAC has been shown to be robust in suppressing the disturbance resulting from outliers, we employ it to solve the local plane fitting problem. Compared with state-of-the-art local methods in the computer vision community, our approach achieves better performance and time efficiency on the Middlebury benchmark.
A Driver Fatigue Detection Algorithm Based on Dynamic Tracking of Small Facial Targets Using YOLOv7
Shugang LIU Yujie WANG Qiangguo YU Jie ZHAN Hongli LIU Jiangtao LIU

PAPER-Image Recognition, Computer Vision

Publicized: 2023/08/21
Vol: E106-D No:11
Page(s): 1881-1890
Driver fatigue detection has become crucial in vehicle safety technology. Achieving high accuracy and real-time performance in detecting driver fatigue is paramount. In this paper, we propose a novel driver fatigue detection algorithm based on dynamic tracking of Facial Eyes and Yawning using YOLOv7, named FEY-YOLOv7. The Coordinate Attention module is inserted into YOLOv7 to enhance its dynamic tracking accuracy by focusing on coordinate information. Additionally, a small target detection head is incorporated into the network architecture to improve feature extraction for small facial targets such as eyes and mouth. In terms of computation, the YOLOv7 network architecture is significantly simplified to achieve high detection speed. Using the proposed PERYAWN algorithm, driver status is labeled and detected by four classes: open_eye, closed_eye, open_mouth, and closed_mouth. Furthermore, the Guided Image Filtering algorithm is employed to enhance image details. The proposed FEY-YOLOv7 is trained and validated on RGB-infrared datasets. The results show that FEY-YOLOv7 has achieved mAP of 0.983 and FPS of 101. This indicates that FEY-YOLOv7 is superior to state-of-the-art methods in accuracy and speed, providing an effective and practical solution for image-based driver fatigue detection.
A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation
Gang LIU Xin CHEN Zhixiang GAO

PAPER-Artificial Intelligence, Data Mining

Publicized: 2023/09/28
Vol: E107-D No:1
Page(s): 72-82
Photo animation transforms photos of real-world scenes into anime style images, which is a challenging task in AIGC (AI Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail for outputting coarse-grained anime style images and a main tail for refining coarse-grained anime style images. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. In order to improve the visual quality of the generated anime style images, two novel loss functions suitable for photo animation are proposed: 1) the region smoothing loss function, which is used to weaken the texture details of the generated images to achieve anime effects with abstract details; 2) the fine-grained revision loss function, which is used to eliminate artifacts and noise in the generated anime style image while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight generator framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can be easily trained end-to-end with unpaired training data. Extensive experiments have been conducted to qualitatively and quantitatively demonstrate that our method can produce high-quality anime style images from real-world photos and perform better than the state-of-the-art models.
Multiple-Shot Person Re-Identification by Pairwise Multiple Instance Learning
Chunxiao LIU Guijin WANG Xinggang LIN

LETTER-Image Recognition, Computer Vision

Vol: E96-D No:12
Page(s): 2900-2903
Learning an appearance model for person re-identification from multiple images is challenging due to corrupted images caused by occlusion or false detection. Furthermore, different persons may wear similar clothes, making appearance features less discriminative. In this paper, we first introduce the concept of multiple instance to handle corrupted images. Then a novel pairwise comparison based multiple instance learning framework is proposed to deal with visual ambiguity, by selecting robust features through pairwise comparison. We demonstrate the effectiveness of our method on two public datasets.
Deformable Part-Based Model Transfer for Object Detection
Zhiwei RUAN Guijin WANG Xinggang LIN Jing-Hao XUE Yong JIANG

LETTER-Image Recognition, Computer Vision

Vol: E97-D No:5
Page(s): 1394-1397
The transfer of prior knowledge from source domains can improve the performance of learning when the training data in a target domain are insufficient. In this paper we propose a new strategy to transfer deformable part models (DPMs) for object detection, using offline-trained auxiliary DPMs of similar categories as source models to improve the performance of the target object detector. A DPM represents an object by using a root filter and several part filters. We use these filters of the auxiliary DPMs as prior knowledge and adapt the filters to the target object. With a latent transfer learning method, appropriate local features are extracted for the transfer of part filters. Our experiments demonstrate that this strategy can lead to a detector superior to some state-of-the-art methods.
New Method to Extend the Number of Quaternary Low Correlation Zone Sequence Sets
Chengqian XU Yubo LI Kai LIU Gang LI

LETTER-Information Theory

Vol: E94-A No:9
Page(s): 1881-1885
In this correspondence, a new method to extend the number of quaternary low correlation zone (LCZ) sequence sets is presented. Based on the inverse Gray mapping and a binary sequence with ideal two-level auto-correlation function, numbers of quaternary LCZ sequence sets can be generated by choosing different parameters. There is at most one sequence cyclically equivalent in different LCZ sequence sets. The parameters of LCZ sequence sets are flexible.
The Constructions of Almost Binary Sequence Pairs and Binary Sequence Pairs with Three-Level Autocorrelation
Xiuping PENG Chengqian XU Gang LI Kai LIU Krishnasamy Thiru ARASU

LETTER-Information Theory

Vol: E94-A No:9
Page(s): 1886-1891
In this letter, a new class of almost binary sequence pairs with a single zero element and three autocorrelation values is presented. The new almost binary sequence pairs are based on cyclic difference sets and difference set pairs. By applying the method to the binary sequence pairs, new binary sequence pairs with three-level autocorrelation are constructed. It is shown that new sequence pairs from our constructions are balanced or almost balanced and have optimal three-level autocorrelation when the characteristic sequences or sequence pairs of difference sets or difference set pairs are balanced or almost balanced and have optimal autocorrelations.
A Real-Time Human Detection System for Video
Bobo ZENG Guijin WANG Xinggang LIN Chunxiao LIU

PAPER-Image Recognition, Computer Vision

Vol: E95-D No:7
Page(s): 1979-1988
This work presents a real-time human detection system for VGA (Video Graphics Array, 640×480) video, which well suits visual surveillance applications. To achieve high running speed and accuracy, firstly we design multiple fast scalar feature types on the gradient channels, and experimentally identify that NOGCF (Normalized Oriented Gradient Channel Feature) has better performance with Gentle AdaBoost in cascaded classifiers. A confidence measure for cascaded classifiers is developed and utilized in the subsequent tracking stage. Secondly, we propose to use speedup techniques including a detector pyramid for multi-scale detection and channel compression for integral channel calculation respectively. Thirdly, by integrating the detector's discrete detected humans and continuous detection confidence map, we employ a two-layer tracking by detection algorithm for further speedup and accuracy improvement. Compared with other methods, experiments show the system is significantly faster with 20 fps running speed in VGA video and has better accuracy as well.
An Iterative Factorization Method Based on Rank 1 for Projective Structure and Motion
Shigang LIU Chengke WU Li TANG Jing JIA

PAPER-Image Recognition, Computer Vision

Vol: E88-D No:9
Page(s): 2183-2188
We propose a method for the recovery of projective structure and motion by the factorization of the rank 1 matrix containing the images of all points in all views. In our method, the unknowns are the 3D motion and relative depths of the set of points, not their 3D positions. The coordinates of the points along the camera plane are given by their image positions in the first frame. The knowledge of the coordinates along the camera plane enables us to solve the SFM problem by iteratively factorizing the rank 1 matrix. This simplifies the decomposition compared with the SVD (Singular Value Decomposition). Experiments with both simulated and real data show that the method is efficient for the recovery of projective structure and motion.
Towards High-Performance Load-Balance Multicast Switch via Erasure Codes
Fuxing CHEN Li MA Weiyang LIU Dagang LI Dongcheng WU

PAPER-Fundamental Theories for Communications

Vol: E98-B No:8
Page(s): 1518-1525
Recent studies on switching fabrics mainly focus on the switching schedule algorithms, which aim at improving the throughput (a key performance metric). However, the delay (another key performance metric) of switching fabrics cannot be well guaranteed. A good switching fabric should be endowed with the properties of high throughput, delay guarantee, low component complexity and high-speed multicast, which are difficult for conventional switching fabrics to achieve. This has fueled great interest in designing a new switching fabric that can support large-scale extension and high-speed multicast. Motivated by this, we reuse the self-routing Boolean concentrator network and embed a model of multicast packet copy separation in front to construct a load-balanced multicast switching fabric (LB-MSF) with delay guarantee. The first phase of LB-MSF is responsible for balancing the incoming traffic into uniform cells while the second phase is in charge of self-routing the cells to their final destinations. In order to improve the throughput, LB-MSF is combined with the merits of erasure codes against packet loss. Experiments and analyses verify that the proposed fabric is able to achieve high-speed multicast switching and is suitable for building super large-scale switching fabrics in the Next Generation Network (NGN) with all the advantages mentioned above. Furthermore, a prototype of the proposed switch is developed on FPGA and shows excellent performance.
Real-Time Human Detection Using Hierarchical HOG Matrices
Guan PANG Guijin WANG Xinggang LIN

LETTER-Image Recognition, Computer Vision

Vol: E93-D No:3
Page(s): 658-661
Human detection has witnessed significant development in recent years. The introduction of cascade structure and integral histogram has greatly improved detection speed. But real-time detection is still only possible for sparse scan of 320×240 sized images. In this work, we propose a matrix-based structure to reorganize the computation structure of window-scanning detection algorithms, as well as a new pre-processing method called Hierarchical HOG Matrices (HHM) in place of integral histogram. Our speed-up scheme can process 320×240 sized images by dense scan (≈ 12000 windows per image) at the speed of about 30 fps, while maintaining accuracy comparable to the original HOG + cascade method.
A Framework of Real Time Hand Gesture Vision Based Human-Computer Interaction
Liang SHA Guijin WANG Xinggang LIN Kongqiao WANG

PAPER-Vision

Vol: E94-A No:3
Page(s): 979-989
This paper presents a robust framework of human-computer interaction from the hand gesture vision in the presence of realistic and challenging scenarios. To this end, several novel components are proposed. A hybrid approach is first proposed to automatically infer the beginning position of hand gestures of interest via jointly optimizing the regions given by an offline skin model trained from Gaussian mixture models and a specific hand gesture classifier trained from the Adaboost technique. To consistently track the hand in the context of using kernel based tracking, a semi-supervised feature selection strategy is further presented to choose the feature subspaces which appropriately represent the properties of offline hand skin cues and online foreground-background-classification cues. Taking the histogram of oriented gradients as the descriptor to represent hand gestures, a soft-decision approach is finally proposed for recognizing static hand gestures at the locations where severe ambiguity occurs and hidden Markov model based dynamic gestures are employed for interaction. Experiments on various real video sequences show the superior performance of the proposed components. In addition, the whole framework is applicable to real-time applications on general computing platforms.
A New Construction of Optimal LCZ Sequence Sets
Yubo LI Chengqian XU Kai LIU Gang LI Sai YU

LETTER-Communication Theory and Signals

Vol: E95-A No:9
Page(s): 1646-1650
In this correspondence, we devise a new method for constructing a ternary column sequence set of length $3^{m+1}-1$ from ternary sequences of period $3^m-1$ with ideal autocorrelation, and the ternary LCZ sequence set of period $3^n-1$ is constructed by using the column sequence set when $(m+1) \mid n$. In addition, the method is generalized to p-ary LCZ sequences. The resultant LCZ sequence sets in this paper are optimal with respect to the Tang-Fan-Matsufuji bound.
A Selective Video Encryption Scheme for MPEG Compression Standard
Gang LIU Takeshi IKENAGA Satoshi GOTO Takaaki BABA

PAPER-Application

Vol: E89-A No:1
Page(s): 194-202
With the increase of commercial multimedia applications using digital video, the security of video data becomes more and more important. Although several techniques have been proposed to protect video data, they provide limited security or introduce significant overhead. This paper proposes a video security scheme for the MPEG video compression standard, which includes two methods: DCEA (DC Coefficient Encryption Algorithm) and "Event Shuffle." DCEA aims to encrypt groups of codewords of DC coefficients. The feature of this method is the use of data permutation to scatter the ciphertexts of additional codes in DC codewords. These additional codes are encrypted beforehand with a block cipher. With the combination of these algorithms, the method provides enough security for the important DC component of MPEG video data. "Event Shuffle" aims to encrypt the AC coefficients. The prominent feature of this method is a shuffling of AC events generated after the DCT transformation and quantization stages. Experimental results show that these methods introduce no bit overhead to the MPEG bit stream while incurring low processing overhead in the MPEG codec.
Fast Generation of View-Direction-Free Perspective Display from Distorted Fisheye Image
Shigang LI Ying HAI

LETTER-Image Processing and Video Processing

Vol: E92-D No:8
Page(s): 1588-1591
This paper introduces an intermediate virtual representation, called ideal fisheye image, which obeys the ideal simple projection without camera distortion. By using a look-up-table from the ideal fisheye image to the input fisheye image with distortion, a view-direction-free perspective display can be generated fast in comparison with the method of solving a set of nonlinear equations of camera distortion parameters.
An Interleaving Updating Framework of Disparity and Confidence Map for Stereo Matching
Chenbo SHI Guijin WANG Xiaokang PEI Bei HE Xinggang LIN

LETTER-Image Recognition, Computer Vision

Vol: E95-D No:5
Page(s): 1552-1555
In this paper, we propose an interleaving updating framework of disparity and confidence map (IUFDCM) for stereo matching to eliminate the redundant and interfering information from unreliable pixels. Compared with other propagation algorithms using matching cost as messages, IUFDCM updates the disparity map and the confidence map in an interleaving manner instead. Based on the Confidence-based Support Window (CSW), the disparity map is updated adaptively to alleviate the effect of input parameters. The reassignment for unreliable pixels with larger probability keeps ground truth depending on reliable messages. Consequently, the confidence map is updated according to the previous disparity map and the left-right consistency. The top ranks on the Middlebury benchmark corresponding to different error thresholds demonstrate that our algorithm is competitive with the best stereo matching algorithms at present.
Robust Object Tracking via Combining Observation Models
Fan JIANG Guijin WANG Chang LIU Xinggang LIN Weiguo WU

LETTER-Image Recognition, Computer Vision

Vol: E93-D No:3
Page(s): 662-665
Various observation models have been introduced into the object tracking community, and combining them has become a promising direction. This paper proposes a novel approach for estimating the confidences of different observation models, and then effectively combining them in the particle filter framework. In our approach, the spatial likelihood distribution is represented by three simple but efficient parameters, reflecting the overall similarity, the distribution sharpness and the degree of multiple peaks. The balance of these three aspects leads to good estimation of confidences, which helps maintain the advantages of each observation model and further increases robustness to partial occlusion. Experiments on challenging video sequences demonstrate the effectiveness of our approach.

