M.K. JEEVARAJAN P. NIRMAL KUMAR
We present a reconfigurable deep-learning pedestrian detection system for surveillance that detects people in the presence of shadows, under varying lighting, and under heavy occlusion. This work proposes a region-based CNN combined with CMOS and thermal cameras to obtain human features even under poor lighting conditions. The main advantage of a reconfigurable system over processor-based systems is its high performance and parallelism when processing large amounts of data such as video frames. We discuss the details of the hardware implementation of the proposed real-time pedestrian detection algorithm on a Zynq FPGA. Simulation results show that the proposed integration of the R-CNN architecture with the cameras provides better performance in terms of accuracy, precision, and F1-score. The performance of the Zynq FPGA was compared with other works, which showed that the proposed architecture offers a good trade-off among quality, accuracy, speed, and resource utilization.
Pedestrian detection is a significant task in computer vision. In recent years, it has been widely used in applications such as intelligent surveillance systems and automated driving systems. Although it has been studied exhaustively over the last decade, the occlusion handling issue remains unsolved. One convincing idea is to first detect human body parts and then use the part information to estimate the presence of pedestrians. Many part-based pedestrian detection approaches have been proposed based on this idea. However, in most of these approaches, low-quality part mining and the clumsy combination of part detectors are bottlenecks that limit detection performance. To eliminate these bottlenecks, we propose the Discriminative Part CNN (DP-CNN). Our approach has two main contributions: (1) We propose a high-quality body-part mining method based on both convolutional-layer features and body-part subclasses. The mined part clusters are not only discriminative but also representative, and they help to construct powerful pedestrian detectors. (2) We propose a novel method to combine multiple part detectors. We convert the part detectors into a middle layer of a CNN and optimize the whole detection pipeline by fine-tuning that CNN. Experiments show the strong effectiveness of this optimization and the robustness of the occlusion handling.
Chen CHEN Maojun ZHANG Hanlin TAN Huaxin XIAO
Pedestrian detection is an essential but challenging task in computer vision, especially in crowded scenes with heavy intra-class occlusion. In the human visual system, head information helps to locate a pedestrian in a crowd because the head is more stable and less likely to be occluded. Inspired by this clue, we propose a dual-task detector which detects the head and the full body simultaneously. Concretely, we estimate full-body candidates from head regions using a statistical head-body ratio. A head-body alignment map is proposed to perform relational learning between bodies and heads based on their inherent correlation. We leverage the head information as a strict detection criterion to suppress common false positives of pedestrian detection via a novel pull-push loss. We validate the effectiveness of the proposed method on the CrowdHuman and CityPersons benchmarks. Experimental results demonstrate that the proposed method achieves impressive performance in detecting heavily occluded pedestrians with little additional computational cost.
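As a minimal illustration of estimating a body candidate from a head region with a fixed statistical head-body ratio, the sketch below expands a head box into a full-body box. The ratio values and the centering assumption are illustrative, not the statistics used in the paper.

```python
import numpy as np

def body_from_head(head_box, height_ratio=7.5, width_ratio=2.5):
    """Estimate a full-body candidate box from a detected head box.

    head_box: (x1, y1, x2, y2) in image coordinates.
    height_ratio / width_ratio: assumed statistical body/head size ratios
    (illustrative values, not those reported in the paper).
    """
    x1, y1, x2, y2 = head_box
    head_w, head_h = x2 - x1, y2 - y1
    cx = (x1 + x2) / 2.0                      # body assumed roughly centred under the head
    body_w = head_w * width_ratio
    body_h = head_h * height_ratio
    bx1, bx2 = cx - body_w / 2.0, cx + body_w / 2.0
    by1, by2 = y1, y1 + body_h                # body box starts at the top of the head
    return np.array([bx1, by1, bx2, by2])

if __name__ == "__main__":
    print(body_from_head((100, 50, 120, 75)))
```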
Chen CHEN Huaxin XIAO Yu LIU Maojun ZHANG
Pedestrian detection is a critical problem in computer vision with significant impact on many real-world applications. In this paper, we introduce a fast dual-task pedestrian detector with integrated segmentation context (DTISC) which predicts pedestrian locations as well as pixel-wise segmentation. The proposed network has three branches: the two main branches can complete their tasks independently, while useful representations from each task are shared between them via the integration branch. Each branch is based on a fully convolutional network and is proven effective in its own task. We optimize the detection and segmentation branches on separate ground truths. With reasonable connections, the shared features introduce additional supervision and cues into each branch. Consequently, the two branches are fused in feature space, increasing their robustness and comprehensiveness. Extensive experiments on pedestrian detection and segmentation benchmarks demonstrate that our joint model improves detection and segmentation performance over state-of-the-art algorithms.
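A minimal PyTorch sketch of the dual-branch idea follows: detection and segmentation heads on a shared backbone, with an integration path that mixes the two task-specific feature maps. The layer names, channel counts, and output shapes are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class DualTaskSketch(nn.Module):
    """Toy dual-branch model: detection and segmentation heads share features
    through a small integration branch. All channel counts are illustrative."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.det_branch = nn.Conv2d(64, 64, 3, padding=1)
        self.seg_branch = nn.Conv2d(64, 64, 3, padding=1)
        # integration branch mixes the two task-specific feature maps
        self.integrate = nn.Conv2d(128, 64, 1)
        self.det_head = nn.Conv2d(64, 5, 1)   # e.g. objectness + 4 box offsets per location
        self.seg_head = nn.Conv2d(64, 1, 1)   # pixel-wise pedestrian mask logit

    def forward(self, x):
        f = self.backbone(x)
        d = torch.relu(self.det_branch(f))
        s = torch.relu(self.seg_branch(f))
        shared = torch.relu(self.integrate(torch.cat([d, s], dim=1)))
        return self.det_head(d + shared), self.seg_head(s + shared)

if __name__ == "__main__":
    det, seg = DualTaskSketch()(torch.randn(1, 3, 128, 64))
    print(det.shape, seg.shape)
```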
Rui SUN Huihui WANG Jun ZHANG Xudong ZHANG
As a research hotspot and a difficult problem in computer vision, pedestrian detection has been widely used in intelligent driving and traffic monitoring. Popular detection methods currently use a region proposal network (RPN) to generate candidate regions and then classify those regions. However, the RPN produces many erroneous candidate regions, which increases the number of false-positive region proposals. This letter uses an improved residual attention network to capture the visual attention map of an image, which is then normalized to obtain an attention score map. The attention score map is used to guide the RPN to generate more precise candidate regions containing potential target objects. The region proposals, confidence scores, and features generated by the RPN are used to train a cascaded boosted forest classifier to obtain the final results. Experimental results show that our proposed approach achieves highly competitive results on the Caltech and ETH datasets.
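One simple way such attention guidance could be applied is to reweight proposal scores by the mean attention inside each box; the sketch below shows that variant only as an assumption, since the letter uses the score map to guide proposal generation itself rather than to rescore finished proposals.

```python
import numpy as np

def rescore_proposals(proposals, scores, attention_map):
    """Reweight proposal scores with a normalized attention score map.

    proposals: (N, 4) boxes as (x1, y1, x2, y2); attention_map: (H, W) output
    of an attention network. Mean-attention weighting is an illustrative
    assumption about how the guidance could be applied.
    """
    att = (attention_map - attention_map.min()) / (np.ptp(attention_map) + 1e-9)
    new_scores = np.empty_like(scores, dtype=np.float64)
    for i, (x1, y1, x2, y2) in enumerate(proposals.astype(int)):
        region = att[y1:y2 + 1, x1:x2 + 1]
        new_scores[i] = scores[i] * (region.mean() if region.size else 0.0)
    return new_scores

if __name__ == "__main__":
    boxes = np.array([[10, 10, 40, 80], [100, 5, 130, 60]])
    print(rescore_proposals(boxes, np.array([0.9, 0.8]), np.random.rand(120, 160)))
```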
Ming XU Xiaosheng YU Chengdong WU Dongyue CHEN
A robust pedestrian detection approach in thermal infrared imagery for all-day surveillance is proposed. First, candidate regions that are likely to contain pedestrians are extracted using a saliency detection method. Then a deep convolutional network with a multi-task loss is constructed to recognize the pedestrians. Experimental results show the superiority of the proposed approach in pedestrian detection.
Kota IWANAGA Keiji JIMI Isamu MATSUNAMI
Case studies have reported that pedestrian detection methods using vehicle radar are not complete systems, because each has specific limitations in computational cost, system complexity, or range resolution. In this letter, we propose a novel pedestrian detection method based on template matching with a Gabor filter bank, evaluated on data observed by a 24 GHz UWB radar.
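For illustration, the sketch below builds a small Gabor filter bank over a few scales and orientations, as would be convolved with the radar data for template matching. The kernel parameters are generic assumptions, not the radar-specific settings used in the letter.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2-D Gabor kernel (illustrative parameters)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

def gabor_bank(size=21, wavelengths=(4, 8), orientations=4):
    """Bank of Gabor filters over a few scales and orientations."""
    thetas = np.arange(orientations) * np.pi / orientations
    return [gabor_kernel(size, w, t, sigma=0.5 * w)
            for w in wavelengths for t in thetas]

if __name__ == "__main__":
    bank = gabor_bank()
    print(len(bank), bank[0].shape)
```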
Numerous studies have focused on improving bag of features (BOF), histogram of oriented gradients (HOG), and scale-invariant feature transform (SIFT). However, few works have attempted to learn the connection between them, even though the latter two are widely used as local feature descriptors for the former. Motivated by the resemblance between BOF and HOG/SIFT in descriptor construction, we improve the performance of HOG/SIFT by (a) interpreting HOG/SIFT as a variant of BOF in descriptor construction, and then (b) introducing recently proposed BOF techniques such as locality preservation, data-driven vocabularies, and spatial information preservation into the descriptor construction of HOG/SIFT, which yields BOF-driven HOG/SIFT. Experimental results show that BOF-driven HOG/SIFT outperforms the original descriptors in pedestrian detection (for HOG) and in scene matching and image classification (for SIFT). The proposed BOF-driven HOG/SIFT can easily replace the original HOG/SIFT in existing systems since it is a generalized version of the originals.
In this paper, an efficient method to reduce the computational complexity of pedestrian detection is presented. Since trilinear interpolation is not used, the number of operations required to compute the histogram of oriented gradients (HOG) feature is significantly reduced. By calculating multi-scale HOG features with an integral HOG in a two-stage approach, the proposed method achieves both a high detection rate and high speed.
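The integral-HOG idea can be illustrated as follows: build one integral image per orientation bin so that the orientation histogram of any rectangular cell is obtained with four look-ups per bin, regardless of cell size. The sketch below uses simplified unsigned gradients and an illustrative bin count.

```python
import numpy as np

def integral_hog(image, n_bins=9):
    """One integral image per orientation bin (simplified, unsigned gradients)."""
    gy, gx = np.gradient(image.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    integrals = np.zeros((n_bins, image.shape[0] + 1, image.shape[1] + 1))
    for b in range(n_bins):
        channel = np.where(bins == b, mag, 0.0)
        integrals[b, 1:, 1:] = channel.cumsum(0).cumsum(1)
    return integrals

def cell_histogram(integrals, y1, x1, y2, x2):
    """Histogram of any rectangle via four look-ups per bin."""
    return (integrals[:, y2, x2] - integrals[:, y1, x2]
            - integrals[:, y2, x1] + integrals[:, y1, x1])

if __name__ == "__main__":
    ii = integral_hog(np.random.rand(64, 32))
    print(cell_histogram(ii, 0, 0, 8, 8))
```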
Ahmed BOUDISSA Joo Kooi TAN Hyoungseop KIM Takashi SHINOMIYA Seiji ISHIKAWA
This paper introduces a simple algorithm for pedestrian detection in low-resolution images. The main objective is to create a successful means of real-time pedestrian detection. While the framework of the system consists of edge orientations combined with the local binary pattern (LBP) feature extractor, a novel way of selecting the threshold is introduced. Using the mean-variance statistics of the background examples, this threshold significantly improves the detection rate as well as the processing time. Furthermore, it makes the system robust to uniformly cluttered backgrounds, noise, and light variations. The test data is the INRIA pedestrian dataset, and a support vector machine with a radial basis function (RBF) kernel is used for classification. The system performs at state-of-the-art detection rates while being intuitive and very fast, which leaves sufficient processing time for further operations such as tracking and danger estimation.
Chunsheng HUA Yasushi MAKIHARA Yasushi YAGI
In this paper, we propose a pedestrian detection algorithm based on both appearance and motion features to achieve high detection accuracy in complex scenes. A pedestrian's appearance is described by a histogram of oriented spatial gradients, and his/her motion is represented by another histogram of temporal gradients computed from successive frames. Since pedestrians typically exhibit not only human shapes but also unique human movements generated by their arms and legs, the proposed algorithm is particularly powerful in discriminating a pedestrian in cluttered situations, where some background regions may appear to have human shapes but their motion differs from human movement. Unlike algorithms based on co-occurrence feature descriptors, where significant generalization errors may arise owing to the lack of extensive training samples covering feature variations, the proposed algorithm describes shape and motion as separate features. These features enable us to train a pedestrian detector in the form of a spatio-temporal histogram of oriented gradients using the AdaBoost algorithm with a relatively small training dataset, while still achieving excellent detection performance. We have confirmed the effectiveness of the proposed algorithm through experiments on several public datasets.
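A rough sketch of the shape-plus-motion idea is given below: a spatial orientation histogram from one frame is concatenated with an orientation histogram of the frame-difference gradients. The binning and normalization choices are assumptions, not the authors' exact spatio-temporal descriptor.

```python
import numpy as np

def orientation_hist(img, weight, n_bins=9):
    """Orientation histogram of img's gradients, accumulated with per-pixel weights."""
    gy, gx = np.gradient(img.astype(np.float64))
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=weight.ravel(), minlength=n_bins)

def spatial_hist(frame):
    gy, gx = np.gradient(frame.astype(np.float64))
    return orientation_hist(frame, np.hypot(gx, gy))

def temporal_hist(prev_frame, frame):
    """Orientation histogram of the gradient of the frame difference (illustrative)."""
    diff = frame.astype(np.float64) - prev_frame.astype(np.float64)
    gy, gx = np.gradient(diff)
    return orientation_hist(diff, np.hypot(gx, gy))

def st_descriptor(prev_frame, frame):
    h = np.concatenate([spatial_hist(frame), temporal_hist(prev_frame, frame)])
    return h / (np.linalg.norm(h) + 1e-9)

if __name__ == "__main__":
    f0, f1 = np.random.rand(64, 32), np.random.rand(64, 32)
    print(st_descriptor(f0, f1).shape)
```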
Jiu XU Ning JIANG Satoshi GOTO
In this paper, a novel feature named bidirectional local template patterns (B-LTP) is proposed for pedestrian detection in still images. B-LTP is a combination and modification of two features: the histogram of templates (HOT) and center-symmetric local binary patterns (CS-LBP). For each pixel, B-LTP defines four templates, each of which contains the pixel itself and two neighboring center-symmetric pixels. For each template, it then calculates information from the relationships among these three pixels and from the two directional transitions across them. Moreover, because the feature length of B-LTP is small, it consumes less memory and computational power. Experimental results on the INRIA dataset show that the speed and detection rate of the proposed B-LTP feature outperform those of other features such as the histogram of oriented gradients (HOG), HOT, and the covariance matrix (COV).
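To make the template structure concrete, the sketch below enumerates the four center-symmetric neighbor pairs of a pixel and derives a small code per three-pixel template from the two directional transitions. The two-bit coding and threshold are assumptions for illustration; the exact B-LTP coding may differ.

```python
import numpy as np

# Offsets of the four centre-symmetric neighbour pairs in an 8-neighbourhood
# (horizontal, vertical, and the two diagonals).
PAIRS = [((0, -1), (0, 1)), ((-1, 0), (1, 0)),
         ((-1, -1), (1, 1)), ((-1, 1), (1, -1))]

def template_codes(img, y, x, thresh=5):
    """Illustrative per-pixel codes for the four three-pixel templates.

    Each template consists of the centre pixel and one centre-symmetric pair.
    The two bits encode the directional transitions neighbour->centre and
    centre->neighbour; the exact coding in the paper may differ.
    """
    img = img.astype(np.int32)
    c = img[y, x]
    codes = []
    for (dy1, dx1), (dy2, dx2) in PAIRS:
        a, b = img[y + dy1, x + dx1], img[y + dy2, x + dx2]
        bit1 = int(a - c > thresh)     # transition from first neighbour to centre
        bit2 = int(c - b > thresh)     # transition from centre to second neighbour
        codes.append((bit1 << 1) | bit2)
    return codes

if __name__ == "__main__":
    patch = (np.random.rand(5, 5) * 255).astype(np.uint8)
    print(template_codes(patch, 2, 2))
```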
Ning JIANG Jiu XU Satoshi GOTO
In recent years, local-pattern-based features have attracted increasing interest in object detection and recognition systems. The local binary pattern (LBP) feature is widely used in texture classification and face detection, but its original definition is not well suited to human detection. In this paper, we propose a novel feature named gradient local binary patterns (GLBP) for human detection. In this feature, the original 256 local binary patterns are reduced to 56 uniform patterns, which are used to generate a 56-bin histogram. When accumulating the histogram, the gradient magnitude of each pixel is used as its weight, whereas in conventional LBP-based features every pixel contributes equally. Experiments performed on the INRIA dataset show that the proposed GLBP feature is more discriminative than the histogram of oriented gradients (HOG), semantic local binary patterns (S-LBP), and the histogram of templates (HOT). In our experiments the window size is fixed, which means that performance could be further improved by boosting methods. Moreover, the computation of the GLBP feature is parallelizable, which makes it easy to accelerate in hardware. These factors make the GLBP feature suitable for real-time pedestrian detection.
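A minimal sketch of the GLBP idea follows: keep only the uniform LBP codes and accumulate the histogram with the gradient magnitude as the per-pixel weight. Assuming the 56 patterns are the uniform codes excluding the all-0 and all-1 patterns (an assumption; the paper's exact selection may differ).

```python
import numpy as np

def is_uniform(pattern):
    """A code is 'uniform' if its circular bit string has at most two 0/1 transitions."""
    bits = [(pattern >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8)) <= 2

# 56 uniform patterns, assuming the all-0 and all-1 codes are excluded.
UNIFORM = [p for p in range(256) if is_uniform(p) and p not in (0, 255)]
INDEX = {p: i for i, p in enumerate(UNIFORM)}
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def glbp_histogram(img):
    """Gradient-weighted uniform-LBP histogram (illustrative sketch of GLBP)."""
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)
    weight = np.hypot(gx, gy)                    # gradient magnitude as per-pixel weight
    hist = np.zeros(len(UNIFORM))
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y, x]
            code = 0
            for k, (dy, dx) in enumerate(OFFSETS):
                code |= int(img[y + dy, x + dx] >= c) << k
            if code in INDEX:                    # non-uniform patterns are discarded
                hist[INDEX[code]] += weight[y, x]
    return hist

if __name__ == "__main__":
    print(len(UNIFORM), glbp_histogram(np.random.rand(32, 16) * 255).shape)
```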
Ryusuke MIYAMOTO Hiroki SUGANO
Pedestrian detection from visual images, which is used for driver assistance and video surveillance, is a challenging problem. Co-occurrence histograms of oriented gradients (CoHOG) is a powerful feature descriptor for pedestrian detection and achieves the highest detection accuracy, but its computational cost is too high for real-time execution on state-of-the-art processors. In this paper, to obtain an optimal parallel implementation for an NVIDIA GPU, several kinds of parallelism in CoHOG-based detection are identified and evaluated for their suitability for implementation. Experimental results show that the detection process can be performed at 16.5 fps on QVGA images on an NVIDIA Tesla C1060 with the optimized parallel implementation. Our evaluation shows that the optimal parallel implementation strategy for an NVIDIA GPU differs from that for an FPGA; we discuss the reasons and present the advantages of each device. To show the scalability and portability of the GPU implementation, the same object code is executed on other NVIDIA GPUs. The experimental results show that a GTX 570 can perform CoHOG-based pedestrian detection at 21.3 fps on QVGA images.
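For readers unfamiliar with the feature being parallelized, the sketch below computes one CoHOG-style co-occurrence histogram of quantized gradient orientations for a single pixel offset; the full descriptor concatenates many such histograms over offsets and sub-regions, which is the work the GPU implementation distributes over threads. The offset and bin count are illustrative.

```python
import numpy as np

def cohog_pair_histogram(patch, offset=(1, 2), n_bins=8):
    """Co-occurrence histogram of quantized gradient orientations for one offset.

    CoHOG concatenates such n_bins x n_bins histograms over many offsets and
    sub-regions; parameter values here are illustrative.
    """
    gy, gx = np.gradient(patch.astype(np.float64))
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    labels = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    dy, dx = offset
    h, w = labels.shape
    # Paired labels: each pixel with the pixel displaced by (dy, dx).
    a = labels[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = labels[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    hist = np.zeros((n_bins, n_bins))
    np.add.at(hist, (a.ravel(), b.ravel()), 1)
    return hist

if __name__ == "__main__":
    print(cohog_pair_histogram(np.random.rand(64, 32)).sum())
```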
In this paper, we address the pedestrian detection task in outdoor scenes. Because of the complexity of such scenes, commonly used gradient-feature-based detectors do not work well on them. We propose to use sparse 3D depth information as an additional cue for detection, in order to achieve a fast improvement in performance. Our method uses a probabilistic model to integrate image-feature-based classification with sparse depth estimation. Benefiting from the depth estimates, we map the prior distribution of actual human height onto the image and probabilistically update the image-feature-based classification result. We make two contributions in this paper: 1) a simplified graphical model that can efficiently integrate the depth cue into detection; and 2) a sparse depth estimation method that provides fast and reliable depth information. Experiments show that our method provides a promising improvement over the baseline detector with minimal additional computation time.
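The height-prior update can be illustrated as follows: back-project the detection's pixel height to a metric height using the depth estimate and a pinhole model, score it against a Gaussian prior on human height, and reweight the appearance score. The focal length and prior parameters below are assumptions, not values from the paper.

```python
import numpy as np

def height_prior_update(appearance_score, box_height_px, depth_m,
                        focal_px=800.0, mu_h=1.7, sigma_h=0.15):
    """Reweight an appearance-based detection score with a human-height prior.

    The box height in pixels is back-projected to metres using the depth
    estimate (pinhole model), then scored against a Gaussian prior on human
    height. All parameter values are illustrative assumptions.
    """
    height_m = box_height_px * depth_m / focal_px        # pinhole back-projection
    prior = np.exp(-0.5 * ((height_m - mu_h) / sigma_h) ** 2)
    return appearance_score * prior

if __name__ == "__main__":
    # A 160-pixel-tall detection at 8 m depth maps to ~1.6 m, close to the prior mean.
    print(height_prior_update(0.9, box_height_px=160, depth_m=8.0))
```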
Chang LIU Guijin WANG Chunxiao LIU Xinggang LIN
Boosting over weak classifiers is widely used in pedestrian detection. Because the number of weak classifiers is large, researchers usually sample the weak classifiers before training. This sampling makes it harder for the boosting process to reach its target. In this paper, we propose a partial-derivative-guided weak classifier mining method that can be used in conjunction with a boosting algorithm. The mining method reduces the performance degradation caused by sampling; it has the same effect as testing more weak classifiers while keeping the runtime acceptable. Experiments demonstrate that our algorithm is faster than the algorithm in [1] in both training and testing, without any decrease in performance. The proposed algorithm easily extends to other boosting algorithms that use a window-scanning style and HOG-like features.
Xue YUAN Xue-Ye WEI Yong-Duan SONG
This paper presents a pedestrian detection framework using a top-view camera. It makes two novel contributions to the pedestrian detection task: 1. a shape context method is used to estimate pedestrian directions and normalize the pedestrian regions; 2. based on the locations of the extracted head candidates, the system automatically chooses the most suitable classifier from several classifiers. The proposed methods address the particular difficulties of top-view pedestrian detection. Experiments were performed on video sequences with different illumination and crowding conditions, and the results demonstrate the efficiency of our algorithm.
Hui CAO Koichiro YAMAGUCHI Mitsuhiko OHTA Takashi NAITO Yoshiki NINOMIYA
We propose a novel representation called the Feature Interaction Descriptor (FIND), which captures high-level properties of object appearance by computing pairwise interactions of adjacent region-level features. For the pedestrian detection task, we employ localized oriented gradient histograms as region-level features and measure interactions between adjacent histogram elements with a suitable histogram-similarity function. Experimental results show that our descriptor improves significantly upon HOG and outperforms related high-level features such as GLAC and CoHOG.
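A minimal sketch of the pairwise-interaction idea is given below: for each pair of adjacent cell histograms, every element of one is combined with every element of the other. The Bhattacharyya-style square-root product used here is an illustrative choice of similarity function, not necessarily the one used in the paper.

```python
import numpy as np

def find_interaction(hist_a, hist_b):
    """Pairwise interaction between two adjacent cell histograms.

    Entry (i, j) combines element i of one histogram with element j of the
    other; the sqrt-product similarity is an illustrative assumption.
    """
    hist_a = hist_a / (hist_a.sum() + 1e-9)
    hist_b = hist_b / (hist_b.sum() + 1e-9)
    return np.sqrt(np.outer(hist_a, hist_b)).ravel()

def find_descriptor(cell_hists):
    """Concatenate interactions of horizontally and vertically adjacent cells.

    cell_hists: array of shape (rows, cols, n_bins) of per-cell orientation histograms.
    """
    rows, cols, _ = cell_hists.shape
    feats = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                feats.append(find_interaction(cell_hists[r, c], cell_hists[r, c + 1]))
            if r + 1 < rows:
                feats.append(find_interaction(cell_hists[r, c], cell_hists[r + 1, c]))
    return np.concatenate(feats)

if __name__ == "__main__":
    print(find_descriptor(np.random.rand(4, 2, 9)).shape)
```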