Yang YU Longlong LIU Ye ZHU Shixin CEN Yang LI
Pedestrian attribute recognition (PAR) aims to recognize a set of a person's semantic attributes, e.g., age and gender, and plays an important role in video surveillance. This paper proposes a multi-correlation graph convolutional network, named MCGCN, for PAR, which includes a semantic graph, a visual graph, and a synthesis graph. We construct the semantic graph from attribute features with semantic constraints. A graph convolution, based on prior knowledge of the dataset, is employed to learn the semantic correlation. 2D features are projected onto the visual graph nodes, and each node corresponds to the feature region of one attribute group. Graph convolution is then utilized to learn the regional correlation. The visual graph nodes are connected to the semantic graph nodes to form the synthesis graph. In the synthesis graph, the regional and semantic correlations are embedded into each other through inter-graph edges to guide each other's learning and to update the visual and semantic graphs, thereby constructing semantic and regional correlation. On this basis, we use a better loss weighting strategy, the suit_polyloss, to address the imbalance of pedestrian attribute datasets. Experiments on three benchmark datasets show that the proposed approach achieves state-of-the-art recognition performance compared with existing methods.
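As a rough illustration of the graph convolution step mentioned above, the following minimal sketch applies one propagation step over a small graph. It assumes the widely used symmetric-normalized rule ReLU(D^-1/2 (A + I) D^-1/2 H W); the abstract does not state which propagation rule MCGCN uses, and the function name `gcn_layer` and the toy inputs are hypothetical.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 * H * W).

    adj    : n x n adjacency matrix (list of lists), e.g. attribute co-occurrence
    feats  : n x c node feature matrix H
    weight : c x o weight matrix W (learned in practice; fixed here)
    """
    n = len(adj)
    # Add self-loops so every node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    c_in, c_out = len(weight), len(weight[0])
    # Aggregate neighbor features.
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n)) for c in range(c_in)]
           for i in range(n)]
    # Linear map followed by ReLU.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(c_in)))
             for o in range(c_out)]
            for i in range(n)]


# Two connected nodes with scalar features 1 and 3 are averaged to 2.
print(gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]]))
```

In a PAR setting the adjacency matrix would typically encode prior attribute correlations from the dataset, but how MCGCN builds it is not specified in the abstract.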
Shangdong LIU Chaojun MEI Shuai YOU Xiaoliang YAO Fei WU Yimu JI
The thermal imaging pedestrian segmentation system performs well under different illumination conditions, but it has some drawbacks (e.g., weak pedestrian texture information and blurred object boundaries). Meanwhile, high-performance large models have higher latency on edge devices with limited computing performance. To solve these problems, we propose a real-time thermal infrared pedestrian segmentation method. The feature extraction layers of our method consist of two paths. First, on the spatial path, we use lossless spatial downsampling to obtain boundary texture details. On the context path, we use atrous convolutions to enlarge the receptive field and obtain more contextual semantic information. Then, a parameter-free attention mechanism is introduced at the end of each path for effective feature selection. The Feature Fusion Module (FFM) is added to fuse the semantic information of the two paths after selection. Finally, we accelerate inference through multi-threading techniques on the edge computing device. In addition, we create a high-quality infrared pedestrian segmentation dataset to facilitate research. Comparative experiments with other methods on the self-built dataset and two public datasets show that our method is effective. Our code is available at https://github.com/mcjcs001/LEIPNet.
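To make concrete why atrous (dilated) convolutions enlarge the receptive field without extra parameters, the sketch below computes the effective kernel size and the receptive field of a stride-1 stack. The dilation rates (1, 2, 4) are illustrative assumptions and are not taken from the paper.

```python
def effective_kernel(kernel, dilation):
    """Span of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs, applied in order.
    Each layer adds (k - 1) * d pixels to the receptive field.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += (kernel - 1) * dilation
    return rf


# Three 3x3 layers: plain vs. dilated with hypothetical rates 1, 2, 4.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # plain stack
print(receptive_field([(3, 1), (3, 2), (3, 4)]))  # dilated stack
```

Both stacks have the same number of weights, but the dilated stack covers a 15-pixel window instead of 7.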
M.K. JEEVARAJAN P. NIRMAL KUMAR
We present a reconfigurable deep learning pedestrian detection system for surveillance that detects people with shadows under different lighting and heavily occluded conditions. This work proposes a region-based CNN combined with CMOS and thermal cameras to obtain human features even under poor lighting conditions. The main advantage of a reconfigurable system over processor-based systems is its high performance and parallelism when processing large amounts of data such as video frames. We discuss the details of the hardware implementation of the proposed real-time pedestrian detection algorithm on a Zynq FPGA. Simulation results show that the proposed integrated approach of the R-CNN architecture with cameras provides better performance in terms of accuracy, precision, and F1-score. The performance of the Zynq FPGA implementation was compared with other works, showing that the proposed architecture is a good trade-off in terms of quality, accuracy, speed, and resource utilization.
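For reference, the evaluation metrics named above (accuracy, precision, and F1-score) follow from the detection counts as in this short sketch. The counts used in the example are made up and are not results from the paper.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical counts for illustration only.
print(detection_metrics(tp=8, fp=2, fn=2, tn=8))
```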
Pedestrian detection is a significant task in computer vision. In recent years, it has been widely used in applications such as intelligent surveillance systems and automated driving systems. Although it has been studied exhaustively over the last decade, occlusion handling remains unsolved. One convincing idea is to first detect human body parts and then use the part information to estimate the presence of pedestrians. Many parts-based pedestrian detection approaches have been proposed based on this idea. However, in most of these approaches, low-quality part mining and clumsy part detector combination are bottlenecks that limit the detection performance. To eliminate these bottlenecks, we propose Discriminative Part CNN (DP-CNN). Our approach has two main contributions: (1) We propose a high-quality body part mining method based on both convolutional layer features and body part subclasses. The mined part clusters are not only discriminative but also representative, and can help to construct powerful pedestrian detectors. (2) We propose a novel method to combine multiple part detectors. We convert the part detectors to a middle layer of a CNN and optimize the whole detection pipeline by fine-tuning that CNN. Experiments show that the optimization is highly effective and that the approach handles occlusion robustly.
Chen CHEN Maojun ZHANG Hanlin TAN Huaxin XIAO
Pedestrian detection is an essential but challenging task in computer vision, especially in crowded scenes, due to heavy intra-class occlusion. In the human visual system, head information can be used to locate a pedestrian in a crowd because the head is more stable and less likely to be occluded. Inspired by this clue, we propose a dual-task detector that detects the head and the human body simultaneously. Concretely, we estimate human body candidates from head regions using a statistical head-body ratio. A head-body alignment map is proposed to perform relational learning between human bodies and heads based on their inherent correlation. We leverage the head information as a strict detection criterion to suppress common false positives of pedestrian detection via a novel pull-push loss. We validate the effectiveness of the proposed method on the CrowdHuman and CityPersons benchmarks. Experimental results demonstrate that the proposed method achieves impressive performance in detecting heavily occluded pedestrians with little additional computation cost.
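The step of estimating body candidates from head regions with a statistical head-body ratio can be sketched as below. The ratio values (body width 3 times and body height 7 times the head size) and the alignment rule (body centered under the head, sharing the head's top edge) are hypothetical placeholders; the abstract does not give the statistics actually used.

```python
def body_from_head(head_box, width_ratio=3.0, height_ratio=7.0):
    """Estimate a body box (x1, y1, x2, y2) from a head box.

    width_ratio and height_ratio are assumed values standing in for
    statistics that would be measured on a training set.
    """
    x1, y1, x2, y2 = head_box
    head_w, head_h = x2 - x1, y2 - y1
    center_x = (x1 + x2) / 2
    body_w = head_w * width_ratio
    body_h = head_h * height_ratio
    # Body is horizontally centered on the head and starts at the head top.
    return (center_x - body_w / 2, y1, center_x + body_w / 2, y1 + body_h)


print(body_from_head((10, 0, 20, 10)))
```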
This paper describes designing a new pedestrian navigation system using a comprehensive framework called the pedestrian navigation concept reference model (PNCRM). We implement this system as a publicly-available smartphone application and evaluate its positioning performance near Omiya station's western entrance. We also evaluate users' subjective impressions of the system using a questionnaire. In both cases, promising results are obtained, showing that the PNCRM can be used as a tool for designing pedestrian navigation systems, allowing such systems to be created systematically.
Chen CHEN Huaxin XIAO Yu LIU Maojun ZHANG
Pedestrian detection is a critical problem in computer vision with significant impact on many real-world applications. In this paper, we introduce a fast dual-task pedestrian detector with integrated segmentation context (DTISC), which predicts the pedestrian location as well as its pixel-wise segmentation. The proposed network has three branches: the two main branches can complete their tasks independently, while useful representations from each task are shared between them via the integration branch. Each branch is based on a fully convolutional network and is proven effective in its own task. We optimize the detection and segmentation branches on separate ground truths. With reasonable connections, the shared features introduce additional supervision and clues into each branch. Consequently, the two branches are fused in feature space, which increases their robustness and comprehensiveness. Extensive experiments on pedestrian detection and segmentation benchmarks demonstrate that our joint model improves the performance of detection and segmentation compared with state-of-the-art algorithms.
Mahmud Dwi SULISTIYO Yasutomo KAWANISHI Daisuke DEGUCHI Ichiro IDE Takatsugu HIRAYAMA Jiang-Yu ZHENG Hiroshi MURASE
Numerous applications such as autonomous driving, satellite imagery sensing, and biomedical imaging use computer vision as an important tool for perception tasks. For Intelligent Transportation Systems (ITS), scenes in sensor data must be recognized and located precisely. Semantic segmentation is one of the computer vision methods intended to perform such tasks. However, existing semantic segmentation tasks label each pixel with only a single object class. Recognizing object attributes, e.g., pedestrian orientation, is more informative and helps toward a better scene understanding. Thus, we propose a method to perform semantic segmentation and pedestrian attribute recognition simultaneously. We introduce an attribute-aware loss function that can be applied to an arbitrary base model. Furthermore, a re-annotation of the existing Cityscapes dataset enriches the ground-truth labels by annotating the attributes of pedestrian orientation. We implement the proposed method and compare the experimental results with those of other methods. The attribute-aware semantic segmentation outperforms baseline methods both in the traditional object segmentation task and in the expanded attribute detection task.
Rui SUN Huihui WANG Jun ZHANG Xudong ZHANG
As both a research hotspot and a difficult problem in computer vision, pedestrian detection has been widely used in intelligent driving and traffic monitoring. The currently popular detection method uses a region proposal network (RPN) to generate candidate regions and then classifies these regions. However, the RPN produces many erroneous candidate areas, which increases the number of false-positive region proposals. This letter uses an improved residual attention network to capture the visual attention map of an image, which is then normalized to obtain the attention score map. The attention score map is used to guide the RPN to generate more precise candidate regions containing potential target objects. The region proposals, confidence scores, and features generated by the RPN are used to train a cascaded boosted forest classifier to obtain the final results. The experimental results show that our proposed approach achieves highly competitive results on the Caltech and ETH datasets.
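The normalization of an attention map into a score map, and its use to rate a candidate region, might look like the following minimal sketch. It assumes min-max normalization to [0, 1] and a mean-over-box score; the letter's actual normalization and the way the score map guides the RPN are not described in the abstract, and both function names are hypothetical.

```python
def normalize_map(attention):
    """Min-max normalize a 2D attention map (list of rows) to [0, 1]."""
    values = [v for row in attention for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # Flat map: no region is more salient than another.
        return [[0.0 for _ in row] for row in attention]
    return [[(v - lo) / (hi - lo) for v in row] for row in attention]


def box_attention(score_map, box):
    """Mean attention score inside box = (x1, y1, x2, y2), end-exclusive."""
    x1, y1, x2, y2 = box
    vals = [score_map[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(vals) / len(vals)


scores = normalize_map([[0, 2], [4, 8]])
print(scores)
print(box_attention(scores, (1, 0, 2, 2)))  # right-hand column
```

A candidate region with a low mean score could then be down-weighted or discarded before classification, which is one plausible way such a map could reduce false-positive proposals.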
You-Sun WON Dongseung SHIN Miryong PARK Sohee JUNG Jaeho LEE Cheolhyo LEE Yunjeong SONG
This paper reports a 24GHz ISM band radar module for pedestrian detection in crosswalks. The radar module is composed of an RF transceiver board, a baseband board, and a microcontroller unit board. The radar signal is a sawtooth frequency-modulated continuous-wave signal with a center frequency of 24.15GHz, a bandwidth of 200MHz, a chirp length of 80µs, and a pulse repetition interval of 320µs. The radar module can detect a pedestrian on a crosswalk with a width of 4m and a length of 14m. Through radar signal processing, the module outputs the range, angle, and speed of the detected pedestrians every 50ms, and it consumes 7.57W from a 12V power supply. The size of the radar module is 110×70mm².
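The waveform parameters above imply a few textbook FMCW quantities, worked out in the sketch below. These are standard relations applied to the stated numbers, not figures reported by the paper; the maximum unambiguous speed in particular assumes chirp-to-chirp Doppler processing with the 320µs repetition interval as the sampling period.

```python
C = 299_792_458.0  # speed of light in m/s

center_freq = 24.15e9   # Hz
bandwidth = 200e6       # Hz
chirp_length = 80e-6    # s
pri = 320e-6            # s, pulse repetition interval
max_range = 14.0        # m, crosswalk length

# Range resolution of an FMCW radar: c / (2B).
range_resolution = C / (2 * bandwidth)

# Wavelength and maximum unambiguous radial speed: lambda / (4 * PRI).
wavelength = C / center_freq
max_speed = wavelength / (4 * pri)

# Chirp slope and the beat frequency of a target at the far end of the crosswalk.
slope = bandwidth / chirp_length
beat_freq_at_max_range = 2 * max_range * slope / C

print(f"range resolution : {range_resolution:.3f} m")
print(f"max unambig speed: {max_speed:.2f} m/s")
print(f"beat freq @ 14 m : {beat_freq_at_max_range / 1e3:.1f} kHz")
```

Under these assumptions the range resolution is about 0.75 m and the unambiguous speed limit is about 9.7 m/s, which comfortably covers walking and running pedestrians.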
Yuyang HUANG Li-Ta HSU Yanlei GU Shunsuke KAMIJO
Accurate pedestrian navigation remains a challenge in urban environments. GNSS receivers perform poorly because of the reflection and blockage of GNSS signals by buildings and other obstacles. Integrating GNSS positioning with Pedestrian Dead Reckoning (PDR) could provide a smoother navigation trajectory. However, the integrated system cannot deliver satisfactory performance if the GNSS positioning has a large error, which often happens in urban scenarios. This paper focuses on improving the accuracy of pedestrian navigation in urban environments using a proposed altitude map aided GNSS positioning method. First, we use a consistency check algorithm, similar to receiver autonomous integrity monitoring (RAIM) fault detection, to distinguish healthy measurements from multipath-contaminated ones. Afterwards, the erroneous signals are corrected with the help of an altitude map. We call the proposed method altitude map aided GNSS. After correcting the erroneous satellite signals, the mean positioning error could be reduced from 17 meters to 12 meters. Good performance of the integrated system usually requires an accurately calculated GNSS accuracy value. However, the conventional GNSS accuracy calculation is not reliable in urban canyons. In this paper, the altitude map is also used to calculate the GNSS localization accuracy in order to indicate the reliability of the estimated position solution. The altitude map aided GNSS position and its accuracy are then integrated with the PDR system to provide more accurate and continuous positioning results. With the help of the proposed GNSS accuracy, the integrated system could achieve a horizontal positioning accuracy of 6.5 meters in urban environments.
Yuki IMAEDA Takatsugu HIRAYAMA Yasutomo KAWANISHI Daisuke DEGUCHI Ichiro IDE Hiroshi MURASE
We propose a method for estimating pedestrian detectability that considers the driver's visual adaptation to drastic illumination changes, which has not been studied in previous works. We assume that the driver's visual characteristics change in proportion to the time elapsed after an illumination change. As a solution, we construct multiple estimators corresponding to different elapsed periods and estimate the detectability by switching among them according to the elapsed period. To evaluate the proposed method, we construct an experimental setup that presents a participant with illumination changes, and we conduct a preliminary simulated experiment to measure and estimate the pedestrian detectability according to the elapsed period. Results show that the proposed method can estimate the detectability accurately after a drastic illumination change.
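Switching among estimators according to the elapsed period can be sketched as a simple lookup over period boundaries. The boundaries (1 s and 3 s) and the three placeholder estimators below are hypothetical; in the paper each estimator would be trained on data from its own elapsed period.

```python
import bisect


def select_estimator(elapsed_s, boundaries, estimators):
    """Pick the estimator whose elapsed period contains elapsed_s.

    boundaries : sorted period boundaries in seconds, length n - 1
    estimators : n callables, one per period
    """
    index = bisect.bisect_right(boundaries, elapsed_s)
    return estimators[index]


# Placeholder estimators returning a constant detectability score.
def early(features):
    return 0.2


def middle(features):
    return 0.5


def late(features):
    return 0.8


boundaries = [1.0, 3.0]  # assumed period boundaries in seconds
estimators = [early, middle, late]

for t in (0.5, 1.0, 5.0):
    chosen = select_estimator(t, boundaries, estimators)
    print(t, chosen.__name__, chosen(features=None))
```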
Hiroshi FUKUI Takayoshi YAMASHITA Yuji YAMAUCHI Hironobu FUJIYOSHI Hiroshi MURASE
Pedestrian attribute information is an important function for an advanced driver assistance system (ADAS). Pedestrian attributes such as body pose, face orientation, and an open umbrella indicate the intended action or state of the pedestrian. Generally, this information is recognized using an independent classifier for each task. Performing all of these separate tasks is too time-consuming at the testing stage, and the processing time increases with the number of tasks. To address this problem, multi-task learning or heterogeneous learning is performed to train a single classifier to perform multiple tasks. In particular, heterogeneous learning can simultaneously train a classifier to perform regression and recognition tasks, which reduces both training and testing time. However, heterogeneous learning tends to result in a lower accuracy rate for classes with few training samples. In this paper, we propose a method to improve the performance of heterogeneous learning for such classes. We introduce a rarity rate based on the importance and class probability of each task. The appropriate rarity rate is assigned to each training sample. The samples in a mini-batch for training a deep convolutional neural network are then augmented according to this rarity rate to focus on the classes with few samples. Our heterogeneous learning approach with the rarity rate performs pedestrian attribute recognition better, especially for classes with few training samples.
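The idea of augmenting rare-class samples more heavily can be illustrated with the toy sketch below. It is not the paper's formula: here the rarity rate of a sample is simply assumed to be one minus its class frequency in a single task, task importance is ignored, and `max_extra` is a hypothetical cap on augmented copies.

```python
from collections import Counter


def rarity_rates(labels):
    """Assumed rarity rate per class: 1 - class frequency (single task)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: 1.0 - n / total for cls, n in counts.items()}


def augmentation_counts(labels, max_extra=3):
    """Number of extra augmented copies per sample, scaled by rarity."""
    rates = rarity_rates(labels)
    return [round(max_extra * rates[cls]) for cls in labels]


labels = ["umbrella_closed"] * 3 + ["umbrella_open"]
print(rarity_rates(labels))
print(augmentation_counts(labels))
```

With three common samples and one rare sample, the rare one receives more augmented copies, so a mini-batch built this way is less dominated by the frequent class.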
Ming XU Xiaosheng YU Chengdong WU Dongyue CHEN
A robust pedestrian detection approach for thermal infrared imagery is proposed for all-day surveillance. First, candidate regions that are likely to contain pedestrians are extracted using a saliency detection method. Then, a deep convolutional network with a multi-task loss is constructed to recognize the pedestrians. The experimental results show the superiority of the proposed approach in pedestrian detection.
Yohei NAKAZAWA Hideo MAKINO Kentaro NISHIMORI Daisuke WAKATSUKI Makoto KOBAYASHI Hideki KOMAGATA
In this paper, we propose a precise indoor localization method using visible light communication (VLC) with dual-facing cameras on a smart device (mobile phone, smartphone, or tablet device). This approach can assist the visually impaired with navigation, or provide mobile-robot control. The proposed method is different from conventional techniques in that dual-facing cameras are used to expand the localization area. The smart device is used as the receiver, and light-emitting diodes on the ceiling are used as localization landmarks. These are identified by VLC using a rolling shutter effect of complementary metal-oxide semiconductor image sensors. The front-facing camera captures the direct incident light of the landmarks, while the rear-facing camera captures mirror images of landmarks reflected from the floor face. We formulated the relationship between the poses (position and attitude) of the two cameras and the arrangement of landmarks using tilt detection by the smart device accelerometer. The equations can be analytically solved with a constant processing time, unlike conventional numerical methods, such as least-squares. We conducted a simulation and confirmed that the localization area was 75.6% using the dual-facing cameras, which was 3.8 times larger than that using only the front-facing camera. As a result of the experiment using two landmarks and a tablet device, the localization error in the horizontal direction was less than 98 mm at 90% of the measurement points. Moreover, the error estimation index can be used for appropriate route selection for pedestrians.
Siya BAO Tomoyuki NITTA Masao YANAGISAWA Nozomu TOGAWA
In this paper, we propose a safe and comprehensive route finding algorithm for pedestrians based on lighting and landmark conditions. Safety and comprehensiveness can be predicted by five possible indicators: (1) lighting conditions, (2) landmark visibility, (3) landmark effectiveness, (4) turning counts along a route, and (5) road widths. We first investigate the impact of these five indicators on pedestrians' perceptions of safety and comprehensiveness during route finding. We then propose a route finding algorithm for pedestrians. In the algorithm, we design a score based on indicators (1), (2), (3), and (5) above and also introduce a turning count reduction strategy for indicator (4). Together, these allow us to find a safe and comprehensive route. In particular, we design the daytime score and the nighttime score differently and find an appropriate route depending on the time period. Experimental simulation results demonstrate that the proposed algorithm obtains higher scores than several existing algorithms. We also demonstrate, in accordance with questionnaire results, that the proposed algorithm is able to find safe and comprehensive routes for pedestrians in real environments.
Weiwei XING Shibo ZHAO Shunli ZHANG Yuanyuan CAI
Crowd modeling and simulation is an active research field that has drawn increasing attention from industry, academia and government recently. In this paper, we present a generic data-driven approach to generate crowd behaviors that can match the video data. The proposed approach is a bi-layer model to simulate crowd behaviors in pedestrian traffic in terms of exclusion statistics, parallel dynamics and social psychology. The bottom layer models the microscopic collision avoidance behaviors, while the top one focuses on the macroscopic pedestrian behaviors. To validate its effectiveness, the approach is applied to generate collective behaviors and re-create scenarios in the Informatics Forum, the main building of the School of Informatics at the University of Edinburgh. The simulation results demonstrate that the proposed approach is able to generate desirable crowd behaviors and offer promising prediction performance.
Tetsuya MANABE Takaaki HASEGAWA
In this paper, the differences in navigation information design, which is important for kiosk-type pedestrian navigation systems, were experimentally examined depending on the presence or absence of carriable navigation information, in order to acquire knowledge that contributes to design guidelines for kiosk-type pedestrian navigation systems. In particular, we used route complexity information calculated using a regression equation that contained multiple factors. In the absence of carriable navigation information, both the destination arrival rate and the route deviation rate improved. Easy routes were designed as M (17 to 39 characters in Japanese), while complicated routes were denoted as L (40 or more characters in Japanese). In contrast, in the presence of carriable navigation information, the user's memory load was found to be reduced by carrying the same navigation information as that shown on kiosk-type terminals. Thus, the design of kiosk-type pedestrian navigation systems, e.g., the means of presenting navigation information, needs to be reconsidered. For example, if the system attaches importance to a high destination arrival rate, L_Carrying is better regardless of route complexity. If the system attaches importance to a low route deviation rate, M_Carrying is better for easy routes and L_Carrying is better for complicated routes. Consequently, this paper presents the differences in the designs of pedestrian navigation systems depending on whether carriable navigation information is absent or present.
Tomotaka WADA Go NAKAGAMI Susumu KAWAI
We have developed the Pedestrian-Vehicular Collision Avoidance Support System (P-VCASS) to protect pedestrians from traffic accidents, and its effectiveness has been verified. P-VCASS is a system that takes into account the pedestrian's moving situation. It warns drivers of neighboring vehicles in advance if there is a possibility of a collision between vehicles and pedestrians. Some pedestrians move around, and they are dangerous for vehicle drivers because they have a high probability of suddenly running out into the road. Hence, we need to take their presence into account. In this paper, we propose a new method for estimating whether a pedestrian will run out into the road by using a pressure sensor and the pedestrian's movement record. We show the validity of the proposed system through experiments using a vehicle and a pedestrian terminal at an intersection. As a result, we show that a vehicle driver is able to detect dangerous pedestrians quickly and accurately.
Kota IWANAGA Keiji JIMI Isamu MATSUNAMI
Case studies have reported that pedestrian detection methods using vehicle radar are not complete systems because each has specific limitations in computational cost, system complexity, or range resolution. In this letter, we propose a novel pedestrian detection method based on template matching with a Gabor filter bank, which was evaluated using data observed by a 24GHz UWB radar.
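A Gabor filter bank of the kind mentioned above can be built as in the following minimal sketch, which generates real-valued 2D Gabor kernels at several orientations. The kernel size, sigma, wavelength, aspect ratio, and number of orientations are illustrative assumptions; the letter's actual filter parameters and how the bank is matched against radar data are not given in the abstract.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter on a size x size grid (size must be odd)."""
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr)
                                / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size, sigma, wavelength, n_orientations):
    """Kernels at n_orientations evenly spaced angles in [0, pi)."""
    return [gabor_kernel(size, sigma, math.pi * k / n_orientations, wavelength)
            for k in range(n_orientations)]


bank = gabor_bank(size=5, sigma=2.0, wavelength=4.0, n_orientations=4)
print(len(bank), "kernels of size", len(bank[0]), "x", len(bank[0][0]))
```

Each kernel would then be correlated with the input (for example, a range-time map) to produce an orientation-specific response that can be compared against stored templates.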