Abraham MONRROY CANO Eijiro TAKEUCHI Shinpei KATO Masato EDAHIRO
We present an accurate and easy-to-use multi-sensor fusion toolbox for autonomous vehicles. It includes 'target-less' multi-LiDAR (Light Detection and Ranging) and camera-LiDAR calibration, sensor fusion, and a fast and accurate point cloud ground classifier. Our calibration methods do not require complex setup procedures, and once the sensors are calibrated, our framework eases the fusion of multiple point clouds and cameras. In addition, we present an original real-time ground-obstacle classifier that runs on the CPU and is designed to work with any type and number of LiDARs. Evaluation results on the KITTI dataset confirm that our calibration method achieves accuracy comparable with other state-of-the-art contenders on the benchmark.
Wei LI Yi WU Chunlin SHEN Huajun GONG
We present a system that improves the robustness of real-time 3D surface reconstruction by utilizing a non-inertial localization sensor. Benefiting from such a sensor, our easy-to-build system can effectively avoid tracking drift and tracking loss compared with conventional dense tracking and mapping systems. To best fuse the sensor data, we first perform a hand-eye calibration and performance analysis for our setup, and then propose a novel optimization framework based on an adaptive criterion function to improve robustness as well as accuracy. We apply our system to several challenging reconstruction tasks, which show significant improvement in scanning robustness and reconstruction quality.
Motofumi NAKANISHI Shintaro IZUMI Mio TSUKAHARA Hiroshi KAWAGUCHI Hiromitsu KIMURA Kyoji MARUMOTO Takaaki FUCHIKAMI Yoshikazu FUJIMORI Masahiko YOSHIMOTO
This paper presents an algorithm for physical activity (PA) classification and metabolic equivalents (METs) monitoring, and its System-on-a-Chip (SoC) implementation, which realizes both low power consumption and high estimation accuracy. Long-term PA monitoring is an effective means of preventing lifestyle-related diseases. Low power consumption and long battery life are key features supporting the wider dissemination of monitoring systems. As described herein, an adaptive sampling method is implemented for longer battery life by minimizing the active rate of acceleration sensing without decreasing accuracy. Furthermore, advanced PA classification using both heart rate and acceleration is introduced. The proposed algorithms are evaluated by experiments with eight subjects in actual conditions. Evaluation results show that the root mean square error with respect to the result obtained with a fixed sampling rate is less than 0.22 METs, and the mean absolute error is less than 0.06 METs. Furthermore, to minimize system-level power dissipation, a dedicated SoC is implemented in a 130-nm CMOS process with FeRAM. A non-volatile CPU using non-volatile memory and flip-flops is used to reduce stand-by power. The proposed algorithm, implemented in dedicated hardware, reduces the active rate of the CPU and the accelerometer. The current consumption of the SoC is less than 3 µA, and the evaluation system using the test chip achieves a 74% system-level power reduction. The total current consumption, including that of the accelerometer, is 11.3 µA on average.
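The adaptive-sampling idea, lowering the sensor's active rate while the signal is steady without losing METs accuracy, can be illustrated with a small sketch. All thresholds, window sizes, and rates below are illustrative assumptions, not the paper's parameters.

```python
# Illustrative adaptive-sampling controller: the sampling interval is
# stretched while the recent acceleration variance stays low (steady
# activity) and reset to the fast rate once the signal becomes dynamic.
# Thresholds and intervals are hypothetical, not taken from the paper.

def next_sampling_interval(recent_samples, current, base=0.02,
                           max_interval=0.32, var_threshold=0.05):
    """Return the interval (seconds) until the next accelerometer read."""
    n = len(recent_samples)
    mean = sum(recent_samples) / n
    variance = sum((x - mean) ** 2 for x in recent_samples) / n
    if variance < var_threshold:
        # Steady signal: halve the active rate (double the interval),
        # capped so the device never sleeps too long.
        return min(current * 2, max_interval)
    return base  # Dynamic signal: return to the full rate.

steady = [1.00, 1.01, 0.99, 1.00]   # roughly 1 g, at rest
active = [0.2, 1.8, 0.6, 1.4]       # walking / running
print(next_sampling_interval(steady, current=0.02))  # interval doubles
print(next_sampling_interval(active, current=0.32))  # back to base rate
```

In the paper's system this decision runs in dedicated hardware rather than software, which is what lets the CPU and accelerometer stay mostly inactive.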
Vijay JOHN Qian LONG Yuquan XU Zheng LIU Seiichi MITA
Environment perception is an important task for intelligent vehicle applications. Typically, multiple sensors with different characteristics are employed to perceive the environment. To perceive the environment robustly, the information from the different sensors is often integrated or fused. In this article, we propose to perform sensor fusion and registration of the LIDAR and stereo camera using the particle swarm optimization algorithm, without the aid of any external calibration objects. The proposed algorithm automatically calibrates the sensors and registers the LIDAR range image with the stereo depth image. The registered LIDAR range image functions as the disparity map for stereo disparity estimation, resulting in an effective sensor fusion mechanism. Additionally, we denoise the input image with a modified non-local means filter during stereo disparity estimation to improve robustness, especially at nighttime. To evaluate the proposed algorithm, the calibration and registration algorithm is compared with baseline algorithms on multiple datasets acquired under varying illumination. We show that our proposed algorithm achieves better accuracy than the baselines. We also demonstrate that integrating the LIDAR range image into the stereo disparity estimation yields an improved disparity map with a significant reduction in computational complexity.
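The abstract does not give implementation details, but targetless calibration by particle swarm optimization amounts to searching the extrinsic-parameter space for the transform that best aligns the LIDAR measurements with the stereo depth. The sketch below is a minimal global-best PSO applied to a toy 2-D translation cost; the cost function, point sets, and swarm parameters are illustrative stand-ins, not the authors' formulation.

```python
import random

def pso(cost, dim, n_particles=30, iters=100, bounds=(-5.0, 5.0), seed=0):
    """Minimal particle swarm optimizer (global-best variant)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

# Hypothetical alignment cost: squared error between LIDAR points shifted
# by a candidate (tx, ty) offset and the corresponding stereo points.
lidar = [(1.0, 2.0), (3.0, 1.0), (2.0, 4.0)]
stereo = [(1.5, 2.3), (3.5, 1.3), (2.5, 4.3)]  # same points offset by (0.5, 0.3)

def alignment_cost(params):
    tx, ty = params
    return sum((lx + tx - sx) ** 2 + (ly + ty - sy) ** 2
               for (lx, ly), (sx, sy) in zip(lidar, stereo))

offset, err = pso(alignment_cost, dim=2)
print(offset)  # close to [0.5, 0.3]
```

In the real system the parameter vector would be a full 6-DOF extrinsic transform and the cost would compare the projected LIDAR range image with the stereo depth image, but the search loop has the same shape.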
This paper presents a method to accelerate target recognition processing in advanced driver assistance systems (ADAS). The histogram of oriented gradients (HOG) is an effective descriptor for object recognition in computer vision and image processing, and is expected to replace conventional descriptors, e.g., template matching, in ADAS. However, the HOG fails to capture the gradient orientations of an object when the localized portion of the image, i.e., the region of interest (ROI), is not set precisely. The size and position of the ROI must be set precisely for each frame in an automotive environment, where the target distance changes dynamically. We use radar to determine the size and position of the HOG's ROI and propose a radar-camera sensor fusion algorithm. Experimental results are discussed.
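The core idea, letting the radar fix the HOG window, reduces to projecting a known physical target size through the camera model at the radar-measured distance. A minimal pinhole-model sketch; the focal length and the nominal vehicle dimensions are hypothetical values, not the paper's calibration:

```python
def roi_from_radar(distance_m, cx, cy,
                   focal_px=800.0,                    # hypothetical focal length (pixels)
                   target_w_m=1.8, target_h_m=1.5):   # nominal car width/height (m)
    """Size a HOG region of interest from a radar range measurement.

    Pinhole projection: an object of width W at distance Z spans
    f * W / Z pixels.  (cx, cy) is the target's image position, which
    in the fused system would come from projecting the radar return
    into the camera frame.
    """
    w = focal_px * target_w_m / distance_m
    h = focal_px * target_h_m / distance_m
    return (cx - w / 2, cy - h / 2, w, h)  # (left, top, width, height)

# A vehicle 20 m ahead gets a 72 x 60 pixel window under these parameters.
print(roi_from_radar(20.0, cx=640, cy=360))
```

Because the window shrinks automatically as the radar range grows, the HOG descriptor is computed once per frame at the right scale instead of being scanned over many candidate scales, which is where the acceleration comes from.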
Processing structures used in sensing are designed to convert real-world information into useful information, and they face various restrictions and performance goals depending on physical constraints and the target applications. Network technologies, on the other hand, are mainly designed for data exchange in the information world, as seen in packet communications, and do not mesh well with sensing structures in terms of real-time properties, spatial continuity, and so on. This indicates the need to understand the architectures and restrictions of both sensor technologies and network technologies when aiming to fuse them. This paper clarifies the differences between these processing structures, identifies issues to be addressed in order to achieve a real fusion of the two, and presents future directions toward such a fusion of sensor and network technologies.
Nga-Viet NGUYEN Georgy SHEVLYAKOV Vladimir SHIN
To solve the problem of distributed multisensor fusion, optimal linear methods can be used under Gaussian noise models. In practice, however, channel noise distributions are usually non-Gaussian, possibly heavy-tailed, causing linear methods to fail. By combining a classical tool of optimal linear fusion with a robust statistical method, the two-stage MAD robust fusion (MADRF) algorithm is proposed. It performs effectively in both symmetrically and asymmetrically contaminated Gaussian channel noise, with contamination parameters varying over a wide range.
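The abstract gives only the outline of MADRF, but the two ingredients it names can be sketched: a MAD-based robust scale estimate per channel, plugged into the classical inverse-variance linear-fusion weighting. The exact two-stage structure, weighting, and data below are illustrative, not the authors' algorithm.

```python
import statistics

def mad_scale(xs):
    """Robust scale: median absolute deviation, scaled by 1.4826 to be
    consistent with the standard deviation under Gaussian noise."""
    med = statistics.median(xs)
    return 1.4826 * statistics.median([abs(x - med) for x in xs])

def robust_fuse(channels):
    """Fuse per-channel estimates of a common quantity: each channel
    contributes its median (robust location) weighted by the inverse of
    its squared MAD (robust variance proxy) -- the optimal-linear-fusion
    weighting with robust plug-ins."""
    locs = [statistics.median(c) for c in channels]
    weights = [1.0 / mad_scale(c) ** 2 for c in channels]
    return sum(w * l for w, l in zip(weights, locs)) / sum(weights)

# Two sensors observing the same quantity (true value 10); the second
# channel is contaminated with heavy-tailed outliers.
clean = [9.9, 10.1, 10.0, 9.8, 10.2, 10.05, 9.95]
dirty = [9.5, 10.5, 10.0, 9.0, 11.0, 50.0, -30.0]  # two gross outliers
print(robust_fuse([clean, dirty]))  # close to 10
```

A plain sample-mean/sample-variance fusion would be dragged around by the two gross outliers in the second channel; the median/MAD plug-ins both ignore the outliers and automatically down-weight the noisier channel.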
Eigo SEGAWA Morito SHIOHARA Shigeru SASAKI Norio HASHIGUCHI Tomonobu TAKASHIMA Masatoshi TOHNO
We developed a system that detects the vehicle driving immediately ahead of one's own car in the same lane and measures the distance to and relative speed of that vehicle to prevent accidents such as rear-end collisions. The system is the first in the industry to use non-scanning millimeter-wave radar combined with a sturdy stereo image sensor, which keeps cost low. It can operate stably in adverse weather conditions such as rain, which could not easily be done with previous sensors. The system's vehicle detection performance was tested, and the system can correctly detect vehicles driving 3 to 50 m ahead in the same lane with higher than 99% accuracy in clear weather. Detection performance in rainy weather, where water drops and splashes notably degraded visibility, was higher than 90%.
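Although the abstract does not spell out the geometry, the stereo side of such a system typically recovers range from disparity via Z = f·B/d. A small sketch with hypothetical camera parameters (the paper's focal length and baseline are not given):

```python
def stereo_range(disparity_px, focal_px=700.0, baseline_m=0.3):
    """Range from stereo disparity: Z = f * B / d (pinhole model).
    focal_px and baseline_m are illustrative values, not the paper's."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# With these parameters, a 7-pixel disparity puts the lead vehicle at
# about 30 m, inside the 3-50 m detection band reported above.
print(stereo_range(7.0))
```

The inverse relation between disparity and range is why stereo accuracy degrades at long range and why combining it with millimeter-wave radar, whose range accuracy is distance-independent, is attractive.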
Koji NISHIMURA Toru SATO Takuji NAKAMURA Masayoshi UEDA
To assess the possible impact of meteors on spacecraft, which are among the major hazards of the space environment, it is essential to establish accurate statistics of their mass and velocity. We developed a combined radar-optical system for detecting faint meteors, consisting of a powerful VHF Doppler radar and an ICCD video camera. A Doppler pulse compression scheme is used to enhance the S/N ratio of radar echoes with very large Doppler shifts, as well as to determine their range with a resolution of 200 m. Very high sensitivities of more than 14th magnitude and 9th magnitude were obtained for the radar and optical sensors, respectively. The instantaneous direction of a meteor body observed by the radar is determined with an interferometry technique. We examined the optimum arrangement of the receiving antennas, as well as the optimum signal processing. The system's absolute accuracy was confirmed by optical observations using background stars as a reference. By combining the impinging velocity of each meteor body derived by the radar with the absolute visual magnitude determined simultaneously by the video camera, the mass of each meteor body was estimated. The developed observation system will be used to create a valuable database of the mass and velocity of faint meteors, about which very little is known so far. The database is expected to play a vital role in our understanding of the space environment, which is needed for designing large space structures.
Yoshiaki SHIRAI Tsuyoshi YAMANE Ryuzo OKADA
This paper describes methods for tracking moving objects against a cluttered background by integrating optical flow, depth data, and/or uniform brightness regions. First, a basic method is introduced that extracts a region with uniform optical flow as the target region. Then an extended method is described in which optical flow and depth are fused: a target region is extracted by Bayesian inference in terms of optical flow, depth, and the predicted target location. This method works only for textured objects, because optical flow and depth can be extracted only for textured objects. To solve this problem, uniform brightness regions are used for tracking in addition to the optical flow. Real-time human tracking is realized for real image sequences using a real-time processor with multiple DSPs.
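The Bayesian target-region inference can be sketched as a per-pixel posterior that multiplies independent likelihoods from optical flow and depth with a prior around the predicted location. The Gaussian likelihood forms and all the parameters below are assumptions for illustration, not the paper's model.

```python
import math

def gaussian(x, mean, sigma):
    """Gaussian density, used here as a per-cue likelihood."""
    return (math.exp(-0.5 * ((x - mean) / sigma) ** 2)
            / (sigma * math.sqrt(2 * math.pi)))

def target_posterior(flow, depth, dist_to_prediction,
                     target_flow=2.0, target_depth=5.0,
                     sigma_flow=0.5, sigma_depth=0.5, sigma_loc=20.0):
    """Unnormalized per-pixel posterior that a pixel belongs to the
    target, assuming conditionally independent Gaussian likelihoods for
    optical flow and depth and a Gaussian prior around the predicted
    target location (dist_to_prediction in pixels)."""
    return (gaussian(flow, target_flow, sigma_flow)
            * gaussian(depth, target_depth, sigma_depth)
            * gaussian(dist_to_prediction, 0.0, sigma_loc))

# A pixel matching the target's flow and depth near the prediction
# scores far higher than a background pixel at the same location.
on_target = target_posterior(flow=2.1, depth=5.0, dist_to_prediction=3.0)
background = target_posterior(flow=0.1, depth=9.0, dist_to_prediction=3.0)
print(on_target > background)  # True
```

Thresholding such a posterior over the image yields the target region; pixels with no texture get no flow or depth measurement, which is exactly the failure mode the uniform-brightness-region extension addresses.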
Terence Chek Hion HENG Yoshinori KUNO Yoshiaki SHIRAI
Presently, mobile robots are navigated by a number of methods, using navigation systems such as sonar sensing or visual sensing. These systems each have their strengths and weaknesses. For example, although the visual system enables a rich input of data from the surrounding environment, allowing an accurate perception of the area, processing the images invariably takes time. The sonar system, on the other hand, though quicker in response, is limited in the quality, accuracy, and range of its data. Therefore, any navigation method that relies on only one such system as its primary source leaves the robot unable to navigate efficiently in an unfamiliar surrounding that is even slightly more complicated than usual. This is not acceptable if robots are to work harmoniously with humans in a normal office/laboratory environment. Thus, to fully utilise the strengths of both the sonar and visual sensing systems, this paper proposes a fusion of navigation methods involving both systems as primary sources, producing a fast, efficient, and reliable obstacle-avoidance and navigation system. Furthermore, to enhance the robot's perception of its surroundings and improve its navigation capabilities, active sensing modules are also included. The result is an active sensor fusion system for the collision-avoidance behaviour of mobile robots. This behaviour can then be incorporated into other purposive behaviours (e.g., goal seeking, path finding). The validity of this system is shown in real robot experiments.
In the field of speech recognition, many researchers have proposed methods using auditory information, such as the acoustic signal, or visual information, such as the shape and motion of the lips. Auditory information provides valid features for speech recognition, but recognition is difficult in noisy environments. Visual information, on the other hand, is advantageous in noisy environments, but it is difficult to extract effective features from it. Thus, with either auditory or visual information alone, it is difficult to achieve fully accurate speech recognition. In this paper, we propose a method to fuse auditory and visual information in order to realize more accurate speech recognition. The proposed method consists of two processes: (1) probabilities for the auditory information and the visual information are calculated by HMMs; (2) these probabilities are fused by linear combination. We have performed speech recognition experiments on isolated words, whose auditory information (22.05 kHz sampling, 8-bit quantization) and visual information (30 frames/s sampling, 24-bit quantization) were captured with a multimedia personal computer, and have confirmed the validity of the proposed method.
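The second process, the linear combination of the two HMM scores per word, can be sketched as follows. In practice the scores would come from trained audio and lip-motion HMMs; the stand-in log-likelihoods, the vocabulary, and the weight alpha below are hypothetical.

```python
def fuse_scores(audio_logp, visual_logp, alpha=0.7):
    """Linearly combine per-word HMM log-likelihoods from the audio and
    visual streams; alpha weights the audio stream."""
    return {w: alpha * audio_logp[w] + (1 - alpha) * visual_logp[w]
            for w in audio_logp}

def recognize(audio_logp, visual_logp, alpha=0.7):
    """Pick the word with the highest fused score."""
    fused = fuse_scores(audio_logp, visual_logp, alpha)
    return max(fused, key=fused.get)

# Stand-in log-likelihoods for a 3-word vocabulary: noise makes the
# audio stream ambiguous between "hello" and "yellow", but the visual
# stream (lip shape) disambiguates.
audio = {"hello": -10.2, "yellow": -10.0, "goodbye": -25.0}
visual = {"hello": -8.0, "yellow": -14.0, "goodbye": -13.0}
print(recognize(audio, visual))  # "hello"
```

With these scores the audio-only decision would be "yellow"; the fused decision is "hello", which is the behaviour the experiments in this line of work aim to demonstrate. The weight alpha would normally be tuned on held-out data as a function of the acoustic noise level.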
Takeshi NAGASAKI Toshio KAWASHIMA Yoshinao AOKI
In this paper, we propose a method to construct structural models of articulated objects from multiple local observations of their motion, using state transition analysis of local geometric constraints. The object model is constructed by a bottom-up approach with three levels. Each level groups sensor data under a constraint among the local features observed by the sensor and constructs a local model; if the sensor data conflict with the current model, the model is reconstructed. The first level estimates a local geometric feature (e.g., an edge or a feature point) from the local sensor data. The second level estimates a rigid body from the local geometric features. The third level estimates an object from the rigid bodies; here, the constraint between rigid bodies is estimated from transition states, which are the relative motions between rigid bodies. This approach is implemented on a blackboard system.
Satoru IGAWA Akio OGIHARA Akira SHINTANI Shinobu TAKAMATSU
We propose a method to fuse auditory information and visual information for accurate speech recognition. This method fuses the two kinds of information by linear combination after calculating two probabilities by HMM for each word. In addition, we use the full-frame color image as visual information in order to improve the accuracy of the proposed speech recognition system. We have performed experiments comparing the proposed method with methods using either auditory or visual information alone, and confirmed the validity of the proposed method.
Akira SHINTANI Akio OGIHARA Yoshikazu YAMAGUCHI Yasuhisa HAYASHI Kunio FUKUNAGA
We propose two methods to fuse auditory information and visual information for accurate speech recognition. The first method fuses the two kinds of information by linear combination after calculating two probabilities by HMM for each word. The second method fuses them by using a histogram that expresses their correlation. We have performed experiments comparing the proposed methods with the conventional method and confirmed the validity of the proposed methods.
Akira OKAMOTO Yoshiaki SHIRAI Minoru ASADA
This paper describes a method for describing a three-dimensional (3-D) scene by integrating color and range data. Range data are obtained by a feature-based stereo method developed in our laboratory. A color image is segmented into uniform color regions, and a plane is fitted to the range data inside each segmented region. Regions are classified into three types based on the range data. Certain types of regions are merged, while the others remain unchanged unless their region type is modified; the region type is modified when removing some of the range data leaves the remaining data on a plane. As a result, the scene is represented by planar surfaces with homogeneous colors. Experimental results for real scenes are shown.
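The per-region plane fit mentioned above is typically a least-squares fit of z = a·x + b·y + c to the range points inside the region. A minimal sketch, assuming the region's range data are given as (x, y, z) tuples; the normal-equations formulation is a standard technique, not necessarily the authors' exact procedure:

```python
def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points by
    solving the 3x3 normal equations with Cramer's rule."""
    n = len(points)
    sx = sum(p[0] for p in points); sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    # Normal equations: [[sxx,sxy,sx],[sxy,syy,sy],[sx,sy,n]] @ [a,b,c]
    #                   = [sxz, syz, sz]
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    r = [sxz, syz, sz]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)

    def col_replaced(i):
        return [[r[j] if k == i else A[j][k] for k in range(3)]
                for j in range(3)]

    return tuple(det3(col_replaced(i)) / d for i in range(3))

# Points lying exactly on z = 2x - y + 3:
pts = [(0, 0, 3), (1, 0, 5), (0, 1, 2), (1, 1, 4), (2, 1, 6)]
a, b, c = fit_plane(pts)
print(a, b, c)  # recovers 2, -1, 3
```

The residuals of such a fit are also what the region classification can use: a region whose range data sit close to its fitted plane is planar, while large residuals signal a curved surface or an over-merged segment.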