Shingo MANDAI Toru NAKURA Makoto IKEDA Kunihiro ASADA
This paper presents a multifunctional range finder employing dual imager cores on a single chip. Each imager core supports 2-D imaging and 3-D capture using the light-section method, and the two cores can be used in combination. The presented chip provides four modes: a 2-D imaging mode, a 3-D capture mode with the conventional light-section method, a high-speed 3-D capture mode with stereo matching, and a simultaneous 2-D and 3-D capture mode. We demonstrate 58 fps 2-D imaging with 8-bit gray scale, and a 24.8 range maps/s 3-D range finder with a maximum range error of 1.619 mm and a standard deviation of 0.385 mm at 700 mm.
Fanny RAHADIAN Tatsuya MASADA Ichiro FUJIEDA
We propose to integrate a single lens on top of multiple OLEDs. Angular distribution of the light emitted from the lens surface is altered by turning on the OLEDs selectively. We can use such a light source as a backlight for a liquid crystal display to switch its viewing angle range and/or to display multiple images in different directions. Pixel-level integration would allow one to construct an OLED display with a similar emission angle control.
Yasuhiro KOBAYASHI Masanori HARIYAMA Michitaka KAMEYAMA
Hierarchical approaches using multi-resolution images are well-known techniques for reducing the amount of computation without degrading quality. One major issue in designing image processors is to design a memory system that supports parallel access with a simple interconnection network. The complexity of the interconnection network mainly depends on memory allocation, which maps pixels onto memory modules and determines the required number of memory modules. This paper presents a memory allocation method that minimizes the number of memory modules for image processing using multi-resolution images. For efficient search, the proposed method exploits the regularity of window-type image processing. A practical example demonstrates that the number of memory modules is reduced to less than 14% of that required by conventional methods.
Akihiro HAYASAKA Takuma SHIBAHARA Koichi ITO Takafumi AOKI Hiroshi NAKAJIMA Koji KOBAYASHI
This paper proposes a three-dimensional (3D) face recognition system using passive stereo vision. So far, the reported 3D face recognition techniques have used active 3D measurement methods to capture high-quality 3D facial information. However, active methods employ structured illumination (structure projection, phase shift, moire topography, etc.) or laser scanning, which is not desirable in many human recognition applications. Addressing this problem, we propose a face recognition system that uses (i) passive stereo vision to capture 3D facial information and (ii) 3D matching using an ICP (Iterative Closest Point) algorithm with its improvement techniques. Experimental evaluation demonstrates efficient recognition performance of the proposed system compared with an active 3D face recognition system and a passive 3D face recognition system employing the original ICP algorithm.
Takashi WATANABE Akira KUSANO Takayuki FUJIWARA Hiroyasu KOSHIMIZU
Guaranteeing the quality of industrial products by means of visual inspection is very important. To reduce soldering defects caused by terminal deformation and terminal burrs in the manufacturing process, this paper proposes a 3D visual inspection system based on stereo vision with a single camera. The baseline of this single-camera stereo system is precisely calibrated by an image processing procedure. In addition, the error in extracting the measuring-point coordinates used to compute disparity is reduced with an original algorithm. Compared with human inspection using an industrial microscope, the proposed 3D inspection could be an alternative in both precision and processing cost. Since the practical specification for 3D precision is less than 1 pixel and the experimental performance was around the same, the proposed system demonstrated that soldering defects caused by terminal deformation and terminal burrs were decreased in inspection, especially in 3D inspection. To realize inline inspection, this paper also suggests how human inspection of products could be modeled and implemented by a computer system, especially in the manufacturing process.
We present an interactive system for cosmetic makeup of a point-based face model acquired by 3D scanners. We first enhance the texture of a face model in 3D space using low-pass Gaussian filtering, median filtering, and histogram equalization. The user is provided with a stereoscopic display and haptic feedback, and can perform simulated makeup tasks including the application of foundation, color makeup, and lip gloss. Fast rendering is achieved by processing surfels using the GPU, and we use a BSP tree data structure and a dynamic local refinement of the facial surface to provide interactive haptics. We have implemented a prototype system and evaluated its performance.
Masanori HARIYAMA Naoto YOKOYAMA Michitaka KAMEYAMA
This paper presents a processor architecture for high-speed and reliable trinocular stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce its computational complexity, SADs are computed using images divided into non-overlapping regions, and the matching result is iteratively refined by reducing the window size. A window-parallel-and-pixel-parallel architecture is also proposed to fully exploit the potential parallelism of the algorithm. The architecture also reduces the complexity of the interconnection network between memory and functional units based on the regularity of reference pixels. The stereo matching processor is designed in a 0.18 µm CMOS technology. The processing time is 83.2 µs@100 MHz. By using optimal scheduling, the increases in area and processing time are only 5% and 3%, respectively, compared to binocular stereo vision, although the amount of computation is doubled.
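The coarse-to-fine SAD idea described above can be illustrated in software. The following is a minimal sketch for the binocular case only, assuming rectified images stored as lists of rows whose dimensions are multiples of the starting window size; the function names, the halving schedule, and the search of plus or minus one pixel around the parent disparity are illustrative assumptions, not the authors' hardware design.

```python
def sad(left, right, y0, x0, size, d):
    """Sum of absolute differences between a size x size block of `left`
    at (y0, x0) and the block of `right` shifted left by d pixels."""
    total = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            total += abs(left[y][x] - right[y][x - d])
    return total


def block_disparity(left, right, y0, x0, size, candidates):
    """Return the candidate disparity with the smallest SAD for one block."""
    # Keep only disparities for which x - d stays inside the image.
    valid = [d for d in candidates if 0 <= d <= x0]
    return min(valid, key=lambda d: sad(left, right, y0, x0, size, d))


def coarse_to_fine(left, right, max_d, start_size, min_size):
    """Estimate per-block disparities, halving the window at each pass.

    Assumption: image height and width are multiples of start_size.
    """
    h, w = len(left), len(left[0])
    size = start_size
    # Coarse pass: full search over 0..max_d for each non-overlapping block.
    disp = {(y, x): block_disparity(left, right, y, x, size, range(max_d + 1))
            for y in range(0, h, size) for x in range(0, w, size)}
    while size > min_size:
        half = size // 2
        refined = {}
        for (y, x), d in disp.items():
            # Each block splits into four children searched near the parent's d.
            for cy in (y, y + half):
                for cx in (x, x + half):
                    refined[(cy, cx)] = block_disparity(
                        left, right, cy, cx, half, (d - 1, d, d + 1))
        disp, size = refined, half
    return disp, size
```

Because each refinement pass only tests three candidates per block, most of the search cost is paid once at the coarsest window size. Blocks at the left image border can only take small disparities in this sketch, which is a border effect of the simplified search.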
This paper proposes several cepstral statistics compensation and normalization algorithms which alleviate the effect of additive noise on cepstral features for speech recognition. The algorithms are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transformations for both clean speech cepstra and noise-corrupted speech cepstra, or for noise-corrupted speech cepstra only, so that the statistics of the transformed speech cepstra are similar for both environments. Experimental results show that these codebook-based algorithms can provide significant performance gains compared to results obtained by using conventional utterance-based normalization approaches. The proposed codebook-based cepstral mean and variance normalization (C-CMVN), linear least squares (LLS) and quadratic least squares (QLS) outperform utterance-based CMVN (U-CMVN) by 26.03%, 22.72% and 27.48%, respectively, in relative word error rate reduction for experiments conducted on Test Set A of the Aurora-2 digit database.
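For orientation, the utterance-based baseline (U-CMVN) that the abstract compares against can be sketched as follows. This is only the conventional per-utterance normalization, assuming an utterance is given as a list of cepstral frames; it does not implement the codebook-based variants, and the function name and the small `eps` guard are illustrative assumptions.

```python
import math


def cmvn(frames, eps=1e-10):
    """Normalize each cepstral dimension of one utterance to zero mean
    and unit variance (utterance-based CMVN)."""
    n = len(frames)
    dims = len(frames[0])
    # Per-dimension mean and standard deviation over the whole utterance.
    means = [sum(f[k] for f in frames) / n for k in range(dims)]
    stds = [math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n)
            for k in range(dims)]
    # eps avoids division by zero for a constant dimension.
    return [[(f[k] - means[k]) / (stds[k] + eps) for k in range(dims)]
            for f in frames]
```

In the codebook-based approach described in the abstract, the statistics would instead be evaluated from the pseudo-stereo codebooks rather than from the frames of a single utterance.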
Noriaki MURAKOSHI Akinori NISHIHARA
This paper presents a novel stereophonic acoustic echo canceling scheme without preprocessing. To estimate the echo paths accurately while maintaining a high level of echo cancellation performance, the scheme uses two filters: one serves as a guideline that does not cancel echo itself but helps update the other filter, which actually cancels the echo. In addition, we propose a new filter dividing technique for the filter divide scheme and use it as the guideline. Numerical examples demonstrate that the proposed scheme improves the convergence behavior compared to conventional methods in both system mismatch (i.e., normalized coefficient error) and Echo Return Loss Enhancement (ERLE).
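The two evaluation measures named above have standard definitions, sketched here for reference. The function names and the use of plain sums over a block of samples are assumptions for illustration; the paper may average or window the signals differently.

```python
import math


def erle_db(echo, residual):
    """Echo Return Loss Enhancement in dB: echo power over residual power."""
    echo_power = sum(d * d for d in echo)
    residual_power = sum(e * e for e in residual)
    return 10 * math.log10(echo_power / residual_power)


def system_mismatch_db(true_path, estimate):
    """Normalized coefficient error in dB between the true echo path
    and the adaptive filter's estimate of it."""
    error = sum((h - w) ** 2 for h, w in zip(true_path, estimate))
    reference = sum(h * h for h in true_path)
    return 10 * math.log10(error / reference)
```

A higher ERLE means more echo is removed, while a lower (more negative) system mismatch means the filter coefficients are closer to the true echo path. The two can diverge in stereo echo cancellation, which is why both are reported.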
Hisanori NOTO Hirotsugu YAMAMOTO Yoshio HAYASAKI Syuji MUGURUMA Yoshifumi NAGAI Yoshinori SHIMIZU Nobuo NISHIDA
We have developed a stereoscopic large LED display with a parallax barrier for use by the general public, together with stereoscopic cameras to show real-world images in 3D. This paper analyzes the stereoscopic camera separation and convergence angle in order to make the most of a field of interest and the reproducible space provided by the large stereoscopic LED display. We describe the principle of a stereoscopic LED display with a parallax barrier and its reproducible space, which is determined by the allowable range of disparity for fusing stereoscopic images. By using a model of the stereoscopic imaging and display process, we introduce formulas for the reproduced positions on our stereoscopic LED display. Furthermore, we analyze the relationships between the stereoscopic camera separation, the convergence angle, the area of the field of interest, and the depth range of the reproduced space. The results show that camera configurations fall into four categories: three kinds of configurations that have different characteristics and one configuration that is not recommended. The Category A configuration reproduces a wide area of the field of interest over a long range of depth. Category B functions as a reduction of the field of interest. Category C functions as a magnification of the field of interest. In Category D, a narrow area of the field is reproduced over a short range of depth. In particular, for use with a stereoscopic LED display of rather low resolution, Category A and Category C are recommended because they fully use the reproducible positions.
Yuu TANAKA Atsushi YAMASHITA Toru KANEKO Kenjiro T. MIURA
In this paper, we propose a new method that can remove view-disturbing noises from stereo images. One of the thorny problems in outdoor surveillance by a camera is that adherent noises such as waterdrops on the surface of the lens-protecting glass disturb the view from the camera. Therefore, we propose a method for removing adherent noises from stereo images taken with a stereo camera system. Our method is based on stereo measurement and utilizes disparities between the images of a stereo pair. Positions of noises in the images can be detected by comparing disparities measured from the stereo images with the distance between the stereo camera system and the glass surface. True disparities of image regions hidden by noises can be estimated from the property that disparities are generally similar to those around the noises. Finally, we can remove noises from the images by replacing the above regions with textures of the corresponding image regions obtained by referring to the disparity. Experimental results show the effectiveness of the proposed method.
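The detection step can be illustrated with the standard pinhole relation for a rectified parallel stereo pair, d = f·B/Z, which gives the disparity expected for anything lying on the glass. The sketch below assumes that geometry, a focal length in pixels, and a fixed pixel tolerance; the function names and the tolerance are hypothetical and not taken from the paper.

```python
def expected_disparity(focal_px, baseline_m, distance_m):
    """Disparity in pixels of a point at distance_m for a rectified
    parallel stereo pair (d = f * B / Z)."""
    return focal_px * baseline_m / distance_m


def noise_mask(disparity_map, glass_disparity, tolerance_px=1.0):
    """Mark pixels whose measured disparity matches the glass surface,
    i.e. candidates for adherent noise such as waterdrops."""
    return [[abs(d - glass_disparity) <= tolerance_px for d in row]
            for row in disparity_map]
```

For example, with an assumed focal length of 800 pixels, a 0.1 m baseline, and glass 0.5 m from the cameras, points on the glass would show a disparity of about 160 pixels, far larger than that of the distant scene behind it.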
Eigo SEGAWA Morito SHIOHARA Shigeru SASAKI Norio HASHIGUCHI Tomonobu TAKASHIMA Masatoshi TOHNO
We developed a system that detects the vehicle driving immediately ahead of one's own car in the same lane and measures the distance to and relative speed of that vehicle to prevent accidents such as rear-end collisions. The system is the first in the industry to use non-scanning millimeter-wave radar combined with a sturdy stereo image sensor, which keeps cost low. It can operate stably in adverse weather conditions such as rain, which could not easily be done with previous sensors. The system's vehicle detection performance was tested, and the system can correctly detect vehicles driving 3 to 50 m ahead in the same lane with higher than 99% accuracy in clear weather. Detection performance in rainy weather, where water drops and splashes notably degraded visibility, was higher than 90%.
Guang TIAN Feihu QI Masatoshi KIMACHI Yue WU Takashi IKETANI
This paper presents a 3D feature-based binocular tracking algorithm for tracking crowded people indoors. The algorithm consists of a two-stage 3D feature point grouping method and a robust 3D feature-based tracking method. The two-stage grouping method uses a kernel-based ISODATA method to detect people accurately even when partial or almost full occlusion occurs among people in the surveillance area. The robust 3D feature-based tracking method combines the interacting multiple model (IMM) method with a cascade multiple-feature data association method. It not only manages the generation and disappearance of trajectories, but can also deal with interactions between people and track maneuvering people. Experimental results demonstrate the robustness and efficiency of the proposed framework. It runs in real time and is not sensitive to variation in the frame-to-frame interval. It can also deal with occlusion of people and performs well when people rotate and wriggle.
Hotaka TAKIZAWA Shinji YAMAMOTO
In the present paper, we propose a method for reconstructing the surfaces of objects from stereo data. Both the fitness of stereo data to surfaces and interrelation between the surfaces are defined in the framework of a three-dimensional (3-D) Markov Random Field (MRF) model. The surface reconstruction is accomplished by searching for the most likely state of the MRF model. Three experimental results are shown for synthetic and real stereo data.
Nicolas HAUTIERE Raphael LABAYRADE Didier AUBERT
An atmospheric visibility measurement system capable of quantifying the most common operating range of onboard exteroceptive sensors is a key component in the creation of driving assistance systems. This information is then utilized to adapt sensor operations and processing or to alert the driver that the onboard assistance system is momentarily inoperative. Moreover, a system capable of either detecting the presence of fog or estimating visibility distances constitutes in itself a driving aid. In this paper, we first present a review of different optical sensors likely to measure the visibility distance. We then present our stereovision-based technique to estimate what we call the "mobilized visibility distance". This is the distance to the most distant object on the road surface having a contrast above 5%. In fact, this definition is very close to the definition of the meteorological visibility distance proposed by the International Commission on Illumination (CIE). The method combines the computation of a depth map of the vehicle environment using the "v-disparity" approach with the computation of local contrasts above 5%. Both methods are described separately. Then, their combination is detailed. A qualitative evaluation is done using different video sequences. Finally, a static quantitative evaluation is also performed using reference targets installed on a dedicated test site.
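The two ingredients named above can be sketched in a few lines. The first function builds a v-disparity image in its usual sense, a per-row histogram of disparity values; the second applies the 5% contrast threshold to pick the farthest sufficiently contrasted point. Integer disparities, precomputed per-point depths and contrasts, and all function names are assumptions for illustration; how contrast is measured and how road-surface points are selected is left to the paper.

```python
def v_disparity(disparity_map, max_d):
    """For each image row v, count how many pixels take each integer
    disparity 0..max_d. A planar road shows up as a line in this image."""
    hist = [[0] * (max_d + 1) for _ in disparity_map]
    for v, row in enumerate(disparity_map):
        for d in row:
            if 0 <= d <= max_d:
                hist[v][d] += 1
    return hist


def visibility_distance(depths, contrasts, threshold=0.05):
    """Distance to the farthest point whose local contrast exceeds the
    threshold, or None if no point qualifies."""
    visible = [z for z, c in zip(depths, contrasts) if c > threshold]
    return max(visible) if visible else None
```

For instance, given points at 10 m, 50 m, and 120 m with contrasts of 30%, 8%, and 2%, the second function returns 50 m because the farthest point falls below the 5% threshold.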
Ghader KARIMIAN Abolghasem A. RAIE Karim FAEZ
In this paper, a new stereo line segment matching algorithm is presented. The main purpose of this algorithm is to increase efficiency, i.e., to increase the number of correctly matched lines while avoiding an increase in mismatches. To this end, the reasons why some existing algorithms eliminate correct matches and retain erroneous ones have been investigated. An attempt was also made to use the photometric, geometric and structural information efficiently through the introduction of new constraints, criteria, and procedures. Hence, in the candidate determination stage of the designed algorithm, two new constraints were employed in addition to the reliable epipolar, maximum and minimum disparity, and orientation similarity constraints. The process of disambiguation and final match selection, which is the main problem in matching, is a completely new development with regard to the employed constraints, the criterion function, and its optimization. The algorithm was applied to images of several indoor scenes, and its high efficiency was illustrated by correct matching of 96% of the line segments with no mismatches.
Akihiko SUGIYAMA Yann JONCOUR Akihiro HIRANO Takao NISHITANI Gerard FAUCON
A new stereo echo canceler with input slides and counter-lateralization is proposed. Convergence of filter coefficients to the correct echo paths is obtained by pre-processing which delays the input signal periodically by one sample in one of the two channels. The time difference between the two stereo components of the input signals causes a shift of the sound image. This shift is compensated for by presenting the delayed component of the stereo signals to a loudspeaker at a higher intensity, and the other component at a lower intensity. Correct echo-path identification is analytically shown in a more general form than in the preceding literature. A subjective listening test shows that this method is more effective for vocal music. The processed signals are scored 0.45 lower than the original input signals, using the ITU-R five-grade impairment scale.
Mohammad Abdul MUQUIT Takuma SHIBAHARA Takafumi AOKI
This paper presents a high-accuracy 3D (three-dimensional) measurement system using multi-camera passive stereo vision to reconstruct 3D surfaces of free-form objects. The proposed system is based on an efficient stereo correspondence technique, which consists of (i) coarse-to-fine correspondence search, and (ii) outlier detection and correction, both employing phase-based image matching. The proposed sub-pixel correspondence search technique contributes to dense reconstruction of arbitrary-shaped 3D surfaces with high accuracy. The outlier detection and correction technique contributes to high reliability of reconstructed 3D points. Through a set of experiments, we show that the proposed system measures 3D surfaces of objects with sub-mm accuracy. Also, we demonstrate high-quality dense 3D reconstruction of a human face as a typical example of free-form objects. The result suggests the potential of our approach for use in many computer vision applications.
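Phase-based matching of the kind mentioned above is often built on phase-only correlation (POC). As a rough illustration only, the sketch below estimates an integer circular shift between two 1-D signals from the peak of their POC function; it uses a naive DFT from the standard library, does not perform the sub-pixel peak fitting or the coarse-to-fine search of the proposed system, and all names are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def phase_only_correlation(f, g, eps=1e-12):
    """POC function of f and g: inverse DFT of the cross spectrum with
    its magnitude normalized away, so only phase differences remain."""
    F, G = dft(f), dft(g)
    cross = [b * a.conjugate() for a, b in zip(F, G)]
    normalized = [c / (abs(c) + eps) for c in cross]
    return [v.real for v in dft(normalized, inverse=True)]


def estimate_shift(f, g):
    """Integer s such that g[n] is approximately f[(n - s) mod N]."""
    poc = phase_only_correlation(f, g)
    return max(range(len(poc)), key=lambda i: poc[i])
```

Discarding the magnitude makes the correlation peak sharp and largely insensitive to brightness changes, which is the property that sub-pixel peak-fitting methods build on.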
This paper presents an approach that uses the Viterbi algorithm in a stereo correspondence problem. We propose a matching process which is visualized as a trellis diagram to find the maximum a posteriori result. The matching process is divided into two parts: matching the left scene to the right scene and matching the right scene to the left scene. The final result of the stereo problem is selected based on the minimum error for uniqueness by comparing the results of the two parts of the matching process. This makes stereo matching possible without explicitly detecting occlusions. Moreover, this stereo matching algorithm can improve the accuracy of the disparity image, and it has an acceptable running time for practical applications since it uses a trellis diagram iteratively and bi-directionally. The complexity of our proposed method is approximately O(N^2 P), where N is the number of disparities and P is the length of the epipolar line in both the left and right images. Our proposed method has proved to be robust when applied to well-known stereo image samples such as the random dot, Pentagon, and Tsukuba images. It provides an accuracy of 95.7 percent within radius 1 (differing by 1) for the Tsukuba images.
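A trellis search of this kind can be sketched for a single scanline as follows. Each pixel is a trellis stage and each candidate disparity is a state; the absolute-difference data cost, the linear smoothness penalty, and the function name are assumptions chosen for illustration, and the bidirectional left-to-right and right-to-left comparison described above is not implemented.

```python
def viterbi_scanline(left, right, max_d, smooth=1.0):
    """Disparity per pixel of one scanline that minimizes the summed
    data cost plus a penalty on disparity changes between neighbors."""
    n_states = max_d + 1
    big = float("inf")

    def data_cost(x, d):
        # Absolute intensity difference; infinite if the match falls
        # off the left edge of the right image.
        return abs(left[x] - right[x - d]) if x - d >= 0 else big

    cost = [data_cost(0, d) for d in range(n_states)]
    back = []
    for x in range(1, len(left)):
        new_cost, pointers = [], []
        for d in range(n_states):
            # Best predecessor: every state is compared with every
            # previous state, so each pixel costs a number of steps
            # quadratic in the number of disparities.
            prev = min(range(n_states),
                       key=lambda p: cost[p] + smooth * abs(d - p))
            new_cost.append(cost[prev] + smooth * abs(d - prev)
                            + data_cost(x, d))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final state.
    d = min(range(n_states), key=lambda s: cost[s])
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    path.reverse()
    return path
```

Running the same search in the opposite direction and keeping only disparities on which both passes agree is one common way to obtain the uniqueness check the abstract refers to.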
Masanori HARIYAMA Yasuhiro KOBAYASHI Haruka SASAKI Michitaka KAMEYAMA
This paper presents a processor architecture for high-speed and reliable stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce its computational complexity, SADs are computed using images divided into non-overlapping regions, and the matching result is iteratively refined by reducing the window size. A window-parallel-and-pixel-parallel architecture is also proposed to fully exploit the potential parallelism of the algorithm. The architecture also reduces the complexity of the interconnection network between memory and functional units based on the regularity of reference pixels. The stereo matching processor is implemented on an FPGA. Its performance is 80 times higher than that of a microprocessor (Pentium4@2 GHz), and is sufficient to generate a 3-D depth image at the video rate of 33 MHz.