Jung-Hwan KIM Kee-Bum KIM Sajjad Hussain CHAUHDARY Wencheng YANG Myong-Soon PARK
The proliferation of research on target detection and tracking in wireless sensor networks has kindled development of monitoring continuous objects such as fires and hazardous bio-chemical material diffusion. In this paper, we propose an energy-efficient algorithm that monitors a moving event region by selecting only a subset of nodes near object boundaries. The paper also shows that we can effectively reduce report message size. It is verified with performance analysis and simulation results that total average report message size as well as the number of nodes which transmit the report messages to the sink can be greatly reduced, especially when the density of nodes over the network field is high.
Yong Hun PARK Kyoung Soo BOK Jae Soo YOO
In this paper, we propose a continuous range query processing method over moving objects. To efficiently process continuous range queries, we design a main-memory-based query index that uses smaller storage and significantly reduces the query processing time. We show through performance evaluation that the proposed method outperforms the existing methods.
Yoshifumi CHISAKI Toshimichi TAKADA Masahiro NAGANISHI Tsuyoshi USAGAWA
The frequency domain binaural model (FDBM) has been previously proposed to localize multiple sound sources. Since the method requires only two input signals and uses interaural phase and level differences caused by the diffraction generated by the head, flexibility in application is very high when the head is considered as an object. When an object is symmetric with respect to the two microphones, the performance of sound source localization is degraded, just as a human listener experiences front-back confusion due to symmetry about the median plane. This paper proposes to reduce this degradation by combining the outputs of microphone pairs using the FDBM. The proposed method is evaluated by applying it to a security camera system, and the results show improved sound source localization performance because the number of cones of confusion is reduced.
In this paper, a new lifting-based shape-direction-adaptive discrete wavelet transform (SDA-DWT) which can be used for arbitrarily shaped segments is proposed. The SDA-DWT contains three major techniques: the lifting-based DWT, the adaptive directional technique, and the concept of object-based compression in MPEG-4. With SDA-DWT, the number of transformed coefficients is equal to the number of pixels in the arbitrarily shaped segment image, and the spatial correlation across subbands is well preserved. SDA-DWT also can locally adapt its filtering directions according to the texture orientations to improve energy compaction for images containing non-horizontal or non-vertical edge textures. SDA-DWT can be applied to any application that is wavelet based and the lifting technique provides much flexibility for hardware implementation. Experimental results show that, for still object images with rich orientation textures, SDA-DWT outperforms SA-DWT up to 5.88 dB in PSNR under 2.15-bpp (bit / object pixel) condition, and reduces the bit-budget up to 28.5% for lossless compression. SDA-DWT also outperforms DA-DWT up to 5.44 dB in PSNR under 3.28-bpp condition, and reduces the bit-budget up to 14.0%.
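The lifting structure mentioned above can be illustrated with a minimal sketch. This is not the SDA-DWT itself: it is a plain one-dimensional integer Haar transform written as a predict step and an update step, with no shape or direction adaptation, and the function names `haar_lift_forward` and `haar_lift_inverse` are hypothetical. It only shows two properties the abstract relies on: the number of coefficients equals the number of input samples, and the transform is exactly invertible, which is what makes lossless coding possible.

```python
def haar_lift_forward(x):
    """One level of an integer Haar transform in lifting form.

    x must have even length. Returns (approx, detail), each half as long as x.
    """
    even = x[0::2]
    odd = x[1::2]
    # Predict step: each odd sample is predicted from its even neighbour.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: adjust even samples so approx tracks the local average.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo haar_lift_forward by running the lifting steps in reverse order."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x


samples = [5, 7, 3, 1, 8, 8]
approx, detail = haar_lift_forward(samples)
print(approx, detail)                      # [6, 2, 8] [2, -2, 0]
print(haar_lift_inverse(approx, detail))   # [5, 7, 3, 1, 8, 8]
```

Because every lifting step is undone by subtracting exactly what was added, rounding inside a step does not break invertibility. A direction-adaptive variant would, as an assumption here, change which neighbours feed the predict and update steps rather than the overall structure.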
Ryoji HASHIMOTO Tomoya MATSUMURA Yoshihiro NOZATO Kenji WATANABE Takao ONOYE
A multi-agent object attention system is proposed, which is based on a biologically inspired attractor selection model. Object attention is facilitated by using a video sequence and a depth map obtained through a compound-eye image sensor, TOMBO. Robustness of the multi-agent system to environmental changes is enhanced by utilizing the biological model of adaptive response by attractor selection. To implement the proposed system, an efficient VLSI architecture is employed that reduces the enormous computational costs and memory accesses required for depth map processing and the multi-agent attractor selection process. According to the FPGA implementation result of the proposed object attention system, which uses 7,063 slices, 640×512 pixel input images can be processed in real time with three agents at a rate of 9 fps in 48 MHz operation.
The steady approach of advanced nations toward realization of ubiquitous computing societies has given birth to rapidly growing demands for new-generation distributed computing (DC) applications. Consequently, economic and reliable construction of new-generation DC applications is currently a major issue faced by the software technology research community. What is needed is a new-generation DC software engineering technology which is at least multiple times more effective in constructing new-generation DC applications than the currently practiced technologies are. In particular, this author believes that a new-generation building-block (BB), which is much more advanced than the current-generation DC object that is a small extension of the object model embedded in languages C++, Java, and C#, is needed. Such a BB should enable systematic and economic construction of DC applications that are capable of taking critical actions with 100-microsecond-level or even 10-microsecond-level timing accuracy, fault tolerance, and security enforcement while being easily expandable and taking advantage of all sorts of network connectivity. Some directions considered worth pursuing for finding such BBs are discussed.
Three features for image classification into natural objects and artifacts are investigated. They are 'line length ratio', 'line direction distribution,' and 'edge coverage'. Among the three, the feature 'line length ratio' shows superior classification accuracy (above 90%) that exceeds the performance of conventional features, according to experimental results in application to digital camera images. As the development of this feature was motivated by the fact that the edge sharpening magnitude in image-quality improvement must be controlled based on the image content, this classification algorithm should be especially suitable for the image-quality improvement applications.
Akinori HIDAKA Kenji NISHIDA Takio KURITA
In this paper, we propose a novel classifier-based object tracker. Our tracker is a combination of a Rectangle Feature (RF) based detector [17],[18] and an optical-flow based tracking method [1]. We show that the gradient of extended RFs can be calculated rapidly by using the Integral Image method. The proposed tracker was tested on real video sequences in face tracking and car tracking experiments. Our tracker ran at over 100 fps while maintaining accuracy comparable to that of the RF based detector. Our tracking routine, excluding image I/O processing, runs at about 500 to 2,500 fps with sufficient tracking accuracy.
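The Integral Image method referred to above can be sketched as follows. This is a generic illustration of the standard summed-area table, not the authors' extended RF gradient computation; the helper names `integral_image`, `rect_sum`, and `two_rect_feature` are hypothetical. It shows why rectangle features are cheap: after one pass over the image, the sum over any rectangle takes four table lookups regardless of its size.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in the rectangle, using four lookups in the table."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half: a simple two-rectangle (edge-like) feature."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(image)
print(rect_sum(ii, 1, 1, 2, 2))          # 34
print(two_rect_feature(ii, 0, 0, 2, 4))  # -8
```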
In this paper, we introduce a method for recognizing a subject complex object in a real-world environment. We use a three-dimensional model of the object described by line segments and the data provided by a three-axis orientation sensor attached to the video camera. We assume that existing methods for finding line features in the image allow at least one model line segment to be detected as a single continuous segment. The method consists of two main stages: generation of pose hypotheses, and evaluation of each pose in order to select the most appropriate one. The first stage has three parts: model visibility, line matching, and pose estimation. The second stage ranks the poses by evaluating the similarity between the projected model lines and the image lines. Furthermore, we propose an additional step that refines the best candidate pose by using the Lie group formalism of spatial rigid motions. Such a formalism provides an efficient local parameterization of the set of rigid rotations via the exponential map. A set of experiments demonstrating the robustness of this approach is presented.
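The exponential map for rotations mentioned above can be sketched with Rodrigues' formula, which maps a three-component axis-angle vector to a 3×3 rotation matrix. This is a generic illustration of the parameterization only, not the authors' refinement procedure; the function name `so3_exp` and the small-angle threshold are assumptions made for the example.

```python
import math


def so3_exp(w):
    """Map an axis-angle vector w = (wx, wy, wz) to a 3x3 rotation matrix.

    Uses Rodrigues' formula: R = I + a*K + b*K^2, where K is the
    skew-symmetric matrix of w, a = sin(t)/t, b = (1 - cos(t))/t^2, t = |w|.
    """
    wx, wy, wz = w
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    K = [[0.0, -wz, wy],
         [wz, 0.0, -wx],
         [-wy, wx, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    if theta < 1e-8:
        # Limits of the coefficients as theta -> 0 (assumed threshold).
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


# A quarter turn about the z axis sends the x axis to the y axis.
R = so3_exp((0.0, 0.0, math.pi / 2))
for row in R:
    print([round(v, 6) for v in row])
```

The appeal of this parameterization for local refinement is that a small update vector always yields a valid rotation, so an optimizer can search over three unconstrained numbers near the current pose instead of nine constrained matrix entries.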
Daisuke ABE Eigo SEGAWA Osafumi NAKAYAMA Morito SHIOHARA Shigeru SASAKI Nobuyuki SUGANO Hajime KANNO
In this paper, we present a robust small-object detection method, which we call "Frequency Pattern Emphasis Subtraction (FPES)", for wide-area surveillance such as that of harbors, rivers, and plant premises. To achieve robust detection under changes in environmental conditions, such as illuminance level, weather, and camera vibration, our method distinguishes target objects from background and noise based on the differences in frequency components between them. The evaluation results demonstrate that our method detected more than 95% of target objects in the images of large surveillance areas ranging from 30 to 75 meters at their center.
Kyoung Soo BOK Ho Won YOON Dong Min SEO Myoung Ho KIM Jae Soo YOO
In this paper, a new access method is proposed for the current positions of moving objects on road networks in order to efficiently update their positions. In the existing index structures, the connectivity of edges is lost because intersection points where three or more edges meet are split. The proposed index structure uses an intersection-oriented network model that does not split intersection nodes where three or more edges meet, thereby preserving the connectivity of adjacent road segments. The data node stores not only the positions of moving objects but also the connectivity of the network.
Viet-Quoc PHAM Takashi MIYAKI Toshihiko YAMASAKI Kiyoharu AIZAWA
We present a robust object-based watermarking algorithm using the scale-invariant feature transform (SIFT) in conjunction with a data embedding method based on Discrete Cosine Transform (DCT). The message is embedded in the DCT domain of randomly generated blocks in the selected object region. To recognize the object region after being distorted, its SIFT features are registered in advance. In the detection scheme, we extract SIFT features from the distorted image and match them with the registered ones. Then we recover the distorted object region based on the transformation parameters obtained from the matching result using SIFT, and the watermarked message can be detected. Experimental results demonstrated that our proposed algorithm is very robust to distortions such as JPEG compression, scaling, rotation, shearing, aspect ratio change, and image filtering.
Al MANSUR Katsutoshi SAKATA Dipankar DAS Yoshinori KUNO
Conventional interest point based matching requires computationally expensive patch preprocessing and is not appropriate for recognition of plain objects with negligible detail. This paper presents a method for extracting distinctive interest regions from images that can be used to perform reliable matching between different views of plain objects or scenes. We formulate the correspondence problem in a Naive Bayesian classification framework combined with a simple correlation based matching, which makes our system fast, simple, efficient, and robust. To facilitate the matching using a very small number of interest regions, we also propose a method to reduce the search area inside a test scene. Using this method, it is possible to robustly identify objects among clutter and occlusion while achieving near real-time performance. Our system performs remarkably well on plain objects where some state-of-the-art methods fail. Since our system is particularly suitable for the recognition of plain objects, we refer to it as Simple Plane Object Recognizer (SPOR).
Subjects' episodic memory performance is not simply reflected by eye movements. We use a 'theta phase coding' model of the hippocampus to predict subjects' memory performance from their eye movements. Results demonstrate the ability of the model to predict subjects' memory performance. These studies provide a novel approach to computational modeling in the human-machine interface.
Service robots need to be able to recognize and identify objects located within complex backgrounds. Since no single method may work in every situation, several methods need to be combined and robots have to select the appropriate one automatically. In this paper we propose a scheme to classify situations depending on the characteristics of the object of interest and user demand. We classify situations into four groups and employ different techniques for each. We use Scale-invariant feature transform (SIFT), Kernel Principal Components Analysis (KPCA) in conjunction with Support Vector Machine (SVM) using intensity, color, and Gabor features for five object categories. We show that the use of appropriate features is important for the use of KPCA and SVM based techniques on different kinds of objects. Through experiments we show that by using our categorization scheme a service robot can select an appropriate feature and method, and considerably improve its recognition performance. Yet, recognition is not perfect. Thus, we propose to combine the autonomous method with an interactive method that allows the robot to recognize the user request for a specific object and class when the robot fails to recognize the object. We also propose an interactive way to update the object model that is used to recognize an object upon failure in conjunction with the user's feedback.
Noritsugu EGI Hitoshi AOKI Akira TAKAHASHI
We present a method for the objective quality evaluation of noise-reduced speech in wideband speech communication services, which utilize speech with a wider bandwidth (e.g., 7 kHz) than the usual telephone bandwidth. Experiments indicate that the amount of residual noise and the distortion of speech and noise, which are quality factors, influence the perceived quality degradation of noise-reduced speech. From the results, we observe the principal relationships between these quality factors and perceived speech quality. On the basis of these relationships, we propose a method that quantifies each quality factor in noise-reduced speech by analyzing signals that can be measured and assesses the overall perceived quality of noise-reduced speech using values of these quality factors. To verify the validity of the method, we perform a subjective listening test and compare subjective quality of noise-reduced speech with its estimation. In the test, we use various types of background noise and noise-reduction algorithms. The verification results indicate that the correlation between subjective quality and its objective estimation is sufficiently high regardless of the type of background noise and noise-reduction algorithm.
Hernan AGUIRRE Masahiko SATO Kiyoshi TANAKA
In this paper, we propose δ-similar elimination to improve the search performance of multiobjective evolutionary algorithms in combinatorial optimization problems. This method eliminates similar individuals in objective space to fairly distribute selection among the different regions of the instantaneous Pareto front. We investigate four eliminating methods analyzing their effects using NSGA-II. In addition, we compare the search performance of NSGA-II enhanced by our method and NSGA-II enhanced by controlled elitism.
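The idea of eliminating similar individuals in objective space can be sketched as below. The abstract does not specify the similarity test or the four eliminating methods, so this is one hypothetical variant: two individuals are treated as similar when they differ by at most `delta` in every objective, and the first individual encountered is the one kept. The function name `delta_similar_elimination` and the sample objective vectors are assumptions for illustration only.

```python
def delta_similar_elimination(points, delta):
    """Keep a point only if no already-kept point is within delta in every objective.

    points: list of objective vectors (tuples of floats), in priority order.
    delta:  similarity threshold applied to each objective independently.
    """
    kept = []
    for p in points:
        # p is distinct from q if at least one objective differs by more than delta.
        if all(any(abs(a - b) > delta for a, b in zip(p, q)) for q in kept):
            kept.append(p)
    return kept


front = [(1.0, 9.0), (1.1, 8.9), (5.0, 5.0), (5.05, 5.2), (9.0, 1.0)]
print(delta_similar_elimination(front, 0.3))
# [(1.0, 9.0), (5.0, 5.0), (9.0, 1.0)]
```

In this sketch, crowded regions of the front are thinned to one representative each, so a subsequent selection step would draw parents more evenly from the different regions.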
Lina Tomokazu TAKAHASHI Ichiro IDE Hiroshi MURASE
We propose the construction of an appearance manifold with embedded view-dependent covariance matrices to recognize 3D objects that are affected by geometric distortions and quality degradation. The appearance manifold is used to capture the pose variability, while the covariance matrix is used to learn the distribution of samples to gain noise invariance. However, since the appearance of an object in the captured image differs for every pose, the covariance matrix also differs for every pose. Therefore, it is important to embed view-dependent covariance matrices in the manifold of an object. We propose two models for constructing an appearance manifold with view-dependent covariance matrices, called the View-dependent Covariance matrix by training-Point Interpolation (VCPI) and View-dependent Covariance matrix by Eigenvector Interpolation (VCEI) methods. In the VCPI method, the embedded view-dependent covariance matrix is obtained by interpolating every training point in one pose to the corresponding training point in the consecutive pose. In the VCEI method, the embedded view-dependent covariance matrix is obtained by interpolating only the eigenvectors and eigenvalues, without considering the correspondences of each training image. Because they embed the covariance matrix in the manifold, our view-dependent covariance matrix methods are robust to pose changes and are also noise invariant. Our main goal is to construct a robust and efficient manifold with embedded view-dependent covariance matrices for recognizing objects from images that are affected by various degradation effects.
In this paper we propose an efficient line feature-based 2D object recognition algorithm using a novel entropy correspondence measure (ECM) that encodes the probabilistic similarity between two line feature sets. Since the proposed ECM-based method uses the whole structural information of objects simultaneously for matching, it overcomes the common drawbacks of conventional techniques that are based on feature-to-feature correspondence. Moreover, since ECM has a probabilistic attribute, it shows quite robust performance in noisy environments. To enhance recognition performance and speed, line features are pre-clustered into several groups according to their inclination by an eigen analysis, and then ECM is applied to each corresponding group individually. Experimental results on real images demonstrate that, in noisy environments, the proposed algorithm outperforms conventional algorithms in both accuracy and computational efficiency.
Akira TAKAHASHI Noritsugu EGI Atsuko KURASHIMA
VoIP is one of the key technologies for recent telecommunication services. In addition to the migration from the conventional PSTN to IP networks, mobile networks will follow the PSTN in moving to an IP-based infrastructure. Due to limited radio resources, speech in mobile networks must be compressed to a lower bitrate than in the PSTN. This will lead to a heterogeneous network environment, in which different speech codecs are employed in fixed and mobile networks. Therefore, from the viewpoint of designing and managing the QoE (Quality of Experience) of end-to-end telephony services, establishing a method to evaluate the quality of VoIP in such a heterogeneous network environment is very important. The quality of speech communication services should be discussed in subjective terms. Subjective quality assessment is time-consuming and expensive, however, so objective quality assessment, which estimates subjective quality without carrying out subjective quality experiments, is desirable. To establish an objective method to evaluate the end-to-end quality of speech in a heterogeneous network environment, this paper proposes a method for estimating the end-to-end listening quality based on the quality in each individual segment. This method is very important because conventional technologies such as the E-model, which was standardized as ITU-T Recommendation G.107, cannot accurately estimate overall quality based on segmental qualities. Experimental results show that the proposed method offers better performance in terms of quality estimation than the conventional method.