Yulong XU Yang LI Jiabao WANG Zhuang MIAO Hang LI Yafei ZHANG
The feature extractor plays an important role in visual tracking, but most state-of-the-art methods employ the same feature representation in all scenes. Given the diversity of scenes, a tracker should choose different features according to the video. In this work, we propose a novel feature adaptive correlation tracker, which decomposes the tracking task into translation and scale estimation. According to the luminance of the target, our approach automatically selects either hierarchical convolutional features or histogram of oriented gradient features for translation estimation in varied scenarios. Furthermore, we employ a discriminative correlation filter to handle scale variations. Extensive experiments are performed on a challenging large-scale benchmark dataset, and the results show that the proposed algorithm outperforms state-of-the-art trackers in accuracy and robustness.
David WONG Daisuke DEGUCHI Ichiro IDE Hiroshi MURASE
Advances in intelligent vehicle systems have led to modern automobiles being able to aid drivers with tasks such as lane following and automatic braking. Such automated driving tasks increasingly require reliable ego-localization. Although there is a large number of sensors that can be employed for this purpose, the use of a single camera still remains one of the most appealing, but also one of the most challenging. GPS localization in urban environments may not be reliable enough for automated driving systems, and various combinations of range sensors and inertial navigation systems are often too complex and expensive for a consumer setup. Therefore accurate localization with a single camera is a desirable goal. In this paper we propose a method for vehicle localization using images captured from a single vehicle-mounted camera and a pre-constructed database. Image feature points are extracted, but the calculation of camera poses is not required — instead we make use of the feature points' scale. For image feature-based localization methods, matching of many features against candidate database images is time consuming, and database sizes can become large. Therefore, here we propose a method that constructs a database with pre-matched features of known good scale stability. This limits the number of unused and incorrectly matched features, and allows recording of the database scales into “tracklets”. These “Feature scale tracklets” are used for fast image match voting based on scale comparison with corresponding query image features. This process reduces the number of image-to-image matching iterations that need to be performed while improving the localization stability. We also present an analysis of the system performance using a dataset with high accuracy ground truth. We demonstrate robust vehicle positioning even in challenging lane change and real traffic situations.
In this paper, a hardware-efficient local extrema detection (LED) method for scale-space extrema detection in the SIFT algorithm is proposed. By reformulating the reuse of intermediate results when taking the local maximum and minimum, the necessary operations in LED are reduced without degrading the detection accuracy. When implemented in an FPGA, the proposed method requires 25% to 35% fewer logic resources than the conventional method, at the cost of a slight increase in latency.
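For orientation, the conventional scale-space extrema test that SIFT performs can be sketched as follows. This is only the textbook 26-neighbor comparison, not the reformulated method of the abstract; the function name and the nested-list layout `dog[scale][row][col]` are assumptions made for illustration.

```python
def is_local_extremum(dog, s, y, x):
    """Conventional 26-neighbor test on a difference-of-Gaussians stack.

    `dog` is assumed to be indexed as dog[scale][row][col], and (s, y, x)
    must not lie on the border of the stack.
    """
    center = dog[s][y][x]
    # Collect the 26 neighbors in the surrounding 3x3x3 cube.
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    # Strict maximum or strict minimum over all neighbors.
    return all(center > v for v in neighbors) or all(center < v for v in neighbors)
```

Each candidate needs up to 26 comparisons for the maximum and 26 for the minimum in this naive form, which is the kind of cost that reusing intermediate results is meant to reduce.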
In this letter, we propose a method for obtaining a clear and natural output image by tuning the illumination component of an input image. The proposed method is based on the retinex process and is suitable for improving the quality of images with insufficient illumination.
Saho YAGYU Akie SAKIYAMA Yuichi TANAKA
We propose an edge-preserving multiscale image decomposition method using filters for non-equispaced signals. It is inspired by the domain transform, which is a high-speed edge-preserving smoothing method, and it can be used in many image processing applications. One of the disadvantages of the domain transform is sensitivity to noise. Even though the proposed method is based on non-equispaced filters similar to the domain transform, it is robust to noise since it employs a multiscale decomposition. It uses the Laplacian pyramid scheme to decompose an input signal into the piecewise-smooth components and detail components. We design the filters by using an optimization based on edge-preserving smoothing with a conversion of signal distances and filters taking into account the distances between signal intervals. In addition, we also propose construction methods of filters for non-equispaced signals by using arbitrary continuous filters or graph spectral filters in order that various filters can be accommodated by the proposed method. As expected, we find that, similar to state-of-the-art edge-preserving smoothing techniques, including the domain transform, our approach can be used in many applications. We evaluated its effectiveness in edge-preserving smoothing of noise-free and noisy images, detail enhancement, pencil drawing, and stylization.
Kouichi GENDA Hiroshi YAMAMOTO Shohei KAMAMURA
When a massive network disruption occurs, repair of the damaged network takes time, and the recovery process involves multiple stages. We propose a fast and flow-controlled multi-stage network recovery method for determining the pareto-optimal recovery order of failed physical components reflecting the balance requirement between maximizing the total amount of traffic on all logical paths, called total network flow, and providing adequate logical path flows. The pareto-optimal problem is formulated by mixed integer linear programming (MILP). A heuristic algorithm, called the grouped-stage recovery (GSR), is also introduced to solve the problem when the problem formulated by MILP is computationally intractable in a large-scale failure. The effectiveness of the proposed method was numerically evaluated. The results show that the pareto-optimal recovery order can be determined from the balance between total network flow and adequate logical path flows, the allocated minimum bandwidth of the logical path can be drastically improved while maximizing total network flow, and the proposed method with GSR is applicable to large-scale failures because a nearly optimal recovery order with less than 10% difference rate can be determined within practical computation time.
Jun SONODA Keimei KAINO Motoyuki SATO
The finite-difference time-domain (FDTD) method has been widely used in recent years to analyze the propagation and scattering of electromagnetic waves. Because the FDTD method has second-order accuracy in space, its numerical dispersion error arises from the truncated higher-order terms of the Taylor expansion. This error increases with the propagation distance in large-scale analyses. The numerical dispersion error is expressed by a dispersion relation equation, but this nonlinear equation has many parameters and is difficult to solve. Consequently, a simple formula that can substitute for the dispersion relation is needed. In this study, we have obtained a simple formula for the numerical dispersion error of the 2-D and 3-D FDTD methods in free-space propagation.
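For reference, the dispersion relation of the standard second-order (Yee) FDTD scheme in 3-D free space is commonly written as below. It is the textbook relation, not the simplified formula derived in the paper. The symbols are assumptions introduced here: c is the speed of light, Δt the time step, Δx, Δy, Δz the cell sizes, ω the angular frequency, and k̃ the numerical wavenumber components.

```latex
\left[\frac{1}{c\,\Delta t}\,\sin\!\left(\frac{\omega\,\Delta t}{2}\right)\right]^{2}
= \left[\frac{1}{\Delta x}\,\sin\!\left(\frac{\tilde{k}_x\,\Delta x}{2}\right)\right]^{2}
+ \left[\frac{1}{\Delta y}\,\sin\!\left(\frac{\tilde{k}_y\,\Delta y}{2}\right)\right]^{2}
+ \left[\frac{1}{\Delta z}\,\sin\!\left(\frac{\tilde{k}_z\,\Delta z}{2}\right)\right]^{2}
```

The 2-D case follows by dropping the z term. Solving for the numerical wavenumber at a given propagation angle generally requires iteration, which is why a closed-form approximation is attractive.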
Jiatian PI Keli HU Yuzhang GU Lei QU Fengrong LI Xiaolin ZHANG Yunlong ZHAN
Visual tracking has been studied for several decades but continues to draw significant attention because of its critical role in many applications. Recent years have seen greater interest in the use of correlation filters in visual tracking systems, owing to their extremely compelling results in different competitions and benchmarks. However, there is still a need to improve the overall tracking capability to counter various tracking issues, including large scale variation, occlusion, and deformation. This paper presents an appealing tracker with robust scale estimation, which can handle the problem of fixed template size in Kernelized Correlation Filter (KCF) tracker with no significant decrease in the speed. We apply the discriminative correlation filter for scale estimation as an independent part after finding the optimal translation based on the KCF tracker. Compared to an exhaustive scale space search scheme, our approach provides improved performance while being computationally efficient. In order to reveal the effectiveness of our approach, we use benchmark sequences annotated with 11 attributes to evaluate how well the tracker handles different attributes. Numerous experiments demonstrate that the proposed algorithm performs favorably against several state-of-the-art algorithms. Appealing results both in accuracy and robustness are also achieved on all 51 benchmark sequences, which proves the efficiency of our tracker.
Most unsupervised video segmentation algorithms have difficulty extracting objects in dynamic real-world scenes with large displacements, because the foreground hypothesis is often initialized with no explicit mutual constraint on top-down spatio-temporal coherency, even though such a constraint may be imposed on the segmentation objective. To handle such situations, we propose a multiscale saliency flow (MSF) model that jointly learns both foreground and background features of multiscale salient evidence, hence allowing temporally coherent top-down information in one frame to be propagated throughout the remaining frames. In particular, the top-down evidence is detected by combining saliency signatures within a certain range of higher scales of approximation coefficients in the wavelet domain. Saliency flow is then estimated by Gaussian kernel correlation of non-maximum-suppressed multiscale evidence, which is characterized by HOG descriptors in a high-dimensional feature space. We build the proposed MSF model in accordance with the primary object hypothesis, which jointly integrates temporal consistency constraints on the saliency maps estimated at multiple scales into the objective. We demonstrate the effectiveness of the proposed multiscale saliency flow for segmenting dynamic real-world scenes with large displacements caused by uniform sampling of video sequences.
Network virtualization (NV) provides a promising solution to overcome the resistance of the current Internet to architectural change, and virtual network embedding (VNE) has been recognized as a core component of NV. In this paper, current advances in models, methods, and technologies for embedding a virtual network into the substrate network are summarized, and future research trends are outlined. This survey differs from earlier ones mainly in that it focuses on simplifying the VNE problem on large networks and covers more recent publications in this field. In addition, the suggestions for future investigation concern some new terms of the VNE optimization.
Masafumi MAKINO Tatsuo TSUJI Ken HIGUCHI
In this paper, we present a new encoding/decoding method for dynamic multidimensional datasets and its implementation scheme. Our method encodes an n-dimensional tuple into a pair of scalar values even if n is sufficiently large. The method also encodes and decodes tuples using only shift and AND/OR register instructions. One of the most serious problems in multidimensional array based tuple encoding is that the size of an encoded result may often exceed the machine word size for large-scale tuple sets. This problem is efficiently resolved in our scheme. We confirmed the advantages of our scheme by analytical and experimental evaluations. The experimental evaluations were conducted to compare our constructed prototype system with two other systems: (1) a system based on a similar encoding scheme called history-offset encoding, and (2) the PostgreSQL RDBMS. In most cases, both the storage and retrieval costs of our system were significantly better than those of the other systems.
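To illustrate what encoding a tuple with only shift and AND/OR operations can look like, here is a minimal sketch of fixed-width bit-field packing. It is not the paper's scheme (which produces a pair of scalars and handles dynamic datasets); the function names and the `widths` parameter are hypothetical.

```python
def encode(values, widths):
    """Pack non-negative integers into one integer using shift and OR.

    `widths` gives the (assumed, fixed) number of bits reserved per dimension.
    """
    code = 0
    for value, width in zip(values, widths):
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its bit field")
        code = (code << width) | value
    return code


def decode(code, widths):
    """Recover the tuple using AND masks and right shifts."""
    values = []
    for width in reversed(widths):
        values.append(code & ((1 << width) - 1))
        code >>= width
    return tuple(reversed(values))
```

With this naive layout the encoded size is the sum of all field widths, so for many dimensions it quickly exceeds a 64-bit machine word. That is the overflow problem the abstract says its scheme resolves.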
Sahel SAHHAF Wouter TAVERNIER Didier COLLE Mario PICKAVET Piet DEMEESTER
The growth of the size of the routing tables limits the scalability of the conventional IP routing. As scalable routing schemes for large-scale networks are highly demanded, this paper proposes and evaluates an efficient geometric routing scheme and related low-cost node design applicable to large-scale networks. The approach guarantees that greedy forwarding on derived coordinates will result in successful packet delivery to every destination in the network by relying on coordinates deduced from a spanning tree of the network. The efficiency of the proposed scheme is measured in terms of routing quality (stretch) and size of the coordinates. The cost of the proposed router is quantified in terms of area complexity of the hardware design and all the evaluations involve comparison with a state-of-the-art approach with virtual coordinates in the hyperbolic plane. Extensive simulations assess the proposal in large topologies consisting of up to 100K nodes. Experiments show that the scheme has stretch properties comparable to geometric routing in the hyperbolic plane, while enabling a more efficient hardware design, and scaling considerably better in terms of storage requirements for coordinate representation. These attractive properties make the scheme promising for routing in large networks.
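The delivery guarantee of greedy forwarding on tree-derived coordinates can be sketched as follows. This is an illustrative assumption, not the paper's coordinate design: each node's coordinate is taken to be its path of child indices from the spanning-tree root, and the distance is the hop count in the tree. All names are hypothetical.

```python
def tree_distance(a, b):
    """Hop count between two nodes in the tree, from their root paths."""
    common = 0
    for u, v in zip(a, b):
        if u != v:
            break
        common += 1
    return len(a) + len(b) - 2 * common


def greedy_route(neighbors, coord, src, dst):
    """Forward greedily to the neighbor closest to dst in tree distance.

    `neighbors` maps a node to its adjacent nodes in the full graph
    (tree links plus any extra links); `coord` maps a node to its root path.
    """
    path = [src]
    current = src
    while current != dst:
        nxt = min(neighbors[current],
                  key=lambda n: tree_distance(coord[n], coord[dst]))
        if tree_distance(coord[nxt], coord[dst]) >= tree_distance(coord[current], coord[dst]):
            raise RuntimeError("greedy forwarding made no progress")
        path.append(nxt)
        current = nxt
    return path
```

Because every node has a tree neighbor one hop closer to the destination, the progress check never fails as long as all spanning-tree links are present in `neighbors`. Extra non-tree links can only shorten the route.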
This work presents an approximate global optimization method for image halftoning that fuses multi-scale information of the tree model. We employ a Gaussian mixture model and a hidden Markov tree to characterize the intra-scale clustering and inter-scale persistence properties of the detail coefficients, respectively. A multiscale perceived error metric model and the theory of scale-related perceived error metrics are used to fuse the statistical distributions of the error metric for intra-scale clustering and cross-scale persistence. An energy function is then generated, and the halftone image is obtained through energy minimization via graph cuts. In experiments, we demonstrate the superior performance of the new algorithm in comparison with several algorithms using quantitative evaluation.
Yusuke SAKUMOTO Masaki AIDA Hideyuki SHIMONISHI
In this paper, we propose a novel Autonomous Decentralized Control (ADC) scheme for indirectly controlling a system performance variable of large-scale and wide-area networks. In a large-scale and wide-area network, since it is impractical for any one node to gather full information of the entire network, network control must be realized by inter-node collaboration using information local to each node. Several critical network problems (e.g., resource allocation) are often formulated by a system performance variable that is an amount to quantify system state. We solve such problems by designing an autonomous node action that indirectly controls, via the Markov Chain Monte Carlo method, the probability distribution of a system performance variable by using only local information. Analyses based on statistical mechanics confirm the effectiveness of the proposed node action. Moreover, the proposal is used to implement traffic-aware virtual machine placement control with load balancing in a data center network. Simulations confirm that it can control the system performance variable and is robust against system fluctuations. A comparison against a centralized control scheme verifies the superiority of the proposal.
Norifumi KAWABATA Masaru MIYAO
Many previous studies on image quality assessment of 3D still images or video clips have been conducted. In particular, it is important to know the region in which assessors are interested or on which they focus in images or video clips, as represented by the ROI (Region of Interest). For multi-view 3D images, it is obvious that there are a number of viewpoints; however, it is not clear whether assessors focus on objects or background regions. It is also not clear what assessors focus on depending on whether the background region is colored or gray scale. Furthermore, while case studies on coded degradation in 2D or binocular stereoscopic videos have been conducted, no such case studies on multi-view 3D videos exist, and therefore no results are available for coded degradation according to the object or background region in multi-view 3D images. In addition, it has not been shown whether a gray-scale background affects the assessors' gaze points and the subjective image quality. In this study, we conducted subjective evaluation experiments in which the background, the object, or both in 3D CG images were degraded by JPEG coding, using an eight-viewpoint parallax barrier method. We then analyzed the results statistically and classified the evaluation scores using an SVM.
Young-Hoon KIM Jae-Hyun LEE Jung Yong LEE Seong-Cheol KIM
This paper deals with the small-scale fading distribution for UWB channels in the absence and presence of human bodies in indoor line-of-sight (LOS) environments and performance analysis of UWB systems considering the small-scale fading distribution. To obtain small-scale fading statistics, the channel measurements are performed in five representative environments that have different structure and size while locating the receiver (Rx) antenna on 49 (7×7 grid) local points with a fixed transmitter (Tx) antenna in each environment. The measured channel data are processed by a vector network analyzer and the target frequency bands range from 3 to 4.6GHz. From the measured data, we find the best fitted channel model among several typical theoretical distribution models such as Lognormal, Nakagami, and Weibull distributions, showing good agreement with the empirical channel data. We analyze the amplitude variation of the small-scale fading distribution in the absence and presence of human bodies. The results show that the small-scale fading statistics are best described by Weibull distribution and the two parameters of the distribution that determine the shape and the scale of the distribution depend on whether or not human bodies exist. We modeled and analyzed two parameters at different excess delays for all environments. Based on the measured small-scale fading distribution, this paper deals with the performance of UWB system using Rake receivers and also compares the performance with the existing channel model. The results suggest that the small-scale fading distribution in the absence and the presence of human bodies in indoor LOS environments should be considered when assessing the performance of UWB systems.
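For reference, the two-parameter Weibull density that the abstract refers to is usually written as below. The symbol names are assumptions introduced here: k is the shape parameter, λ the scale parameter, and x the fading amplitude.

```latex
f(x;\,k,\lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}
\exp\!\left[-\left(\frac{x}{\lambda}\right)^{k}\right], \qquad x \ge 0,\; k > 0,\; \lambda > 0
```

In these terms, the abstract's finding is that the fitted values of k and λ differ depending on whether human bodies are present.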
We present a new framework for embedding holographic halftone watermarking data into images by fusion of scale-related wavelet coefficients. The halftone watermarking image is obtained using the error-diffusion method and converted into a Fresnel hologram, which is considered to be the initial password. After encryption, a watermarking image scrambled by the Arnold transform is embedded into the host image during the halftoning process. We characterize the multi-scale representation of the original image using the discrete wavelet transform. The boundary information of the target image is fused by correlation of wavelet coefficients across wavelet transform layers to increase the pixel resolution scale. We apply the inter-scale fusion method to obtain the fine-scale fusion coefficients, which take into account both the detail and the approximation information of the image. Using the proposed method, the watermarking information can be embedded into the host image so that it can be recovered after the halftoning operation. The experimental results show that the proposed approach provides better security and robustness against JPEG compression and different attacks than previous alternatives.
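The Arnold transform used for scrambling can be sketched as follows. This uses the common form of the Arnold cat map on a square N×N image, (x, y) → ((x + y) mod N, (x + 2y) mod N); the exact variant and iteration count used in the paper are not stated, so both are assumptions here, as are the function names.

```python
def arnold_scramble(image, iterations=1):
    """Scramble a square image (list of rows) with the Arnold cat map."""
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("image must be square")
    for _ in range(iterations):
        scrambled = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Move pixel (x, y) to its mapped position.
                scrambled[(x + y) % n][(x + 2 * y) % n] = image[x][y]
        image = scrambled
    return image


def arnold_unscramble(image, iterations=1):
    """Invert arnold_scramble by reading each pixel back from its mapped position."""
    n = len(image)
    for _ in range(iterations):
        restored = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                restored[x][y] = image[(x + y) % n][(x + 2 * y) % n]
        image = restored
    return image
```

The map is a permutation of pixel positions because its matrix has determinant 1, so the scrambling loses no information. In practice the iteration count can serve as part of the key.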
Inseong HWANG Seungwoo JEON Beobkeun CHO Yoonsik CHOE
This paper proposes a novel image classification scheme for cloth pattern recognition. A rotation and scale invariant delta-HOG (DHOG)-based descriptor and an entire recognition process using random ferns with this descriptor are proposed so that recognition is independent of pose and scale changes. These methods consider the maximum orientation and various radii of a circular patch window for fast and efficient classification even when cloth patches are rotated and the scale is changed. The proposed method exhibits good performance in cloth pattern recognition experiments: it found a greater number of similar cloth patches than dense-SIFT in 20 out of a total of 36 query tests. In addition, the proposed method is much faster than dense-SIFT in both training and testing; its time consumption is decreased by 57.7% in training and 41.4% in testing. The proposed method, therefore, is expected to contribute to real-time cloth searching service applications that update vast numbers of cloth images posted on the Internet.
Marcus BARKOWSKY Enrico MASALA Glenn VAN WALLENDAEL Kjell BRUNNSTRÖM Nicolas STAELENS Patrick LE CALLET
The current development of video quality assessment algorithms suffers from the lack of available video sequences for training, verification and validation to determine and enhance the algorithm's application scope. The Joint Effort Group of the Video Quality Experts Group (VQEG-JEG) is currently driving efforts towards the creation of large scale, reproducible, and easy to use databases. These databases will contain bitstreams of recent video encoders (H.264, H.265), packet loss impairment patterns and impaired bitstreams, pre-parsed bitstream information into files in XML syntax, and well-known objective video quality measurement outputs. The database is continuously updated and enlarged using reproducible processing chains. Currently, more than 70,000 sequences are available for statistical analysis of video quality measurement algorithms. New research questions are posed as the database is designed to verify and validate models on a very large scale, testing and validating various scopes of applications, while subjective assessment has to be limited to a comparably small subset of the database. Special focus is given on the principles guiding the database development, and some results are given to illustrate the practical usefulness of such a database with respect to the detailed new research questions.
Roya E. REZAGAH Gia Khanh TRAN Kei SAKAGUCHI Kiyomichi ARAKI Satoshi KONISHI
In conventional wireless cellular networks, cell coverage is static and fixed, and each user equipment (UE) is connected to one or a few local base stations (BS). However, the distribution of users in the network area commonly fluctuates during the day. When users are concentrated in some areas, conventional networks leave network resources idle in sparse areas. To address this issue, we propose a novel approach for cooperative cluster formation that dynamically transfers idle network resources from sparse cells to crowded cells or hotspots. In our proposed scheme, BS coverage is directed to hotspots by dynamically changing the antennas' beam angles and forming large optimal cooperative clusters around hotspots. In this study, a cluster is a group of BSs that cooperatively perform joint transmission (JT) to several UEs. A mathematical framework for calculating the system rate of a cooperative cluster is developed. Next, the set of BSs for each cluster and the antennas' beam angles of each BS are optimized so that the system rate of the network is maximized. The trend of performance variation versus cluster size is studied and its limitations are determined. Numerical results using 3GPP specifications show that the proposed scheme attains several times higher capacity than conventional systems.