This letter proposes an image fusion method which adopts a union of multiple directional lapped orthogonal transforms (DirLOTs). DirLOTs are used to generate symmetric orthogonal discrete wavelet transforms and then to construct a union of unitary transforms as a redundant dictionary with a multiple directional property. The multiple DirLOTs can overcome the difficulty that separable wavelets have in representing images which contain slanted textures and edges. We analyse the characteristics of local luminance contrast, and propose a fusion rule based on the interscale relation of wavelet coefficients. Relying on the above, a novel image fusion method is proposed. Experimental results show that the proposed method significantly improves fusion performance over methods based on conventional discrete wavelet transforms.
Kenta KURIHARA Masanori KIKUCHI Shoko IMAIZUMI Sayaka SHIOTA Hitoshi KIYA
In many multimedia applications, image encryption has to be conducted prior to image compression. This paper proposes a JPEG-friendly perceptual encryption method, which can be applied prior to JPEG and Motion JPEG compression. The proposed encryption scheme provides approximately the same compression performance as JPEG compression without any encryption, for both grayscale and color images. It is also shown that the proposed scheme, which consists of four block-based encryption steps, provides a reasonably high level of security. Most conventional perceptual encryption schemes have not been designed for international compression standards, but this paper focuses on the JPEG and Motion JPEG standards, which are among the most widely used image compression standards. In addition, this paper considers an efficient key management scheme, which makes the keys easy to manage when encryption uses multiple keys.
Kazunori KOMATANI Naoki HOTTA Satoshi SATO Mikio NAKANO
Appropriate turn-taking, as well as generating correct responses, is important in spoken dialogue systems. Especially if the dialogue features quick responses, a user utterance is often incorrectly segmented by voice activity detection (VAD) because of short pauses within it. Incorrectly segmented utterances cause problems both in the automatic speech recognition (ASR) results and in turn-taking: i.e., an incorrect VAD result leads to ASR errors and causes the system to start responding though the user is still speaking. We develop a method that performs a posteriori restoration for incorrectly segmented utterances and implement it as a plug-in for the MMDAgent open-source software. A crucial part of the method is to classify whether the restoration is required or not. We cast it as a binary classification problem of detecting originally single utterances from pairs of utterance fragments. Various features are used representing timing, prosody, and ASR result information. Experiments show that the proposed method outperformed a baseline with manually-selected features by 4.8% and 3.9% in cross-domain evaluations with two domains. More detailed analysis revealed that the dominant and domain-independent features were utterance intervals and results from the Gaussian mixture model (GMM).
Isosurface extraction is one of the most popular techniques for visualizing scalar volume data. However, volume data contains infinitely many isosurfaces. Furthermore, a single isosurface might contain many connected components, or contours, with each representing a different object surface. Hence, it is often a tedious and time-consuming manual process to find and extract contours that are interesting to users. This paper describes a novel method for automatically extracting salient contours from volume data. For this purpose, we propose a contour gradient tree (CGT) that contains the information of salient contours and their saliency magnitude. We organize the CGT in a hierarchical way to generate a sequence of contours in saliency order. Our method was applied to various medical datasets. Experimental results show that our method can automatically extract salient contours that represent regions of interest in the data.
Zhaofeng WU Guyu HU Fenglin JIN Yinjin FU Jianxin LUO Tingting ZHANG
The hop-limited adaptive routing (HLAR) mechanism and its enhancement (EHLAR), both tailored for packet-switched non-geostationary (NGEO) satellite networks, are proposed and evaluated. The proposed routing mechanisms exploit both the predictable topology and the inherent multi-path property of NGEO satellite networks to adaptively distribute the traffic via all feasible neighboring satellites. Specifically, both mechanisms assume that a satellite can send packets to their destinations via any feasible neighboring satellite; thus the link via a neighboring satellite to the destination satellite is assigned a probability that is proportional to the effective transmission to the destination satellite over that link. The satellite adjusts the link probability based on packet sending information that is observed locally for the HLAR mechanism or exchanged between neighboring satellites for the EHLAR mechanism. In addition, the paths of the packets are bounded by a maximum hop number, thus avoiding unnecessarily over-detoured packets in the satellite networks. The simulation results corroborate the improved performance of the proposed mechanisms compared with existing mechanisms in the literature.
In recent years, many variants of key point based image descriptors have been designed for image matching, and they have achieved remarkable performance. However, for some images, local features appear to be inapplicable. Since these images usually have many local changes around key points compared with a normal image, we define this special image category as the image with local changes (IL). An IL pair (ILP) refers to an image pair which contains a normal image and its IL. An ILP usually loses local visual similarities between the two images while still holding global visual similarity. When an IL is given as a query image, the purpose of this work is to match the corresponding ILP in a large scale image set. As a solution, we use a compressed HOG feature descriptor to extract global visual similarity. For the nearest neighbor search problem, we propose random projection indexed KD-tree forests (rKDFs) to match ILPs efficiently instead of exhaustive linear search. An rKDF is built with large scale low-dimensional KD-trees. Each KD-tree is built in a random projection indexed subspace and contributes to the final result equally through a voting mechanism. We evaluated our method on a benchmark which contains 35,000 candidate images and 5,000 query images. The results show that our method is efficient for solving local-changes invariant image matching problems.
Maiko SAKAMOTO Hiromi YAMAGUCHI Toshimasa YAMAZAKI Ken-ichi KAMIJO Takahiro YAMANOI
We have proposed a new Bayesian network model (BNM) framework for single-trial-EEG-based Brain-Computer Interface (BCI). The BNM was constructed as follows. In order to discriminate between imagined left-hand and right-hand movements from single-trial EEGs measured during movement imagery tasks, the BNM has the following three steps: (1) independent component analysis (ICA) for each of the single-trial EEGs; (2) equivalent current dipole source localization (ECDL) for projections of each IC on the scalp surface; (3) BNM construction using the ECDL results. The BNMs were composed of nodes and edges which correspond to the brain sites where ECDs are located, and their connections, respectively. The connections were quantified as node activities by conditional probabilities calculated by probabilistic inference in each trial. The BNM-based BCI is compared with the common spatial pattern (CSP) method. For ten healthy subjects, there was no significant difference between the two methods. Our BNM might reflect each subject's strategy for task execution.
Shohei KAMAMURA Hiroshi YAMAMOTO Kouichi GENDA Yuki KOIZUMI Shin'ichi ARAKAWA Masayuki MURATA
This paper proposes fast repair methods that use hierarchical software defined network controllers for recovering from massive failure in a large-scale IP over wavelength-division multiplexing network. The network consists of multiple domains, and a slave controller is deployed in each domain. While each slave controller configures transport paths in its domain, the master controller manages end-to-end paths, which are established across multiple domains. For fast repair of intra-domain paths by the slave controllers, we define the optimization problem of path configuration order and propose a heuristic method, which minimizes the repair time to move from a disrupted state to a suboptimal state. For fast repair of end-to-end paths through multiple domains, we also propose a network abstraction method, which efficiently manages the entire network. Evaluation results suggest that fast repair within a few minutes can be achieved by applying the proposed methods to a repair scenario in which multiple links and nodes fail in a 10,000-node network.
Dong-Hyun LIM Minook KIM Hyung-Min PARK
This letter presents a method for active noise cancelation (ANC) for headphone applications. The method improves the performance of ANC by deriving a flexible independent component analysis (ICA) algorithm in a hybrid structure combining feedforward and feedback configurations with correlation-based wind detection. The effectiveness of the method is demonstrated through simulation.
Gentle AdaBoost is widely used in object detection and pattern recognition due to its efficiency and stability. To focus on instances with small margins, Gentle AdaBoost assigns larger weights to these instances during the training. However, misclassification of small-margin instances can still occur, which will cause the weights of these instances to become larger and larger. Eventually, several large-weight instances might dominate the whole data distribution, encouraging Gentle AdaBoost to choose weak hypotheses that fit only these instances in the late training phase. This phenomenon, known as “classifier distortion”, degrades the generalization error and can easily lead to overfitting since the deviation of all selected weak hypotheses is increased by the late-selected ones. To solve this problem, we propose a new variant which we call “Penalized AdaBoost”. In each iteration, our approach not only penalizes the misclassification of instances with small margins but also restrains the weight increase for instances with minimal margins. Our method performs better than Gentle AdaBoost because it avoids the “classifier distortion” effectively. Experiments show that our method achieves far lower generalization errors and a similar training speed compared with Gentle AdaBoost.
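The weight growth described above can be illustrated with a minimal sketch of the standard Gentle AdaBoost reweighting step, w_i <- w_i * exp(-y_i * f(x_i)), followed by normalisation. The toy data, the fixed output magnitude of 0.5, and the names `update_weights` and `hard` are assumptions made for illustration; the penalty used by Penalized AdaBoost is not specified in the abstract and is not shown here.

```python
import math

def update_weights(weights, labels, outputs):
    """One Gentle AdaBoost reweighting step: w_i <- w_i * exp(-y_i * f(x_i)), then normalise."""
    raw = [w * math.exp(-y * f) for w, y, f in zip(weights, labels, outputs)]
    total = sum(raw)
    return [r / total for r in raw]

labels = [+1, +1, -1, -1, +1]
weights = [1.0 / len(labels)] * len(labels)
hard = 4  # hypothetical instance that every weak hypothesis misclassifies

history = []  # weight share of the hard instance after each round
for _ in range(10):
    # Toy weak-hypothesis outputs (assumed): magnitude 0.5 with the correct sign,
    # except on the hard instance, where the sign is wrong.
    outputs = [0.5 * y for y in labels]
    outputs[hard] = -0.5 * labels[hard]
    weights = update_weights(weights, labels, outputs)
    history.append(weights[hard])
```

In this toy run the share held by the single misclassified instance grows every round until it dominates the distribution, which is the situation the abstract calls classifier distortion.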
Xiao ZHAO Lifeng HE Bin YAO Yuyan CHAO
This paper presents a new connected component labeling algorithm. The proposed algorithm scans image lines every three lines and processes pixels three by three. When processing the current three pixels, we also utilize the information obtained before to reduce the repeated work for checking pixels in the mask. Experimental results demonstrated that our method is more efficient than the fastest conventional labeling algorithm.
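For orientation, the following is a minimal sketch of a conventional two-pass, 8-connected labeling pass with union-find. It is a baseline illustration only, not the three-line scan proposed in the paper; the function name `label_components` and the list-of-rows image format are assumptions.

```python
def label_components(image):
    """Two-pass 8-connected labeling with union-find; image is a list of rows of 0/1."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # parent[k] is the union-find parent of provisional label k

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    # First pass: assign provisional labels and record equivalences.
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            # Already-scanned pixels in the mask: upper-left, up, upper-right, left.
            neighbours = []
            for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny and 0 <= nx < width and labels[ny][nx]:
                    neighbours.append(labels[ny][nx])
            if not neighbours:
                parent.append(len(parent))  # new provisional label, its own root
                labels[y][x] = len(parent) - 1
            else:
                roots = [find(k) for k in neighbours]
                smallest = min(roots)
                for r in roots:
                    parent[r] = smallest  # merge equivalent labels
                labels[y][x] = smallest

    # Second pass: replace provisional labels with consecutive final labels.
    final = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels, len(final)

image = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
labeled, count = label_components(image)
```

Each foreground pixel here checks four mask pixels individually; the abstract's point is that processing several pixels at once and reusing earlier results reduces this repeated checking.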
Qi LIU Wei WANG Dong LIANG Xianpeng WANG
In this paper, a real-valued reweighted l1 norm minimization method based on data reconstruction in monostatic multiple-input multiple-output (MIMO) radar is proposed. Exploiting the special structure of the received data, and through the received data reconstruction approach and unitary transformation technique, a one-dimensional real-valued received data matrix can be obtained for recovering the sparse signal. Then a weight matrix based on the real-valued MUSIC spectrum is designed for reweighting the l1 norm minimization to enhance the sparsity of the solution. Finally, the DOA can be estimated by finding the non-zero rows in the recovered matrix. Compared with traditional l1 norm-based minimization methods, the proposed method provides better angle estimation performance. Simulation results are presented to verify the effectiveness and advantage of the proposed method.
Chitapong WECHTAISONG Kazato IKEDA Hiroaki MORINO Takumi MIYOSHI
Most P2PTV systems select a neighbor peer in an overlay network using RTT or a random method without considering the underlying network. Streaming traffic is shared over a network without localization awareness, which is a serious problem for Internet Service Providers. In this paper, we present a novel scheme to achieve P2PTV traffic localization by inserting delay into P2P streaming packets, so that the length of the inserted delay depends on the AS hop distance between a peer and its neighbor peer. Experiments conducted on a real network show that our proposed scheme can perform efficient traffic localization.
Yuan LIANG Koji IWANO Koichi SHINODA
Most error correction interfaces for speech recognition applications on smartphones require the user to first mark an error region and choose the correct word from a candidate list. We propose a simple multimodal interface to make the process more efficient. We develop Long Context Match (LCM) to get candidates that complement the conventional word confusion network (WCN). Assuming that not only the preceding words but also the succeeding words of the error region are validated by users, we use such contexts to search higher-order n-grams corpora for matching word sequences. For this purpose, we also utilize the Web text data. Furthermore, we propose a combination of LCM and WCN (“LCM + WCN”) to provide users with candidate lists that are more relevant than those yielded by WCN alone. We compare our interface with the WCN-based interface on the Corpus of Spontaneous Japanese (CSJ). Our proposed “LCM + WCN” method improved the 1-best accuracy by 23%, improved the Mean Reciprocal Rank (MRR) by 28%, and our interface reduced the user's load by 12%.
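The two candidate-list metrics reported above, 1-best accuracy and Mean Reciprocal Rank, can be computed as in the sketch below. The function names and the toy candidate lists are assumptions for illustration and are not data from the paper.

```python
def mean_reciprocal_rank(candidate_lists, correct_words):
    """Average of 1/rank of the correct word; an absent word contributes 0."""
    total = 0.0
    for candidates, correct in zip(candidate_lists, correct_words):
        if correct in candidates:
            total += 1.0 / (candidates.index(correct) + 1)
    return total / len(correct_words)

def one_best_accuracy(candidate_lists, correct_words):
    """Fraction of error regions whose top candidate is the correct word."""
    hits = sum(1 for c, w in zip(candidate_lists, correct_words) if c and c[0] == w)
    return hits / len(correct_words)

# Toy candidate lists for three error regions (hypothetical).
lists = [["cat", "cap", "cut"], ["there", "their"], ["two", "too", "to"]]
truth = ["cat", "their", "for"]

mrr = mean_reciprocal_rank(lists, truth)  # (1 + 1/2 + 0) / 3
acc = one_best_accuracy(lists, truth)     # 1 / 3
```

MRR rewards a correct word that appears lower in the list, whereas 1-best accuracy only counts the top candidate, so the two figures can move differently when candidate lists are reordered.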
Shoichiro KAWASHIMA Keizo MORITA Mitsuharu NAKAZAWA Kazuaki YAMANE Mitsuhiro OGAI Kuninori KAWABATA Kazuaki TAKAI Yasuhiro FUJII Ryoji YASUDA Wensheng WANG Yukinobu HIKOSAKA Ken'ichi INOUE
An 8-Mbit 0.18-µm CMOS 1T1C ferroelectric RAM (FeRAM) in a planar ferroelectric technology was developed. Even though the cell area of 2.48 µm² is almost equal to that of a 4-Mbit stacked-capacitor FeRAM (STACK FeRAM), 2.32 µm² [1], the chip size of the developed 8-Mbit FeRAM, including extra 2-Mbit parities for the error correction code (ECC), is just 52.37 mm², which is about 30% smaller than twice that of the 4-Mbit STACK FeRAM device, 37.68 mm² × 2 [1]. This excellent characteristic can be attributed to the large cell matrix architectures of the sectional cyclic word line (WL) that was used to increase the column numbers, and to the 1T1C bit-line GND level sensing (BGS) [2][3] circuit design intended to sense bit lines (BL) that have bit cells 1K long and a large capacitance. An access time of 52 ns and a cycle time of 77 ns in RT at a VDD of 1.8 V were achieved.
Hironori TAKIMOTO Tatsuhiko KOKUI Hitoshi YAMAUCHI Mitsuyoshi KISHIHARA Kensuke OKUBO
It is commonly believed that, to improve interaction between humans and electronic devices, it is effective to draw the viewer's attention to a particular object. Augmented reality (AR) applications can call attention to real objects by overlaying highlight effects or visual stimuli (such as arrows) on a physical scene. Sometimes, more subtle effects would be desirable, in which case it would be necessary to smoothly and naturally guide the user's gaze without external stimuli. Here, a novel image modification method is proposed for directing a viewer's gaze to specific regions of interest. The proposed method uses saliency analysis and color modulation to create modified images in which the region of interest is the most salient region in the entire image. The proposed saliency map model that is used during saliency analysis reduces computational costs and improves the naturalness of the image using the LAB color space and simplified normalization. During color modulation, the modulation value of each LAB component is determined by considering the relationship between the LAB components and the saliency value. With the image obtained in this manner, the viewer's attention is attracted to a specific region smoothly and naturally. Gaze measurements as well as subjective experiments were conducted to evaluate the effectiveness of the proposed method. The results show that a viewer's visual attention is indeed attracted toward the specified region without any sense of discomfort or disruption when the proposed method is used.
Point spread function (PSF) estimation plays a paramount role in image deblurring processing, and traditionally it is solved by parameter estimation of a certain preassumed PSF shape model. In real life, the PSF shape is generally arbitrary and complicated, and thus it is assumed in this manuscript that a PSF may be decomposed as a weighted sum of a certain number of Gaussian kernels, with weight coefficients estimated in an alternating manner, and an l1 norm-based total variation (TVl1) algorithm is adopted to recover the latent image. Experiments show that the proposed method can achieve satisfactory performance on synthetic and realistic blurred images.
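The PSF model described above, a weighted sum of Gaussian kernels, can be sketched as follows. The choice of centred isotropic Gaussians that differ only in width, the kernel size, and the names `gaussian_kernel` and `mixture_psf` are assumptions for illustration; the abstract does not specify the kernel family in that detail, and the alternating weight estimation is not shown.

```python
import math

def gaussian_kernel(size, sigma):
    """Centred isotropic Gaussian on a size x size grid, normalised to sum to 1."""
    half = size // 2
    kernel = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

def mixture_psf(size, sigmas, weights):
    """PSF modelled as a weighted sum of Gaussian kernels, renormalised to sum to 1."""
    kernels = [gaussian_kernel(size, s) for s in sigmas]
    weight_sum = sum(weights)
    return [[sum(w * k[y][x] for w, k in zip(weights, kernels)) / weight_sum
             for x in range(size)] for y in range(size)]

# Hypothetical widths and weights, chosen only to show the construction.
psf = mixture_psf(7, sigmas=[0.8, 1.5, 3.0], weights=[0.5, 0.3, 0.2])
```

With the kernels fixed, the PSF is linear in the weights, which is what makes estimating the weights separately from the latent image tractable in an alternating scheme.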
When there are multiple component predictors, it is promising to integrate them into one predictor for advanced reasoning. If each component predictor is given as a stochastic model in the form of a probability distribution, an exponential mixture of the component probability distributions provides a good way to integrate them. However, the weight parameters used in the exponential mixture model are difficult to estimate if there are no training samples for performance evaluation. As a suboptimal way to solve this problem, the weight parameters may be estimated so that the exponential mixture model is a balance point, defined as an equilibrium point with respect to the distance from/to all component probability distributions. In this paper, we propose a weight parameter estimation method that represents this concept using a symmetric Kullback-Leibler divergence and generalize this method.
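As a rough illustration, the sketch below builds an exponential mixture p(x) ∝ ∏ p_i(x)^{w_i} of two discrete distributions and searches for a weight at which the symmetric Kullback-Leibler divergence to both components is equal. Reading the balance point as this equal-distance condition is an assumption made here for two components only, and it is not necessarily the estimator defined in the paper; the bisection search and all names are likewise illustrative.

```python
import math

def exponential_mixture(distributions, weights):
    """Normalised weighted geometric mean: p(x) proportional to prod_i p_i(x) ** w_i."""
    size = len(distributions[0])
    unnormalised = [math.exp(sum(w * math.log(p[k]) for p, w in zip(distributions, weights)))
                    for k in range(size)]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]

def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p), which equals sum_x (p - q) * log(p / q)."""
    return sum((pk - qk) * math.log(pk / qk) for pk, qk in zip(p, q))

# Two hypothetical component predictors over three outcomes (no zero entries).
p1 = [0.7, 0.2, 0.1]
p2 = [0.1, 0.3, 0.6]

def gap(w):
    """Distance to p1 minus distance to p2 for weights (w, 1 - w)."""
    mix = exponential_mixture([p1, p2], [w, 1.0 - w])
    return symmetric_kl(mix, p1) - symmetric_kl(mix, p2)

# gap(0) > 0 because the mixture equals p2; gap(1) < 0 because it equals p1.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2.0
    if gap(mid) > 0:
        lo = mid
    else:
        hi = mid

w_balance = (lo + hi) / 2.0
balanced = exponential_mixture([p1, p2], [w_balance, 1.0 - w_balance])
```

No training samples are used in the search: the weight is determined only by the component distributions themselves, which is the setting the abstract addresses.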
Wei LIU Rui HU Ryoichi SHINKUMA Tatsuro TAKAHASHI
Mobile virtual network operators (MVNOs) are mobile operators without their own infrastructure or government-issued spectrum licenses. They purchase spectrum resources from primary mobile network operators (MNOs) to provide communication services under their own brands. MVNOs are expected to play an important role in mobile network markets, as they will increase competition in retail markets and help to meet the demand of niche markets. However, with the rapidly increasing demand for mobile data traffic, efficient utilization of the limited spectrum resources owned by MVNOs has become an important issue. We propose here a resource sharing mechanism between MVNOs against the background of network functions virtualization (NFV). The proposed mechanism enables MVNOs to improve their quality of service (QoS) by sharing spectrum resources with each other. A decision strategy based on the Nash bargaining solution is also devised to ensure the fairness of resource sharing. Extensive numerical evaluation results validate the effectiveness of the proposed models and mechanisms.
This paper analyzes the impact of directional antennas in improving the transmission capacity, defined as the maximum allowable spatial node density of successful transmissions multiplied by their data rate with a given outage constraint, in wireless networks. We consider the case where the gain $G_m$ for the mainlobe of beamwidth can scale at an arbitrarily large rate. Under the beamwidth scaling model, the transmission capacity is analyzed for all path-loss attenuation regimes for the following two network configurations. In dense networks, in which the spatial node density increases with the antenna gain $G_m$, the transmission capacity scales as $G_m^{4/\alpha}$, where $\alpha$ denotes the path-loss exponent. On the other hand, in extended networks of fixed node density, the transmission capacity scales logarithmically in $G_m$. For comparison, we also show an ideal antenna model where there is no sidelobe beam. In addition, computer simulations are performed, which show trends consistent with our analytical results. Our analysis sheds light on a new understanding of the fundamental limit of outage-constrained ad hoc networks operating in the directional mode.
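To make the two scaling regimes concrete, the following worked comparison reads the dense-network exponent as $4/\alpha$ and writes the transmission capacity as $C$; the symbol $C$, the proportionality constants being ignored, and the example value $\alpha = 4$ are assumptions for illustration only.

```latex
% Dense networks: capacity grows polynomially in the mainlobe gain.
C_{\mathrm{dense}}(G_m) \propto G_m^{4/\alpha}
\quad\Longrightarrow\quad
\frac{C_{\mathrm{dense}}(2G_m)}{C_{\mathrm{dense}}(G_m)} = 2^{4/\alpha}
\quad (= 2 \ \text{for } \alpha = 4).

% Extended networks: capacity grows only logarithmically in the gain.
C_{\mathrm{ext}}(G_m) \propto \log G_m
\quad\Longrightarrow\quad
C_{\mathrm{ext}}(2G_m) - C_{\mathrm{ext}}(G_m) \propto \log 2 .
```

Under this reading, doubling the gain multiplies the dense-network capacity by a fixed factor but adds only a constant to the extended-network capacity.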