In this letter, an effective low bit-rate image restoration method is proposed, in which image denoising and subspace regression learning are combined. The proposed framework has two parts: image main-structure estimation by classical NLM denoising and texture-component prediction by subspace joint regression learning. A local regression function is learned from denoised patches to original patches in each subspace, where the corresponding compressed image patches are employed to generate anchoring points by a dictionary learning approach. Moreover, we extend Extreme Support Vector Regression (ESVR) to multi-variable nonlinear regression to obtain more robust results. Experimental results demonstrate that the proposed method achieves favorable performance compared with other leading methods.
Xing CHEN Tianshuang QIU Cheng LIU Jitong MA
This paper mainly discusses the time-difference-of-arrival (TDOA) estimation problem for digital modulation signals in environments with impulsive noise and cochannel interference. Since conventional TDOA estimation algorithms based on second-order cyclic statistics degrade severely under impulsive noise, and TDOA estimation algorithms based on correntropy fail under cochannel interference, a novel signal-selective algorithm based on the generalized cyclic correntropy is proposed, which can suppress both impulsive noise and cochannel interference. Theoretical derivation and simulation results demonstrate the effectiveness and robustness of the proposed algorithm.
Yoshinao MIZUGAKI Hiroshi SHIMADA Ayumi HIRANO-IWATA Fumihiko HIROSE
We numerically simulated the electrical properties, i.e., the resistance and Coulomb blockade threshold, of randomly placed conductive nanoparticles. In the simulation, tunnel junctions were assumed to be formed at neighboring particle-particle and particle-electrode connections. On a triangular grid of 100×100 sites, three electrodes, the drain, source, and gate, were defined. After random placement of conductive particles, the connection between the drain and source electrodes was evaluated while keeping the gate electrode disconnected. The resistance was obtained using a SPICE-like simulator, whereas the Coulomb blockade threshold was determined from the current-voltage characteristics simulated using a Monte Carlo simulator. A strong linear correlation between the resistance and the threshold voltage was confirmed, which agreed with results for uniform one-dimensional arrays.
Xiaobo ZHANG Wenbo XU Yupeng CUI Jiaru LIN
In compressed sensing, most previous research has studied the recovery performance of a sparse signal x based on the acquired model y=Φx+n, where n denotes the noise vector. There are also related studies for the general perturbation environment, i.e., y=(Φ+E)x+n, where E is the measurement perturbation. IHT and HTP are classical algorithms for sparse signal reconstruction in compressed sensing. Under general perturbations, this paper derives the required sufficient conditions and the error bounds of the IHT and HTP algorithms.
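A minimal numpy sketch of the plain IHT iteration referenced above (the problem sizes, the orthonormal-row measurement matrix, and the unit step size are illustrative assumptions, not the paper's perturbed-model analysis):

```python
import numpy as np

def iht(y, Phi, k, iters=100):
    """Iterative Hard Thresholding: gradient step, then keep the k largest entries."""
    x = np.zeros(Phi.shape[1])
    for _ in range(iters):
        x = x + Phi.T @ (y - Phi @ x)       # gradient step on (1/2)||y - Phi x||^2
        small = np.argsort(np.abs(x))[:-k]  # all but the k largest magnitudes
        x[small] = 0.0                      # project onto the k-sparse set
    return x

# Toy example: recover a 3-sparse signal from noisy, slightly perturbed measurements.
rng = np.random.default_rng(0)
n, m, k = 64, 32, 3
Phi = np.linalg.qr(rng.standard_normal((n, m)))[0].T  # orthonormal rows: step size 1 is safe
E = 1e-3 * rng.standard_normal((m, n))                # small measurement perturbation
x_true = np.zeros(n)
x_true[[3, 17, 40]] = [2.0, -1.5, 1.0]
y = (Phi + E) @ x_true + 1e-3 * rng.standard_normal(m)
x_hat = iht(y, Phi, k)
print(np.linalg.norm(x_hat - x_true))
```

The residual error scales with the noise and perturbation levels, which is the kind of error bound the abstract refers to.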
Mayu OTANI Atsushi NISHIDA Yuta NAKASHIMA Tomokazu SATO Naokazu YOKOYA
Finding important regions is essential for applications such as content-aware video compression and video retargeting, which automatically crop a region in a video for small screens. Since people are one of the main subjects when taking a video, some methods for finding important regions use a visual attention model based on face/pedestrian detection to incorporate the knowledge that people are important. However, such methods usually do not distinguish important people from passers-by and bystanders, which results in false positives. In this paper, we propose a deep neural network (DNN)-based method that classifies a person as important or unimportant, given a video containing multiple people in a single frame and captured with a hand-held camera. Intuitively, important/unimportant labels are highly correlated when the corresponding people's spatial motions are similar. Based on this assumption, we propose to boost the performance of our important/unimportant classification by using conditional random fields (CRFs) built upon the DNN, which can be trained in an end-to-end manner. Our experimental results show that our method successfully classifies important people and that the use of a DNN with CRFs improves the accuracy.
Yusuke YAGI Keita TAKAHASHI Toshiaki FUJII Toshiki SONODA Hajime NAGAHARA
A light field, which is often understood as a set of dense multi-view images, has been utilized in various 2D/3D applications. Efficient light field acquisition using a coded aperture camera is the target problem considered in this paper. Specifically, the entire light field, which consists of many images, should be reconstructed from only a few images that are captured through different aperture patterns. In previous work, this problem has often been discussed in the context of compressed sensing (CS), where sparse representations on a pre-trained dictionary or basis are explored to reconstruct the light field. In contrast, we formulated this problem from the perspective of principal component analysis (PCA) and non-negative matrix factorization (NMF), where only a small number of basis vectors are selected in advance based on an analysis of the training dataset. From this formulation, we derived optimal non-negative aperture patterns and a straightforward reconstruction algorithm. Even though our method is based on conventional techniques, it has proven to be more accurate and much faster than a state-of-the-art CS-based method.
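The PCA side of this formulation can be illustrated with a toy per-pixel sketch (synthetic data; the shift used to make the patterns non-negative and all sizes are illustrative assumptions, not the paper's optimal patterns):

```python
import numpy as np

rng = np.random.default_rng(1)
v, k, n_train = 8, 3, 500          # views, basis size, training samples

# Synthetic training set: per-pixel view vectors lying in a k-dim subspace.
U = np.linalg.qr(rng.standard_normal((v, k)))[0]
train = U @ rng.standard_normal((k, n_train))

# PCA: take the top-k eigenvectors of the covariance as the fixed basis B.
cov = train @ train.T / n_train
_, V = np.linalg.eigh(cov)          # eigenvalues ascending
B = V[:, -k:]                       # v x k basis

# Non-negative aperture patterns, one per shot: shifted basis vectors here.
P = B.T - B.T.min()                 # k x v, entries >= 0 as a physical aperture requires

# Acquisition and reconstruction for one test pixel.
L = U @ rng.standard_normal(k)      # true v-view light field at a pixel
y = P @ L                           # k captured coded-aperture measurements
c = np.linalg.solve(P @ B, y)       # invert the small k x k system
L_hat = B @ c
print(np.linalg.norm(L_hat - L))
```

When the data truly lies in the k-dimensional PCA subspace, the k coded shots suffice for exact recovery, which is the intuition behind reconstructing many views from a few captures.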
Phuc V. TRINH Thanh V. PHAM Anh T. PHAM
Both spatial diversity and multihop relaying are considered to be effective methods for mitigating the impact of atmospheric turbulence-induced fading on the performance of free-space optical (FSO) systems. Multihop relaying can significantly reduce the impact of fading by relaying the information over a number of shorter hops. However, it is not feasible or economical to deploy relays in many practical scenarios. Spatial diversity can substantially reduce the fading variance by introducing additional degrees of freedom in the spatial domain. Nevertheless, its superiority is diminished when the fading sub-channels are correlated. In this paper, our aim is to study the fundamental performance limits of spatial diversity over correlated Gamma-Gamma (G-G) fading channels in multihop coherent FSO systems. For the performance analysis, we propose to approximate the sum of correlated G-G random variables (RVs) as a G-G RV, which is then verified by the Kolmogorov-Smirnov (KS) goodness-of-fit statistical test. Performance metrics, including the outage probability and the ergodic capacity, are newly derived in closed-form expressions and thoroughly investigated. Monte-Carlo (M-C) simulations are also performed to validate the analytical results.
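The approximation idea can be sketched numerically. Independent summands and an equal-parameter moment fit are simplifying assumptions for illustration only (the paper treats the correlated case and its own fitting procedure); the KS comparison mirrors the verification step:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(2)

def gg_samples(alpha, beta, size):
    """Unit-mean Gamma-Gamma samples: product of two independent Gamma RVs."""
    return (rng.gamma(alpha, 1.0 / alpha, size) *
            rng.gamma(beta, 1.0 / beta, size))

# Sum of N independent G-G fading RVs (independence is a simplification here).
N, n_mc = 4, 20000
S = sum(gg_samples(4.0, 2.0, n_mc) for _ in range(N))

# Moment-matched single G-G approximation; assume alpha = beta for a one-parameter fit.
c = S.mean()
v = S.var() / c**2                  # squared coefficient of variation
a = (1.0 + np.sqrt(1.0 + v)) / v    # solves 2/a + 1/a**2 = v
approx = c * gg_samples(a, a, n_mc)

res = ks_2samp(S, approx)           # two-sample KS goodness-of-fit check
print(f"KS statistic: {res.statistic:.3f}")
```

A small KS statistic indicates the fitted G-G distribution tracks the empirical distribution of the sum closely.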
Hidefumi HIRAISHI Hiroshi IMAI Yoichi IWATA Bingkai LIN
Computing the partition function of the Ising model on a graph has been investigated from both computer science and statistical physics, producing fertile results: P cases, FPTAS/FPRAS cases, inapproximability, and intractability. Recently, measurement-based quantum computing as well as quantum annealing has opened up another bridge between the two fields by relating a tree tensor network representing a quantum graph state to a rank decomposition of the graph. This paper makes this bridge wider in both directions. An $O^*(2^{\frac{\omega}{2}\mathrm{bw}(G)})$-time algorithm is developed for the partition function on an n-vertex graph G with a branch decomposition of width $\mathrm{bw}(G)$, where $O^*$ ignores a polynomial factor in n and ω is the matrix multiplication parameter, less than 2.37287. Related algorithms of $O^*(4^{\mathrm{rw}(\tilde{G})})$ time for the tree tensor network are given, which are of interest in quantum computation, given a rank decomposition of width $\mathrm{rw}(\tilde{G})$ of a subdivided graph $\tilde{G}$. These algorithms are parameter-exponential, i.e., $O^*(c^p)$ for a constant c and parameter p; no such algorithm is known for the more general case of computing the Tutte polynomial in terms of $\mathrm{bw}(G)$ (the current best time is $O^*(\min\{2^n, \mathrm{bw}(G)^{O(\mathrm{bw}(G))}\})$), with a negative result in terms of the clique-width, related to the rank-width, under ETH.
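For reference, the quantity in question is $Z(G)=\sum_{\sigma\in\{\pm1\}^n}\exp(\beta J\sum_{(u,v)\in E}\sigma_u\sigma_v)$. A brute-force $O^*(2^n)$ evaluation, the baseline that width-parameterized algorithms improve on, fits in a few lines (uniform coupling J is an illustrative simplification):

```python
import itertools
import math

def ising_partition(n, edges, beta=1.0, J=1.0):
    """Brute-force Ising partition function: sum over all 2^n spin assignments
    of exp(beta * J * sum of s_u * s_v over edges).  Width-parameterized
    algorithms beat this O(2^n) cost on graphs of small branch-width."""
    Z = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        interaction = sum(spins[u] * spins[v] for u, v in edges)
        Z += math.exp(beta * J * interaction)
    return Z

# Triangle graph: 2 aligned configurations give interaction sum +3,
# the other 6 give -1, so Z = 2*e^3 + 6*e^(-1).
Z = ising_partition(3, [(0, 1), (1, 2), (0, 2)], beta=1.0)
print(round(Z, 4))
```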
An important problem in mathematics and data science is, given two or more metric spaces, to obtain a metric on the product space by aggregating the source metrics with a multivariate function. In 1981, Borsík and Doboš solved this problem, and much progress has subsequently been made on its generalizations. The triangle inequality is a key property for a bivariate function to be a metric. In metric aggregation, requesting the triangle inequality of the resulting metric imposes subadditivity on the aggregating function. However, in some applications, such as image matching, a relaxed notion of the triangle inequality is useful, and this relaxation may enlarge the scope of the aggregators to include some natural superadditive functions such as the harmonic mean. This paper examines the aggregation of two semimetrics (i.e., metrics with a relaxed triangle inequality) by the harmonic mean and shows that such aggregation weakly preserves the relaxed triangle inequalities. As an application, the paper presents an alternative simple proof of the relaxed triangle inequality satisfied by the robust Jaccard-Tanimoto set dissimilarity, which was originally shown by Gragera and Suppakitpaisarn in 2016.
V2V broadcast communication is promising not only for safety driving assistance but also for enhancing automated driving by sharing information on vehicle moving behavior with other vehicles. However, an important issue is how to reduce the information delivery delay and achieve the dependable communication that is essential for automated vehicle control by machine. Since radio propagation often exhibits fading and shadowing on the road, V2V packet errors happen probabilistically. Although the repeated-transmission method can enhance the reliability of broadcast transmission, the information delivery delay increases significantly as the packet reception rate decreases. In order to reduce the delay, a relay-assisted broadcast transmission scheme is employed in this paper. The scheme can improve the packet reception rate by path diversity and remarkably reduce the average delivery delay caused by repeated transmission. Performance with roadside relay stations in an urban environment with multiple intersections is evaluated through large-scale network simulation. The obtained results show that the relay-assist scheme remarkably reduces the average delivery delay to less than 20 ms, which is less than a quarter of that of direct V2V communication.
Yu NAKAHATA Jun KAWAHARA Takashi HORIYAMA Shoji KASAHARA
This paper studies a variant of the graph partitioning problem, called the evacuation planning problem, which asks us to partition a target area, represented by a graph, into several regions so that each region contains exactly one shelter. Each region must be convex to reduce intersections of evacuation routes, the distance from each point to a shelter must be bounded so that inhabitants can quickly evacuate from a disaster, and the number of inhabitants assigned to each shelter must not exceed the capacity of the shelter. This paper formulates the convexity of connected components as a spanning shortest-path forest for general graphs and proposes a novel algorithm to tackle this multi-objective optimization problem. The algorithm not only obtains a single partition but also enumerates all partitions simultaneously satisfying the above complex constraints, which is difficult for existing algorithms, using zero-suppressed binary decision diagrams (ZDDs) as a compressed representation. The efficiency of the proposed algorithm is confirmed by experiments using real-world map data. The results show that the proposed algorithm can obtain hundreds of millions of partitions satisfying all the constraints for input graphs with about a hundred edges in a few minutes.
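A spanning shortest-path forest of the kind underlying the convexity formulation can be sketched with a multi-source BFS, which produces one such forest by assigning every vertex to its nearest shelter (a toy illustration only; the paper's contribution is enumerating all valid forests with ZDDs):

```python
from collections import deque

def shortest_path_forest(adj, shelters):
    """Multi-source BFS over an adjacency dict: each vertex is claimed by the
    shelter that reaches it first, so the BFS tree edges form a spanning
    shortest-path forest and every region is connected with one shelter."""
    owner = {s: s for s in shelters}
    queue = deque(shelters)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in owner:
                owner[w] = owner[u]     # inherit the claiming shelter
                queue.append(w)
    return owner

# Toy graph with two shelters, vertices 0 and 5.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
regions = shortest_path_forest(adj, [0, 5])
print(regions)
```

Every vertex ends up in the region of its closest shelter, and each region is connected by construction.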
Su LIU Xingguang GENG Yitao ZHANG Shaolong ZHANG Jun ZHANG Yanbin XIAO Chengjun HUANG Haiying ZHANG
The quality of edge detection is related to the detection angle, scale, and threshold. Many algorithms improve edge detection quality by rules about detection angles. However, these algorithms have no rules for detecting edges at an arbitrary angle; they simply use different numbers of angles and do not indicate an optimal number of angles. In this paper, a novel edge detection algorithm is proposed that detects edges at arbitrary angles, and the optimal number of angles for the algorithm is introduced. The algorithm combines singularity detection using the Gaussian wavelet transform with edge detection in arbitrary directions and contains five steps: 1) An image is divided into pixel lines at a certain angle in the range from 45° to 90° according to the decomposition rules of this paper. 2) Singularities of the pixel lines are detected and form an edge image at that angle. 3) Edge images at different angles form a final edge image. 4) Detection angles in the range from 45° to 90° are extended to the range from 0° to 360°. 5) An optimal number of angles for the algorithm is proposed. The algorithm with the optimal number of angles shows better performance.
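Step 2, singularity detection on a single pixel line, can be sketched with a derivative-of-Gaussian (Gaussian wavelet) response (the signal, scale, and threshold are illustrative assumptions, not the paper's decomposition rules):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# A 1-D pixel line with a strong step edge at index 50 and a weaker one at 120.
line = np.zeros(200)
line[50:] += 1.0
line[120:] += 0.4
line += 0.02 * np.random.default_rng(4).standard_normal(200)

# Derivative-of-Gaussian response: singularities appear as local maxima
# of the absolute response, which mark edge positions on this pixel line.
resp = np.abs(gaussian_filter1d(line, sigma=2, order=1))
thresh = 0.3 * resp.max()
peaks = [i for i in range(1, 199)
         if resp[i] >= thresh and resp[i] >= resp[i - 1] and resp[i] > resp[i + 1]]
print(peaks)
```

Repeating this along pixel lines at many angles and merging the per-angle edge images gives the multi-directional detection the abstract describes.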
Koichi MITSUNARI Jaehoon YU Takao ONOYE Masanori HASHIMOTO
Visual object detection on embedded systems involves a multi-objective optimization problem in the presence of trade-offs between power consumption, processing performance, and detection accuracy. For a new Pareto solution with high processing performance and low power consumption, this paper proposes a hardware architecture for decision tree ensembles using multiple channels of features. For efficient detection, the proposed architecture utilizes the dimensionality of feature channels in addition to parallelism in image space and adopts task scheduling to attain conflict-free random memory access. Evaluation results show that an FPGA implementation of the proposed architecture with an aggregated-channel-features pedestrian detector can process 229 million samples per second at a 100 MHz operating frequency while requiring a relatively small amount of resources. Consequently, the proposed architecture achieves 350 fps processing performance for 1080p Full HD images and outperforms conventional object detection hardware architectures developed for embedded systems.
Yizhe WANG Yongshun ZHANG Sisan HE Yi RAO
The precession angle and precession period are significant parameters for identifying space micro-motion targets. To achieve high-accuracy estimation of precession parameters without any prior knowledge of the structural parameters of the target, a parameter-extraction method based on HRRP sequences is proposed. The precession model of cone-shaped targets is first established and analyzed. The projection positions of the scattering centers on the HRRP induced by precession are then shown to migrate approximately sinusoidally. Sequences of scattering centers are associated by a sinusoid extraction algorithm. Finally, the precession angle and precession period are estimated using error function optimization. Simulation results at various SNR levels based on electromagnetic calculation data demonstrate the validity of the proposed method.
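The period-estimation step can be sketched as a least-squares sinusoid fit to one scattering center's range history (synthetic data with illustrative values; the paper's association and error-function details differ):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic range history of one scattering centre: precession makes its
# HRRP projection migrate approximately sinusoidally.
t = np.linspace(0.0, 4.0, 200)                      # slow time [s]
true_T = 1.3                                        # precession period [s]
r = 5.0 + 0.35 * np.sin(2 * np.pi * t / true_T + 0.7)
r += 0.02 * np.random.default_rng(5).standard_normal(t.size)

def model(t, a, T, ph, c):
    return c + a * np.sin(2 * np.pi * t / T + ph)

# FFT-based initial frequency guess keeps the nonlinear fit out of local minima.
spec = np.abs(np.fft.rfft(r - r.mean()))
f0 = np.fft.rfftfreq(t.size, d=t[1] - t[0])[spec.argmax()]

popt, _ = curve_fit(model, t, r, p0=[0.3, 1.0 / f0, 0.0, r.mean()])
print(f"estimated precession period: {popt[1]:.3f} s")
```

The fitted period T converges to the true value even with the coarse FFT initialization, which is why sinusoid extraction is a natural tool for associating precession-induced migration curves.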
Xiang ZHAO Zishu HE Yikai WANG Yuan JIANG
This letter addresses the problem of space-time adaptive processing (STAP) for airborne nonuniform linear array (NLA) radar using a generalized sidelobe canceller (GSC). Because of the difficulty of determining the spatial nulls for NLAs, it is hard to obtain a valid blocking matrix (BM) for the GSC directly. In order to solve this problem and improve STAP performance, a BM modification method based on the modified Gram-Schmidt orthogonalization algorithm is proposed. The modified GSC processor can achieve optimal STAP performance as well as a faster convergence rate than the orthogonal subspace projection method. Numerical simulations validate the effectiveness of the proposed method.
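The basic role of a blocking matrix, and how modified Gram-Schmidt can build one, can be sketched as follows (the element positions and look angle are illustrative; this is the generic construction, not the letter's modification method):

```python
import numpy as np

def blocking_matrix(s):
    """Blocking matrix for a GSC: (N-1) x N matrix whose rows span the
    orthogonal complement of steering vector s, via modified Gram-Schmidt."""
    n = s.size
    basis = [s / np.linalg.norm(s)]
    rows = []
    for e in np.eye(n, dtype=complex):
        v = e.copy()
        for b in basis:                      # MGS: remove projections one at a time
            v = v - (b.conj() @ v) * b
        nv = np.linalg.norm(v)
        if nv > 1e-10:                       # drop vectors already in the span
            v /= nv
            basis.append(v)
            rows.append(v)
    return np.conj(np.array(rows))           # rows satisfy B @ s = 0

# Illustrative NLA steering vector (element positions in half-wavelength units).
pos = np.array([0.0, 1.0, 2.7, 4.1, 6.0])
s = np.exp(1j * np.pi * pos * np.sin(np.deg2rad(10.0)))
B = blocking_matrix(s)
print(np.abs(B @ s).max())                   # ~0: the target direction is blocked
```

Because B annihilates the steering vector, the lower GSC branch contains interference only, which the adaptive weights can then cancel.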
Kazuyuki MORIOKA Satoshi YAMAZAKI David ASANO
We consider space-time block coded continuous phase modulation (STBC-CPM), which has the advantages of both STBC and CPM at the same time. A weak point of STBC-CPM is that the normalized spectral efficiency (NSE) is limited by the orthogonality of the STBC and the CPM parameters. The purpose of this study is to improve the NSE of STBC-CPM. The NSE depends on the transmission rate (TR), the bit error rate (BER), and the occupied bandwidth (OBW). First, to improve the TR, we adapt quasi-orthogonal STBC (QO-STBC) for four transmit antennas and the quasi-group orthogonal Toeplitz code (Q-GOTC) for eight transmit antennas, at the expense of orthogonality. Second, to evaluate the BER, we derive a BER approximation of STBC-CPM with non-orthogonal STBC (NO-STBC). The theoretical analysis and simulation results show that the NSE can be improved by using QO-STBC and Q-GOTC. Third, since the OBW depends on the CPM parameters, the tradeoff between the NSE and the CPM parameters is considered. A computer simulation provides a candidate set of CPM parameters with better NSE. Finally, the adaptation of non-orthogonal STBC to STBC-CPM can be viewed as a generalization of the study by Silvester et al., because orthogonal STBC can be thought of as a special case of non-orthogonal STBC. Also, the adaptation of Q-GOTC to CPM can be viewed as a generalization of our previous letter, because a linear modulation scheme can be thought of as a special case of a non-linear one.
Minsu KIM Kunwoo LEE Katsuhiko GONDOW Jun-ichi IMURA
The main purpose of Codemark is to distribute digital contents using offline media. Codemark therefore cannot be used on digital images; it has high robustness only on printed images. This paper presents a new color code called Robust Index Code (RIC for short), which has high robustness against JPEG compression and resizing of digital images. RIC embeds a remote database index into digital images so that users can reach any digital content. Experimental results, using our implemented RIC encoder and decoder, show the high robustness of the proposed codemark against JPEG compression and resizing: the embedded database indexes can be extracted 100% correctly from images compressed to 30% quality. In conclusion, all types of digital products can be stored by embedding database-access indexes into digital images, which realizes a superdistribution system with digital images. RIC therefore has potential for new Internet image services, since all images encoded by RIC provide access to the original products anywhere.
Yu CHEN Jing XIAO Liuyi HU Dan CHEN Zhongyuan WANG Dengshi LI
Saliency detection for videos has received great attention and been extensively studied in recent years. However, various visual scenes with complicated motion lead to noticeable background noise and non-uniform highlighting of the foreground objects. In this paper, we propose a video saliency detection model using spatio-temporal cues. In the spatial domain, the location of the foreground region is utilized as a spatial cue to constrain the accumulation of contrast for background regions. In the temporal domain, the spatial distribution of motion-similar regions is adopted as a temporal cue to further suppress the background noise. Moreover, a backward-matching-based temporal prediction method is developed to adjust the temporal saliency according to its corresponding prediction from the previous frame, thus enforcing consistency along the time axis. Performance evaluation on several popular benchmark datasets validates that our approach outperforms existing state-of-the-art methods.
Wireless power transfer (WPT) via coupled magnetic resonances has more than ten years of development history. However, the frequency-splitting phenomenon appears in the over-coupled region; thus, the output power of a two-coil WPT system reaches its maximum at the two split angular frequencies and not at the natural resonant angular frequency. By investigating the relationship between the impedances of the transmitter side and the receiver side, we found that the WPT system is a power superposition system, and we explain in detail why frequency splitting appears and how it affects the maximum output power of the system. First, the circuit model was established and the transfer characteristics of the two-coil WPT system were studied using circuit theory. Second, the mechanism of the power superposition of the WPT system was carefully investigated. Third, the relationship between the impedances of the transmitter side and the receiver side was obtained by investigating the impedance characteristics of a two-coil WPT system, and the factors affecting the maximum output power of the system were obtained by using the power superposition mechanism. Finally, an experimental circuit was designed, and the experimental results are in good agreement with the theoretical analysis.
In recent years, deep learning based approaches have substantially improved the performance of face recognition. Most existing deep learning techniques work well but neglect effective utilization of face correlation information. The resulting performance loss is noteworthy for personal appearance variations caused by factors such as illumination, pose, occlusion, and misalignment. We believe that face correlation information should be introduced to solve this network performance problem originating from intra-personal variations. Recently, graph deep learning approaches have emerged for representing structured graph data. A graph is a powerful tool for representing the complex information of a face image. In this paper, we survey recent research related to the graph structure of convolutional neural networks and try to devise a definition of the graph structure involved in compressed sensing and deep learning. This paper explains two properties of our graph: sparsity and depth. Sparsity can be advantageous since sparse features are more likely to be linearly separable and more robust. Depth means that this is a multi-resolution, multi-channel learning process. We think that a sparse-graph-based deep neural network can more effectively make similar objects attract each other and different objects repel each other, akin to better sparse multi-resolution clustering. Based on this concept, we propose a sparse graph representation based on the face correlation information that is embedded via sparse reconstruction and deep learning within an irregular domain. The resulting classification is remarkably robust. The proposed method achieves high recognition rates of 99.61% (94.67%) on the benchmark LFW (YTF) facial evaluation database.