Hitoshi SUDA Gaku KOTANI Daisuke SAITO
In this paper, we propose a new training framework named the INmfCA algorithm for nonparallel voice conversion (VC) systems. To train conversion models, traditional VC frameworks require parallel corpora, in which source and target speakers utter the same linguistic contents. Although the frameworks have achieved high-quality VC, they are not applicable in situations where parallel corpora are unavailable. To acquire conversion models without parallel corpora, nonparallel methods are widely studied. Although the frameworks achieve VC under nonparallel conditions, they tend to require huge background knowledge or many training utterances. This is because of difficulty in disentangling linguistic and speaker information without a large amount of data. In this work, we tackle this problem by exploiting NMF, which can factorize acoustic features into time-variant and time-invariant components in an unsupervised manner. The method acquires alignment between the acoustic features of a source speaker's utterances and a target dictionary and uses the obtained alignment as activation of NMF to train the source speaker's dictionary without parallel corpora. The acquisition method is based on the INCA algorithm, which obtains the alignment of nonparallel corpora. In contrast to the INCA algorithm, the alignment is not restricted to observed samples, and thus the proposed method can efficiently utilize small nonparallel corpora. The results of subjective experiments show that the combination of the proposed algorithm and the INCA algorithm outperformed not only an INCA-based nonparallel framework but also CycleGAN-VC, which performs nonparallel VC without any additional training data. The results also indicate that a one-shot VC framework, which does not need to train source speakers, can be constructed on the basis of the proposed method.
Masahiro MURAYAMA Toyohiro HIGASHIYAMA Yuki HARAZONO Hirotake ISHII Hiroshi SHIMODA Shinobu OKIDO Yasuyoshi TARUTA
High-quality depth images are required for stable and accurate computer vision. Depth images captured by depth cameras tend to be noisy, incomplete, and of low resolution. Therefore, increasing the accuracy and resolution of depth images is desirable. We propose a method that reduces noise and fills holes in depth images pixel by pixel, and increases their resolution. For each pixel in the target image, the linear space from the focal point of the camera through the pixel to the existing object is divided into equally spaced grids. In each grid, the difference from the grid to the object surface is obtained from multiple tracked depth images, which have noisy depth values of the respective image pixels. Then, the coordinates of the correct object surface are obtained by reducing the random depth noise, and the missing values are completed. The resolution can also be increased by creating new pixels between existing pixels and then applying the same process as that used for noise reduction. Evaluation results demonstrate that the proposed method can perform the processing with less GPU memory. Furthermore, the proposed method reduced noise more accurately, especially around edges, and processed more details of objects than the conventional method. The super-resolution of the proposed method also produced a high-resolution depth image with smoother and more accurate edges than the conventional methods.
Xin ZENG Lin ZHANG Zhongqiang LUO Xingzhong XIONG Chengjie LI
In recent years, visual tracking has advanced considerably, but some methods still suffer from low tracking accuracy and success rate. Although some trackers are more accurate, they take more time. To solve this problem, we propose a reinforced tracker based on Hierarchical Convolutional Features (HCF for short). HOG, color-naming and grayscale features are used with different weights to supplement the convolution features, which enhances the tracking robustness. At the same time, we improve the model update strategy to reduce the time cost. This tracker is called RHCF and the code is published on https://github.com/z15846/RHCF. Experiments on the OTB2013 dataset show that our tracker improves both the accuracy and the success rate.
Zhi WENG Longzhen FAN Yong ZHANG Zhiqiang ZHENG Caili GONG Zhongyue WEI
As the basis of fine breeding management and animal husbandry insurance, individual recognition of dairy cattle is an important issue in the animal husbandry management field. Due to the limitations of the traditional method of cow identification, such as being easy to drop and falsify, it can no longer meet the needs of modern intelligent pasture management. In recent years, with the rise of computer vision technology, deep learning has developed rapidly in the field of face recognition. The recognition accuracy has surpassed the level of human face recognition and has been widely used in the production environment. However, research on the facial recognition of large livestock, such as dairy cattle, needs to be developed and improved. According to the idea of a residual network, an improved convolutional neural network (Res_5_2Net) method for individual dairy cow recognition is proposed based on dairy cow facial images in this letter. The recognition accuracy on our self-built cow face database (3012 training sets, 1536 test sets) can reach 94.53%. The experimental results show that the efficiency of identification of dairy cows is effectively improved.
Kazumoto TANAKA Yunchuan ZHANG
We propose an augmented-reality-based method for arranging furniture using natural markers extracted from the edges of the walls of rooms. The proposed method extracts natural markers and estimates the camera parameters from single images of rooms using deep neural networks. Experimental results show that in all the measurements, the superimposition error of the proposed method was lower than that of general marker-based methods that use practical-sized markers.
Tomoyuki TANAKA Christopher L. AYALA Nobuyuki YOSHIKAWA
Extremely energy-efficient logic devices are required for future low-power high-performance computing systems. Superconductor electronic technology has a number of energy-efficient logic families. Among them is the adiabatic quantum-flux-parametron (AQFP) logic family, which adiabatically switches the quantum-flux-parametron (QFP) circuit when it is excited by an AC power-clock. When compared to state-of-the-art CMOS technology, AQFP logic circuits have the advantage of relatively fast clock rates (5 GHz to 10 GHz) and a reduction in energy of 5 to 6 orders of magnitude before cooling overhead. We have been developing extremely energy-efficient computing processor components using the AQFP. The adder is the most basic computational unit and is important in the development of a processor. In this work, we designed and measured a 16-bit parallel prefix carry look-ahead Kogge-Stone adder (KSA). We fabricated the circuit using the AIST 10 kA/cm² High-speed STandard Process (HSTP). Due to a malfunction in the measurement system, we were not able to confirm the complete operation of the circuit at the low frequency of 100 kHz in liquid He, but we confirmed that the outputs that we did observe are correct for two types of tests: (1) critical tests and (2) 110 random input tests in total. The operation margin of the circuit is wide, and we did not observe any calculation errors during measurement.
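The parallel-prefix carry computation of a Kogge-Stone adder can be illustrated with a small bit-level software model. The following is only a sketch of the textbook Kogge-Stone recurrence, not the AQFP circuit described above; the function name `kogge_stone_add` and its parameters are hypothetical and chosen for illustration.

```python
def kogge_stone_add(a, b, width=16):
    """Software model of a Kogge-Stone parallel-prefix adder (illustrative only).

    Returns (sum, carry_out) for two unsigned integers of `width` bits.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b   # per-bit generate: both inputs are 1
    p = a ^ b   # per-bit propagate: exactly one input is 1
    p0 = p      # keep the original propagate bits for the final sum

    # log2(width) prefix stages; stage k combines groups 2**k bits apart.
    d = 1
    while d < width:
        # New group generate uses the OLD propagate, so update g before p.
        g = (g | (p & (g << d))) & mask
        p = (p & (p << d)) & mask
        d <<= 1

    # After the last stage, bit i of g is the carry out of bits 0..i,
    # so the carry INTO bit i is bit i-1 of g.
    carries = (g << 1) & mask
    total = (p0 ^ carries) & mask
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out


if __name__ == "__main__":
    print(kogge_stone_add(0xFFFF, 0x0001))  # all carries ripple through every stage
```

For a 16-bit operand the loop runs four stages (distances 1, 2, 4, 8), which is the logarithmic depth that makes this adder structure attractive.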
Shanqi PANG Xiankui PENG Xiao ZHANG Ruining ZHANG Cuijiao YIN
Quantum combinatorial designs are gaining popularity in quantum information theory. Quantum Latin squares can be used to construct mutually unbiased maximally entangled bases and unitary error bases. Here we present a general method for constructing quantum Latin arrangements from irredundant orthogonal arrays. As an application of the method, many new quantum Latin arrangements are obtained. We also find a sufficient condition such that the improved quantum orthogonal arrays [10] are equivalent to quantum Latin arrangements. We further prove that an improved quantum orthogonal array can produce a quantum uniform state.
Takumi NISHIME Hiroshi HASHIGUCHI Naobumi MICHISHITA Hisashi MORISHITA
Platform-mounted small antennas increase dielectric loss and conductive loss and decrease the radiation efficiency. This paper proposes a novel antenna design method to improve radiation efficiency for platform-mounted small antennas by characteristic mode analysis. The proposed method uses a mapping of the modal weighting coefficient (MWC) and an infinitesimal dipole, and evaluates a 100 mm × 55 mm × 23 mm metal casing as a platform excited by an inverted-F antenna. The simulation and measurement results show that the radiation efficiency improves from 2.5% for the single antenna to 5% for the whole system.
Chi-Min LI Dong-Lin LU Pao-Jen WANG
With the widespread use of smart devices in daily life, the demand for high-data-rate and low-latency services has become an important issue for various applications. However, high-data-rate service usually implies a large bandwidth requirement. To solve the problem of bandwidth shortage below 6GHz (sub-6G), future wireless communications can be up-converted to the millimeter-wave (mm-wave) bands. Nevertheless, mm-wave frequency bands suffer from high channel attenuation and serious penetration loss compared with sub-6G frequency bands, and signal transmission in indoor environments is further affected by various partition materials, such as concrete, wood, and glass. Therefore, the fifth-generation (5G) mobile communication system may use multiple small cells (SC) to overcome the signal attenuation caused by using mm-wave bands. This paper analyzes the attenuation characteristics of some common partition materials in indoor environments. In addition, performance metrics such as the received signal power, signal to interference plus noise ratio (SINR) and system capacity are simulated and analyzed for different SC deployments to provide a suitable guideline for each SC deployment.
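The relation between received power, SINR and capacity mentioned above can be made concrete with a short calculation. This is a minimal sketch that assumes the Shannon bound as the capacity measure, which the abstract does not specify; all function names and the power levels in the usage example are hypothetical.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def sinr_db(signal_dbm, interferers_dbm, noise_dbm):
    """SINR in dB: serving-cell power over (sum of interferers + noise)."""
    signal = dbm_to_mw(signal_dbm)
    interference = sum(dbm_to_mw(p) for p in interferers_dbm)
    noise = dbm_to_mw(noise_dbm)
    return 10 * math.log10(signal / (interference + noise))


def shannon_capacity_bps(bandwidth_hz, sinr_db_value):
    """Shannon capacity C = B * log2(1 + SINR), with SINR given in dB."""
    sinr_linear = 10 ** (sinr_db_value / 10)
    return bandwidth_hz * math.log2(1 + sinr_linear)


if __name__ == "__main__":
    # Hypothetical indoor link: one serving small cell, two interfering cells.
    sinr = sinr_db(signal_dbm=-60, interferers_dbm=[-75, -80], noise_dbm=-90)
    print(f"SINR = {sinr:.1f} dB")
    print(f"Capacity = {shannon_capacity_bps(400e6, sinr) / 1e9:.2f} Gbit/s")
```

Adding small cells raises the received signal power but also adds interferers to the denominator, which is why SINR, rather than received power alone, is the quantity to compare across deployments.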
Kyogo OTA Daisuke INOUE Mamoru SAWAHASHI Satoshi NAGATA
This paper proposes individual computation processes of the partial demodulation reference signal (DM-RS) sequence in a synchronization signal (SS)/physical broadcast channel (PBCH) block to be used to detect the radio frame timing based on SS/PBCH block index detection for New Radio (NR) initial access. We present the radio frame timing detection probability using the proposed partial DM-RS sequence detection method that is applied subsequent to the physical-layer cell identity (PCID) detection in five tapped delay line (TDL) models in both non-line-of-sight (NLOS) and line-of-sight (LOS) environments. Computer simulation results show that by using the proposed method, the radio frame timing detection probabilities of almost 100% and higher than 90% are achieved for the LOS and NLOS channel models, respectively, at the average received signal-to-noise power ratio (SNR) of 0dB with the frequency stability of a local oscillator in a set of user equipment (UE) of 5ppm at the carrier frequency of 4GHz.
Stanislav SEDUKHIN Yoichi TOMIOKA Kohei YAMAMOTO
In this paper, starting from the algorithm, a performance- and energy-efficient 3D structure or shape of the Tensor Processing Engine (TPE) for CNN acceleration is systematically searched and evaluated. An optimal accelerator's shape maximizes the number of concurrent MAC operations per clock cycle while minimizing the number of redundant operations. The proposed 3D vector-parallel TPE architecture with an optimal shape can be used very efficiently for considerable CNN acceleration. Owing to the implemented support for inter-block image data independence, multiple such TPEs can be used for additional CNN acceleration. Moreover, it is shown that the proposed TPE can also be used uniformly to accelerate different CNN models such as VGG, ResNet, YOLO, and SSD. We also demonstrate that our theoretical efficiency analysis matches the result of a real implementation for an SSD model to which a state-of-the-art channel pruning technique is applied.
Yuyao LIU Shi BAO Go TANAKA Yujun LIU Dongsheng XU
When collecting images, low-illumination images with insufficient exposure are often obtained owing to the influence of shooting equipment, shooting environment, and other factors. For low-illumination images, it is necessary to improve the contrast. In this paper, a digital color image contrast enhancement method based on luminance weight adjustment is proposed. This method improves the contrast of the image and maintains the detail and naturalness of the image. In the proposed method, the illumination of the histogram equalization image and the adaptive gamma correction with weighted distribution image are adjusted by the luminance weight w1 to obtain a detailed image of the bright areas. Thereafter, the suppressed multi-scale retinex (MSR) is used to process the input image and obtain a detailed image of the dark areas. Finally, the luminance weight w2 is used to adjust the illumination component of the detailed images of the bright and dark areas, respectively, to obtain the output image. The experimental results show that the proposed method can enhance the details of the input image and avoid excessive enhancement of contrast, which maintains the naturalness of the input image well. Furthermore, we used the discrete entropy and lightness order error function to perform a numerical evaluation to verify the effectiveness of the proposed method.
Shuhei TAMATE Yutaka TABUCHI Yasunobu NAKAMURA
In this paper, we review the basic components of superconducting quantum computers. We mainly focus on the packaging and wiring technologies required to realize large-scale superconducting quantum computers.
The objective of the critical node problem is to minimize the pairwise connectivity of the residual graph obtained by removing a specific number of nodes. From a mathematical modeling perspective, the more fragmented components there are and the more evenly the disconnected sub-graphs are distributed, the better the quality of the solution. Based on this observation, we propose a new Cluster Expansion Method for Critical Node Problem (CEMCNP), which on the one hand exploits a contraction mechanism to greedily simplify the sparse graph model, and on the other hand adopts an incremental cluster expansion approach to keep the size of each formed component within a reasonable limit. The proposed algorithm also relies heavily on the idea of multi-start iterative local search, and introduces a diversified late acceptance local search strategy to balance diversification and intensification during the neighborhood search. Extensive evaluations show that CEMCNP outperforms KBV on 35 of the 42 benchmark instances, while attaining 3 of the previous best results on the challenging instances. In addition, CEMCNP achieves performance equivalent to the existing MANCNP and VPMS algorithms on 22 of the 42 graph models with fewer node exchange operations.
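The pairwise connectivity objective can be computed directly from the component sizes of the residual graph: a component with s nodes contributes s(s-1)/2 connected pairs. The sketch below shows only this objective, not the CEMCNP algorithm; the function name `pairwise_connectivity` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import deque


def pairwise_connectivity(adj, removed):
    """Number of node pairs still connected after deleting `removed`.

    `adj` maps each node to an iterable of its neighbours (undirected graph).
    """
    removed = set(removed)
    seen = set()
    total = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        # Breadth-first search to measure the size of this component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    queue.append(v)
        total += size * (size - 1) // 2
    return total


if __name__ == "__main__":
    # Path graph 0-1-2-3-4-5-6.
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
    print(pairwise_connectivity(path, []))   # whole path is one component
    print(pairwise_connectivity(path, [3]))  # two components of size 3
    print(pairwise_connectivity(path, [1]))  # components of size 1 and 5
```

On the path graph, removing the middle node leaves two balanced components (6 connected pairs), whereas removing node 1 leaves an unbalanced split (10 connected pairs), which illustrates why evenly sized fragments give better solutions.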
In this study, we aim to improve the performance of audio source separation for monaural mixture signals. For monaural audio source separation, semisupervised nonnegative matrix factorization (SNMF) can achieve higher separation performance by employing small supervised signals. In particular, penalized SNMF (PSNMF) with orthogonality penalty is an effective method. PSNMF forces two basis matrices for target and nontarget sources to be orthogonal to each other and improves the separation accuracy. However, the conventional orthogonality penalty is based on an inner product and does not affect the estimation of the basis matrix properly because of the scale indeterminacy between the basis and activation matrices in NMF. To cope with this problem, a new PSNMF with cosine similarity between the basis matrices is proposed. The experimental comparison shows the efficacy of the proposed cosine similarity penalty in supervised audio source separation.
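The scale indeterminacy argument can be checked numerically: rescaling the basis vectors (and inversely rescaling the activations) leaves the NMF product unchanged but shrinks an inner-product penalty, whereas a cosine-similarity penalty is unaffected. The sketch below uses one plausible form of each penalty (a sum over all basis pairs); the exact penalty terms of the paper are not given in the abstract, so these definitions and all names are assumptions.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def inner_product_penalty(target_bases, other_bases):
    """Sum of inner products between every target/nontarget basis pair."""
    return sum(dot(f, h) for f in target_bases for h in other_bases)


def cosine_penalty(target_bases, other_bases):
    """Sum of cosine similarities between every target/nontarget basis pair."""
    return sum(
        dot(f, h) / (math.sqrt(dot(f, f)) * math.sqrt(dot(h, h)))
        for f in target_bases
        for h in other_bases
    )


def scale(bases, c):
    """Multiply every basis vector by the positive constant c."""
    return [[c * x for x in vec] for vec in bases]


if __name__ == "__main__":
    # Hypothetical nonnegative basis vectors (each inner list is one basis).
    F = [[1.0, 2.0, 0.5], [0.2, 0.1, 3.0]]
    H = [[0.5, 1.0, 1.0], [2.0, 0.1, 0.3]]
    small_F = scale(F, 0.01)  # activations would be scaled by 100 to compensate
    print(inner_product_penalty(F, H), inner_product_penalty(small_F, H))
    print(cosine_penalty(F, H), cosine_penalty(small_F, H))
```

Because the inner-product penalty can be driven toward zero simply by shrinking the bases, it does not force the bases to become dissimilar; the cosine form removes that loophole.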
Takashi ISHIO Naoto MAEDA Kensuke SHIBUYA Kenho IWAMOTO Katsuro INOUE
Software developers may write a number of similar source code fragments that include the same mistake in software products. To remove such faulty code fragments, developers inspect code clones when they find a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than a code block, e.g. a single line of code. To enable developers to search for code clones of such a small faulty code fragment in a large-scale software product, we propose a method using Lempel-Ziv Jaccard Distance, which is an approximation of Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The results show that our method efficiently reports cloned faulty code fragments and that the performance is acceptable for software developers.
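The idea behind Lempel-Ziv Jaccard Distance can be sketched as follows: each input is parsed into a set of Lempel-Ziv phrases, and the distance is one minus the Jaccard similarity of the two sets. This is a simplified, exact-set version for illustration; it omits the hashing and sampling used in practical implementations, and it is not the search method of the paper. The names `lz_set` and `lzjd` are hypothetical.

```python
def lz_set(data):
    """Parse a bytes object into its set of Lempel-Ziv phrases.

    A phrase is extended one byte at a time until it has not been seen
    before; it is then recorded and a new phrase starts.
    """
    seen = set()
    start = 0
    for end in range(1, len(data) + 1):
        chunk = data[start:end]
        if chunk not in seen:
            seen.add(chunk)
            start = end
    return seen


def lzjd(a, b):
    """Jaccard distance between the Lempel-Ziv phrase sets of a and b."""
    set_a, set_b = lz_set(a), lz_set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty inputs are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


if __name__ == "__main__":
    faulty = b"if (ptr = NULL) return -1;"
    clone = b"if (buf = NULL) return -1;"
    unrelated = b"for (int i = 0; i < n; i++) sum += a[i];"
    print(lzjd(faulty, clone))
    print(lzjd(faulty, unrelated))
```

Identical inputs give a distance of 0 and inputs with no shared phrases give 1, so a small distance can be used to rank candidate clones of a single faulty line.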
Ryota YOSHIMURA Ichiro MARUTA Kenji FUJIMOTO Ken SATO Yusuke KOBAYASHI
Particle filters have been widely used for state estimation problems in nonlinear and non-Gaussian systems. Their performance depends on the given system and measurement models, which need to be designed by the user for each target system. This paper proposes a novel method to design these models for a particle filter. This is a numerical optimization method, where the particle filter design process is interpreted into the framework of reinforcement learning by assigning the randomnesses included in both models of the particle filter to the policy of reinforcement learning. In this method, estimation by the particle filter is repeatedly performed and the parameters that determine both models are gradually updated according to the estimation results. The advantage is that it can optimize various objective functions, such as the estimation accuracy of the particle filter, the variance of the particles, the likelihood of the parameters, and the regularization term of the parameters. We derive the conditions to guarantee that the optimization calculation converges with probability 1. Furthermore, in order to show that the proposed method can be applied to practical-scale problems, we design the particle filter for mobile robot localization, which is an essential technology for autonomous navigation. By numerical simulations, it is demonstrated that the proposed method further improves the localization accuracy compared to the conventional method.
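For readers unfamiliar with where the system and measurement models enter a particle filter, the following bootstrap filter for a one-dimensional random-walk state may help. It is only a generic textbook sketch, not the proposed design method: the noise parameters `q` and `r` stand in for the model parameters that such a method would tune, and every name and value here is hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles=500, q=0.5, r=0.5, seed=0):
    """Bootstrap particle filter for x_t = x_{t-1} + N(0, q^2), y_t = x_t + N(0, r^2).

    Returns the weighted-mean state estimate at every time step.
    """
    rng = random.Random(seed)
    # Broad initial belief about the state (assumed, for illustration).
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # System model: propagate each particle with process noise.
        particles = [x + rng.gauss(0.0, q) for x in particles]
        # Measurement model: Gaussian likelihood of y given each particle.
        weights = [math.exp(-0.5 * ((y - x) / r) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # Degenerate case: fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    est = particle_filter([5.0] * 20)
    print(est[-1])
```

The random draws in the prediction and likelihood steps are the "randomnesses included in both models" that the abstract maps onto a reinforcement learning policy.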
A variety of smart services are being provided on multiple virtual networks embedded into a common inter-cloud substrate network. The substrate network operator deploys critical substrate nodes so that multiple service providers can achieve enhanced services due to the secure sharing of their service data. Even if one of the critical substrate nodes incurs damage, resiliency of the enhanced services can be assured due to reallocation of the workload and periodic backup of the service data to the other normal critical substrate nodes. However, the connectivity of the embedded virtual networks must be maintained so that the enhanced services can be continuously provided to all clients on the virtual networks. This paper considers resilient virtual network embedding (VNE) that ensures the connectivity of the embedded virtual networks after critical substrate node failures have occurred. The resilient VNE problem is formulated using an integer linear programming model and a distance-based method is proposed to solve the large-scale resilient VNE problem efficiently. Simulation results demonstrate that the distance-based method can derive a sub-optimum VNE solution with a small computational effort. The method derived a VNE solution with an approximation ratio of less than 1.2 within ten seconds in all the simulation experiments.
Rubin ZHAO Xiaolong ZHENG Zhihua YING Lingyan FAN
Most existing object detection and text detection methods are designed to detect either text or objects. In scenarios where the task is to find the target word pointed at by an object, the results of existing methods are far from satisfactory. However, such scenarios often arise in human-computer interaction, when the computer needs to figure out which word the user is pointing at. Compared with object detection, pointed-at word localization (PAWL) requires higher accuracy, especially in dense text scenarios. Moreover, in printed documents, characters are much smaller than those in scene text detection datasets such as ICDAR-2013, ICDAR-2015 and ICPR-2018. To address these problems, the authors propose a novel target word localization network (TWLN) to detect the pointed-at word in printed documents. In this work, a single deep neural network is trained to extract the features of markers and text sequentially. For each image, the location of the marker is predicted first; according to the predicted location, a smaller image is cropped from the original image and fed into the same network, and then the location of the pointed-at word is predicted. To train and test the networks, an efficient approach is proposed to generate the dataset from PDF format documents by inserting markers pointing at the words in the documents, which avoids laborious labeling work. Experiments on the proposed dataset demonstrate that TWLN outperforms the compared object detection method and optical character recognition method on every category of targets, especially when the target is a single character that only occupies several pixels in the image. TWLN is also tested with real photographs, and the accuracy shows no significant difference, which supports the validity of the dataset generation method.
Kento SUGIURA Yoshiharu ISHIKAWA
With the rapid increase in the number of CPU cores, software that can utilize these many cores is required. A lock-free algorithm based on compare-and-swap (CAS) operations is one of the concurrency control methods used to implement such multi-threaded software. A multi-word CAS (MwCAS) operation is an extension of a CAS operation to swap multiple words atomically. However, we noticed that the performance of the existing MwCAS implementation is limited by garbage collection even in a low-contention environment. To achieve high performance in low-contention workloads, we propose a new MwCAS algorithm without garbage collection. Experimental results show that our approach is three to five times faster than an implementation with garbage collection in low-contention workloads. Moreover, the performance of the proposed method is also superior in a high-contention environment.