Yaying SHEN Qun LI Ding XU Ziyi ZHANG Rui YANG
A triple loss based framework for generalized zero-shot learning is presented in this letter. The approach learns a shared latent space for image features and attributes by using aligned variational autoencoders and variants of triplet loss. Then we train a classifier in the latent space. The experimental results demonstrate that the proposed framework achieves great improvement.
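As a rough illustration of the loss family the letter builds on, the following is a minimal sketch of the standard triplet loss. The letter uses unspecified variants of this loss, so the distance function, the margin value, and the toy latent vectors below are assumptions for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical latent codes: an image feature, the attribute vector of its own
# class, and the attribute vector of a different class.
image_latent = [0.0, 0.0]
same_class_attr = [0.0, 1.0]
other_class_attr = [3.0, 4.0]

# The negative is already farther away than the positive by more than the margin,
# so this triplet contributes no loss.
loss = triplet_loss(image_latent, same_class_attr, other_class_attr)
```

The loss is zero once the matching pair is closer than the non-matching pair by at least the margin, which is what pulls image and attribute embeddings of the same class together in a shared space.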
The spectral envelope parameter is a significant speech parameter for vocoder quality. The Vector Quantized Variational AutoEncoder (VQ-VAE) is a recent state-of-the-art end-to-end quantization method based on deep learning. This paper proposes a new technique, called VQ-VAE-EMGAN, that improves the embedding space learning of VQ-VAE with a Generative Adversarial Network for quantizing the spectral envelope parameter. In experiments, we designed the quantizer for the spectral envelope parameters of the WORLD vocoder extracted from 16 kHz speech waveforms. The results show that the proposed technique reduced the Log Spectral Distortion (LSD) by around 0.5 dB and increased the PESQ by around 0.17 on average for four target bit operations compared to the conventional VQ-VAE.
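The two basic ingredients mentioned above, codebook lookup and the LSD metric, can be sketched as follows. This is not the VQ-VAE-EMGAN model: the codebook is a hypothetical hand-written list rather than a learned embedding space, and the LSD shown is one common per-frame definition (RMS difference of power spectra in dB), which may differ from the exact definition used in the paper.

```python
import math


def nearest_codeword(vector, codebook):
    """Return (index, codeword) of the entry closest in squared Euclidean distance."""
    def squared_distance(codeword):
        return sum((a - b) ** 2 for a, b in zip(vector, codeword))

    index = min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))
    return index, codebook[index]


def log_spectral_distortion_db(ref_power, est_power):
    """RMS difference, in dB, between two power spectra of a single frame."""
    diffs = [10.0 * math.log10(r) - 10.0 * math.log10(e)
             for r, e in zip(ref_power, est_power)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical 2-dimensional codebook with three entries (about 1.6 bits per vector).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
index, quantized = nearest_codeword([0.9, 0.1], codebook)
```

In a VQ-VAE the lookup is applied to the encoder output rather than to the spectrum itself, and only the selected index needs to be transmitted.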
Chongchong GU Haoyang XU Nan YAO Shengming JIANG Zhichao ZHENG Ruoyu FENG Yanli XU
In a mobile ad hoc network (MANET), nodes can form a centerless, self-organizing, multi-hop dynamic network without any centralized control function, but hidden and exposed terminals seriously degrade network performance. Meanwhile, wireless network nodes are evolving from providing a single communication function to also providing precise self-positioning, especially in indoor environments where GPS does not work well. However, existing medium access control (MAC) protocols only deal with collision control for data transmission and provide no positioning function. In this paper, we propose a MAC protocol based on the 802.11 standard that gives a node a self-positioning function, which is further used to solve the hidden and exposed terminal problems. The location of a network node is obtained by exchanging MAC frames in combination with a time of arrival (TOA) algorithm. Nodes then use the location information to calculate the interference range and judge whether they can transmit concurrently. Simulations show that the positioning function of the proposed MAC protocol works well, and that the protocol achieves higher throughput, lower average end-to-end delay and lower packet loss rate than the same protocol without the self-localization function.
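The two steps described above, ranging from frame exchange timing and a location-based concurrency decision, might look roughly like the sketch below. The abstract does not give the protocol's actual frame timing or decision rule, so the two-way ranging formula, the function names, and the rule "each receiver must lie outside the other transmitter's interference range" are all assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def two_way_toa_distance(t_round, t_reply):
    """Distance from a request/response exchange.

    t_round: time from sending the request to receiving the response (seconds).
    t_reply: processing delay at the responder (seconds).
    """
    return SPEED_OF_LIGHT * (t_round - t_reply) / 2.0


def can_transmit_concurrently(tx_a, rx_a, tx_b, rx_b, interference_range):
    """Assumed rule: neither receiver is within interference range of the other sender."""
    return (math.dist(tx_a, rx_b) > interference_range
            and math.dist(tx_b, rx_a) > interference_range)


# Hypothetical 2-D node coordinates in metres.
far_apart = can_transmit_concurrently((0, 0), (10, 0), (100, 0), (110, 0), 50.0)
too_close = can_transmit_concurrently((0, 0), (10, 0), (100, 0), (110, 0), 95.0)
```

A full position fix would combine several such distances to known neighbours; that step is omitted here.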
In this paper, the scattered far field from a circular electrically conducting cylinder is analyzed by the physical optics (PO) approximation for both H and E polarizations. The radiation integrals due to the PO current are evaluated numerically and analytically. While non-uniform and uniform asymptotic solutions are derived by the saddle point method, a separate approximation is made for the forward scattering direction. Comparisons among our approximation, direct numerical integration and the exact solution show good agreement for electrically large cylinders.
Hiroya YAMAMOTO Daichi KITAHARA Hiroki KURODA Akira HIRABAYASHI
This paper addresses single image super-resolution (SR) based on convolutional neural networks (CNNs). It is known that recovery of high-frequency components in output SR images of CNNs learned by the least square errors or least absolute errors is insufficient. To generate realistic high-frequency components, SR methods using generative adversarial networks (GANs), composed of one generator and one discriminator, are developed. However, when the generator tries to induce the discriminator's misjudgment, not only realistic high-frequency components but also some artifacts are generated, and objective indices such as PSNR decrease. To reduce the artifacts in the GAN-based SR methods, we consider the set of all SR images whose square errors between downscaling results and the input image are within a certain range, and propose to apply the metric projection onto this consistent set in the output layers of the generators. The proposed technique guarantees the consistency between output SR images and input images, and the generators with the proposed projection can generate high-frequency components with few artifacts while keeping low-frequency ones as appropriate for the known noise level. Numerical experiments show that the proposed technique reduces artifacts included in the original SR images of a GAN-based SR method while generating realistic high-frequency components with better PSNR values in both noise-free and noisy situations. Since the proposed technique can be integrated into various generators if the downscaling process is known, we can give the consistency to existing methods with the input images without degrading other SR performance.
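To make the projection idea concrete, here is a minimal sketch for one special case: the downscaling operator is assumed to be averaging over non-overlapping s-by-s blocks, for which the metric projection onto the consistent set has a simple closed form. The paper only requires that the downscaling process be known, so this operator, the function names, and the toy images are illustrative assumptions rather than the authors' implementation.

```python
import math


def downscale(image, s):
    """Average non-overlapping s-by-s blocks (the assumed downscaling operator)."""
    h, w = len(image), len(image[0])
    return [[sum(image[i * s + di][j * s + dj]
                 for di in range(s) for dj in range(s)) / (s * s)
             for j in range(w // s)]
            for i in range(h // s)]


def project_onto_consistent_set(sr, lr, s, eps):
    """Project sr onto {x : ||downscale(x) - lr||_2 <= eps}.

    For block averaging, the closest consistent image is obtained by subtracting
    a scaled, block-replicated copy of the low-resolution residual.
    """
    low = downscale(sr, s)
    residual = [[low[i][j] - lr[i][j] for j in range(len(lr[0]))]
                for i in range(len(lr))]
    norm = math.sqrt(sum(r * r for row in residual for r in row))
    if norm <= eps:
        return [row[:] for row in sr]  # already consistent: leave unchanged
    shrink = 1.0 - eps / norm
    return [[sr[i][j] - shrink * residual[i // s][j // s]
             for j in range(len(sr[0]))]
            for i in range(len(sr))]


# Hypothetical 4x4 generator output and 2x2 observed input.
sr_image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
lr_image = [[3, 5], [11, 13]]
projected = project_onto_consistent_set(sr_image, lr_image, 2, 0.5)
```

Because the correction is constant within each block, differences between pixels inside a block are untouched. This matches the claim that high-frequency components are kept while the low-frequency content is pulled toward the input.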
A novel jig structure for S11 calibration with short/open conditions and one reference material (referred to here as SOM) in dielectric measurement of liquids using a coaxial feed type stepped cut-off circular waveguide is proposed, together with a formula for exact calculation of S11 for the analytical model of the structure using the method of moments (MoM). The accuracy and validity of the S11 values calculated using this formula were then verified for frequencies of 0.50, 1.5 and 3.0 GHz, and S11 measurement accuracy with each termination condition was verified after calibration with SOM by combining the jig of the proposed structure with the study's electromagnetic (EM) analysis method. The relative complex permittivity was then estimated from S11 values measured with various liquids in the jig after calibration, and differences in results obtained with the proposed method and the conventional jig, the analytical model and the EM analysis method were examined. The validity of the proposed dielectric measurement method based on a combination of the above jig structure, numerical S11 calculation and the calibration method was thus confirmed.
The energy efficiency of intelligent reflecting surface (IRS) enabled internet of things (IoT) networks is studied in this letter. The energy efficiency is expressed mathematically as a function of the number of reflecting elements and as a function of the spectral efficiency of the network, and is shown to scale logarithmically with the number of reflecting elements in the regime of high transmit power at the source node. Furthermore, it is revealed that the energy efficiency scales linearly with the spectral efficiency in the high transmit power regime, in contrast to conventional studies on energy and spectral efficiency trade-offs in non-IRS wireless IoT networks. Numerical simulations are carried out to verify the derived results for the IRS enabled IoT networks.
Kanghui ZHAO Tao LU Yanduo ZHANG Yu WANG Yuanzhi WANG
In recent years, face super-resolution (SR) based on deep neural networks has shown strong performance compared with traditional face SR algorithms. Among these methods, the attention mechanism has been widely used in face SR because of its strong feature expression ability. However, existing attention-based face SR methods cannot fully mine the missing pixel information of low-resolution (LR) face images (structural prior), and they consider only a single attention mechanism to take advantage of the structure of the face, although the use of multiple attention mechanisms could help to enhance feature representation. To solve this problem, we first propose a new pixel attention mechanism, which can recover the structural details of lost pixels. Then, we design an attention fusion module to better integrate the different characteristics of triple attention. Experimental results on the FFHQ dataset show that this method is superior to existing face SR methods based on deep neural networks.
Seiji KOZAKI Akiko NAGASAWA Takeshi SUEHIRO Kenichi NAKURA Hiroshi MINENO
In this paper, a novel method of resource abstraction and an abstracted-resource model for dynamic resource control in optical access networks are proposed. Based on this proposal, an implementation assuming application to 5G mobile fronthaul and backhaul is presented. Finally, an evaluation of the processing time for resource allocation using this method is performed using a software prototype of the control function. From the results of the evaluation, it is confirmed that the proposed method offers better characteristics than former approaches, and is suitable for dynamic resource control in 5G applications.
Ying ZHANG Fandong MENG Jinchao ZHANG Yufeng CHEN Jinan XU Jie ZHOU
Machine reading comprehension with multi-hop reasoning always suffers from broken reasoning paths caused by a lack of world knowledge, which leads to wrong answer detection. In this paper, we analyze what knowledge previous work lacks, e.g., dependency relations and commonsense. Based on our analysis, we propose a Multi-dimensional Knowledge enhanced Graph Network, named MKGN, which exploits specific knowledge to repair the knowledge gap in the reasoning process. Specifically, our approach incorporates not only entities and dependency relations through various graph neural networks, but also commonsense knowledge through a bidirectional attention mechanism, which aims to enhance the representations of both the question and the contexts. In addition, to make the most of multi-dimensional knowledge, we investigate two kinds of fusion architectures, i.e., sequential and parallel. Experimental results on the HotpotQA dataset demonstrate the effectiveness of our approach and verify that using multi-dimensional knowledge, especially dependency relations and commonsense, can indeed improve the reasoning process and contribute to correct answer detection.
Xiang SHEN Dezhi HAN Chin-Chen CHANG Liang ZONG
Visual Question Answering (VQA) is a multi-task research problem that requires simultaneous processing of vision and text. Recent VQA models employ a co-attention mechanism to model the relationship between the context and the image. However, the features of questions and the modeling of image regions force irrelevant information to be calculated in the model, thus degrading performance. This paper proposes a novel dual self-guided attention with sparse question networks (DSSQN) to address this issue. The aim is to avoid having irrelevant information calculated into the model when modeling the internal dependencies of both the question and the image. At the same time, it overcomes the coarse interaction between sparse question features and image features. First, the sparse question self-attention (SQSA) unit in the encoder calculates the feature with the highest weight. From the self-attention learning of question words, the question features with larger weights are retained. Second, sparse question features are used to guide the focus on image features to obtain fine-grained image features and to prevent irrelevant information from being calculated into the model. A dual self-guided attention (DSGA) unit is designed to improve modal interaction between questions and images. Third, the parameter δ of the sparse question self-attention is optimized to select the question-related object regions. Our experiments on the VQA 2.0 benchmark dataset demonstrate that DSSQN outperforms state-of-the-art methods. For example, the accuracy of our proposed model on test-dev and test-std is 71.03% and 71.37%, respectively. In addition, visualization results show that our model pays more attention to important features than other advanced models. We also hope that it can promote the development of VQA in the field of artificial intelligence (AI).
A rep-cube is a polyomino that is a net of a cube and that can be divided into several polyominoes, each of which can be folded into a cube. This notion was introduced in 2017, inspired by the notions of polyomino and rep-tile, both of which were introduced by Solomon W. Golomb. A rep-cube is called regular if it can be divided into nets of the same area. A regular rep-cube is of order k if it is divided into k nets. Moreover, it is called uniform if it can be divided into congruent nets. In this paper, we focus on these special rep-cubes and solve several open problems.
Nur Syafiera Azreen NORODIN Kousuke NAKAMURA Masashi HOTTA
To realize a stable and efficient wireless power transfer (WPT) system that can be used in any environment, it is necessary to examine the influence of environmental interference along the power transmission path of the WPT system. In this paper, we attempt to reduce the influence of a medium with dielectric and conductive loss on a resonator-coupled wireless power transfer (RC-WPT) system that uses spiral resonators. The resonators are an important element of the RC-WPT system because changing the shape or combination of the spiral resonators can improve the resonant characteristics by confining the electric field, which is the main cause of electrical loss in the system, inside the resonator as much as possible. We proposed a novel dual-spiral resonator as a candidate and compared the basic characteristics of the RC-WPT system with conventional single-spiral and dual-spiral resonators. The parametric values of the spiral resonators, such as the quality factors and the coupling coefficients between resonators with and without a lossy medium in the power transmission path, were examined. For the lossy media, pure water or tap water filled with acryl bases was used. The maximum transmission efficiency of the RC-WPT system was then observed by tuning the matching condition of the system. Following that, the transmission efficiency of the system with and without the lossy medium was investigated. These investigations revealed that the RC-WPT system with the lossy medium using the modified-shape spiral resonator, namely the dual-spiral resonator proposed in our laboratory, outperformed the system using the conventional single-spiral resonator.
Tatsuya INOHA Kunihiko SADAKANE Yushi UNO Yuma YONEBAYASHI
Betweenness centrality is one of the most significant and commonly used centralities, where a centrality is a measure of the importance of nodes in networks. In 2001, Brandes proposed an efficient algorithm that computes betweenness centrality for all nodes in O(nm) time for unweighted networks, where n and m denote the numbers of nodes and links in the network, respectively. However, even Brandes' algorithm is not fast enough for recent large-scale real-world networks, and therefore much faster algorithms are desired. The objective of this research is to theoretically improve the efficiency of Brandes' algorithm by introducing graph decompositions, and to verify the practical effectiveness of our approaches by implementing them as computer programs and by applying them to various kinds of real-world networks. A series of computational experiments shows that our proposed algorithms run several times faster than the original Brandes' algorithm, and the improvement is guaranteed by theoretical analyses.
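For reference, the baseline that the paper accelerates can be sketched as below. This is a plain implementation of Brandes' algorithm for unweighted graphs and does not include the graph decompositions proposed in the paper; the adjacency-dictionary input format is an assumption for illustration.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted graph given as {node: [neighbours]}.

    Each ordered source-target pair is counted, so for an undirected graph the
    returned values are twice the usual unordered-pair betweenness.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search: count shortest paths and record predecessors.
        order = []
        predecessors = {v: [] for v in adj}
        num_paths = {v: 0 for v in adj}
        num_paths[source] = 1
        distance = {v: -1 for v in adj}
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies in order of non-increasing distance.
        dependency = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    return centrality


# Path graph a - b - c: only b lies between another pair of nodes.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = betweenness_centrality(path_graph)
```

One breadth-first search per source costs O(m), which gives the O(nm) total stated above.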
Abdul Hayee SHAIKH Xiaoyu DANG Imran A. KHOSO Daqing HUANG
A three-stage padding configuration that provides a larger continuous virtual aperture and achieves more degrees-of-freedom (DOFs) for direction-of-arrival (DOA) estimation is presented. The improvement is realized by appropriately cascading three stages with an identical inter-element spacing. Each stage advantageously exhibits a continuous virtual array, which subsequently produces a hole-free resulting uniform linear array. The geometrical approach is applicable to any existing sparse array structure with a hole-free coarray, as well as to those designed in the future. In addition to enlarging the continuous virtual aperture and DOFs, the proposed design offers flexibility in that it can be realized for any given number of antennas. Moreover, a special padding configuration is demonstrated, which further increases the number of continuous virtual sensors. The precise antenna locations and the number of continuous virtual positions are given by closed-form expressions. Experiments are carried out to demonstrate the effectiveness of the proposed configuration.
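The notions of difference coarray, hole-free coarray, and continuous virtual aperture used above can be checked numerically with a short sketch. It does not reproduce the proposed three-stage padding configuration, whose closed-form sensor locations are not given in the abstract; the example uses a two-level nested array, a well-known sparse geometry with a hole-free coarray, and the helper names are hypothetical.

```python
def difference_coarray(positions):
    """Sorted set of all pairwise differences p - q of sensor positions."""
    return sorted({p - q for p in positions for q in positions})


def is_hole_free(positions):
    """True if the coarray contains every integer lag between its extremes."""
    lags = difference_coarray(positions)
    return len(lags) == lags[-1] - lags[0] + 1


def continuous_lags(positions):
    """Size of the central run of consecutive lags, i.e. 2 * L + 1."""
    lags = set(difference_coarray(positions))
    reach = 0
    while reach + 1 in lags:
        reach += 1
    return 2 * reach + 1


# Two-level nested array with 3 + 3 sensors, positions in units of the base spacing.
nested = [1, 2, 3, 4, 8, 12]
nested_lags = continuous_lags(nested)
```

Six physical sensors yield 23 consecutive virtual positions here, which is the kind of gain in DOFs that sparse array designs aim to enlarge.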
The challenges posed by big data in the 21st century are complex: conventional wisdom held that polynomial-time algorithms are practical; however, when we handle big data, even a linear-time algorithm may be too slow. Thus, sublinear- and constant-time algorithms are required. The academic research project “Foundations of Innovative Algorithms for Big Data,” which started in 2014 and will finish in September 2021, aims at developing various techniques and frameworks to design algorithms for big data. In this project, we introduce a “Sublinear Computation Paradigm.” Toward this purpose, we first provide a survey of constant-time algorithms, which are the most investigated framework in this area, and then present our recent results on sublinear progressive algorithms. A sublinear progressive algorithm first outputs a temporary approximate solution in constant time, then gradually suggests better solutions in sublinear time, and finally finds the exact solution. We present Sublinear Progressive Algorithm Theory (SPA Theory, for short), which makes it possible to construct a sublinear progressive algorithm for any property that has a constant-time algorithm and an exact algorithm (an exponential-time one is allowed) without losing any computation time in the big-O sense.
Masaki TAKANASHI Shu-ichi SATO Kentaro INDO Nozomu NISHIHARA Hiroki HAYASHI Toru SUZUKI
The prediction of the malfunction timing of wind turbines is essential for maintaining the high profitability of the wind power generation industry. Studies have been conducted on machine learning methods that use condition monitoring system data, such as vibration data, and supervisory control and data acquisition (SCADA) data to detect and predict anomalies in wind turbines automatically. Autoencoder-based techniques that use unsupervised learning where the anomaly pattern is unknown have attracted significant interest in the area of anomaly detection and prediction. In particular, vibration data are considered useful because they include the changes that occur in the early stages of a malfunction. However, when autoencoder-based techniques are applied for prediction purposes, in the training process it is difficult to distinguish the difference between operating and non-operating condition data, which leads to the degradation of the prediction performance. In this letter, we propose a method in which both vibration data and SCADA data are utilized to improve the prediction performance, namely, a method that uses a power curve composed of active power and wind speed. We evaluated the method's performance using vibration and SCADA data obtained from an actual wind farm.
TaiYu CHENG Yutaka MASUDA Jun NAGAYAMA Yoichi MOMIYAMA Jun CHEN Masanori HASHIMOTO
Reducing power consumption is crucial for making industrial designs, such as mobile SoCs, competitive. Voltage scaling (VS) is the classical yet most effective technique that contributes to quadratic power reduction. A recent design technique called activation-aware slack assignment (ASA) enhances voltage scaling by allocating the timing margin of critical paths with a stochastic mean-time-to-failure (MTTF) analysis. However, such stochastic treatment of timing errors is accepted only in limited application domains, such as image processing. This paper proposes a design optimization methodology that achieves a mode-wise voltage-scalable (MWVS) design guaranteeing no timing errors in each mode operation. This work formulates the MWVS design as an optimization problem that minimizes the overall power consumption, explicitly considering each mode duration, the achievable voltage lowering and the accompanying circuit overhead, and explores the solution space with the downhill simplex algorithm, which does not require numerical differentiation or frequent objective function evaluations. To obtain a solution, i.e., a design, in the optimization process, we exploit the multi-corner multi-mode design flow in a commercial tool to perform mode-wise ASA with sets of false paths dedicated to individual modes. We applied the proposed design methodology to a RISC-V design. Experimental results show that the proposed methodology saves 13% to 20% more power compared to the conventional VS approach and attains 8% to 15% gain from the conventional single-mode ASA. We also found that cycle-by-cycle fine-grained false path identification reduced leakage power by 31% to 42%.
Taito MANABE Taichi KATAYAMA Yuichiro SHIBATA
Line detection is a fundamental image processing technique with various applications in the field of computer vision. For example, lane keeping, which is required to realize autonomous vehicles, can be implemented based on line detection. For such purposes, however, low detection latency and low power consumption are essential. Hardware-based stream processing is considered an effective way to achieve these properties since it eliminates the need to store the whole frame in energy-consuming external memory. In addition, adopting FPGAs enables us to retain the flexibility of software processing. The line segment detector (LSD) is an algorithm based on the intensity gradient, and performs better than the well-known Hough transform in terms of processing speed and accuracy. However, implementing the original LSD on FPGAs as a pipeline structure is difficult, mainly because of its iterative region growing approach. Therefore, we propose a simple and stream-friendly line segment detection algorithm based on the concept of LSD. The whole system is implemented on a Xilinx Zynq-7000 XC7Z020-1CLG400C FPGA without any external memory. Evaluation results reveal that the implemented system is able to detect line segments successfully and is compact, using 7.5% of the Block RAM and less than 7.0% of the other resources, while maintaining 60 fps throughput for VGA videos. It is also shown that the system is power-efficient compared to software processing on CPUs.
Hikaru TSUCHIDA Takashi NISHIDE Yusaku MAEDA
Multiparty computation (MPC) is a technology that computes an arbitrary function represented as a circuit without revealing input values. Typical MPC uses secret sharing (SS) schemes, garbled circuits (GC), and homomorphic encryption (HE). These cryptographic technologies involve trade-offs among computation cost, communication cost, and the type of computable circuit. Hence, the optimal choice depends on the computing resources, communication environment, and function related to applications. Private decision tree evaluation (PDTE) is one of the important applications of secure computation. There exist several PDTE protocols with constant communication rounds using GC, HE, and SS-MPC over the field. However, to the best of our knowledge, PDTE protocols with constant communication rounds using MPC based on SS over the ring (requiring only lower computation costs and communication complexity) are non-trivial and still missing. In this paper, we propose a PDTE protocol based on a three-party computation (3PC) protocol over the ring with one corruption. We also propose another three-party PDTE protocol over the field with one corruption that is more efficient than the naive construction.
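To illustrate what secret sharing over a ring means in a three-party setting, the sketch below implements replicated additive sharing over the integers modulo 2^32 with local addition. The abstract does not say which 3PC scheme the protocol builds on, so the replicated layout, the 32-bit ring size, and the function names are assumptions; the PDTE protocol itself is not shown.

```python
import secrets

RING_BITS = 32           # assumed ring size: integers modulo 2**32
MODULUS = 1 << RING_BITS


def share(x):
    """Split x into three additive parts; party i receives parts i and i+1."""
    x1 = secrets.randbelow(MODULUS)
    x2 = secrets.randbelow(MODULUS)
    x3 = (x - x1 - x2) % MODULUS
    parts = [x1, x2, x3]
    return [(parts[i], parts[(i + 1) % 3]) for i in range(3)]


def add_shares(a, b):
    """Add two shared values; every party works only on its own pair."""
    return [((a[i][0] + b[i][0]) % MODULUS, (a[i][1] + b[i][1]) % MODULUS)
            for i in range(3)]


def reconstruct(shares):
    """Recombine using the first part held by each of the three parties."""
    return sum(s[0] for s in shares) % MODULUS


# Sum two secrets without any single party seeing either of them.
total = reconstruct(add_shares(share(20), share(22)))
```

Any single party holds two of the three parts, which are uniformly random on their own, so one corrupted party learns nothing about the secret. Arithmetic modulo 2^32 also maps directly onto native machine words, which is one reason ring-based schemes tend to have lower computation cost than field-based ones.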