The search functionality is under construction.

Keyword Search Result

[Keyword] ATI (18690 hits)


  • Triple Loss Based Framework for Generalized Zero-Shot Learning

    Yaying SHEN  Qun LI  Ding XU  Ziyi ZHANG  Rui YANG  

    LETTER-Image Recognition, Computer Vision

    E105-D No:4

    A triple loss based framework for generalized zero-shot learning is presented in this letter. The approach learns a shared latent space for image features and attributes by using aligned variational autoencoders and variants of triplet loss. Then we train a classifier in the latent space. The experimental results demonstrate that the proposed framework achieves great improvement.
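The abstract above mentions variants of triplet loss without giving their form. For reference only, here is a minimal sketch of the standard triplet loss on plain Python lists. The function names, the Euclidean distance, and the margin value are assumptions made for illustration, not the letter's actual formulation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

In a latent-space setting, the anchor and positive would typically be embeddings of the same class and the negative an embedding of a different class.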

  • Speaker-Independent Audio-Visual Speech Separation Based on Transformer in Multi-Talker Environments

    Jing WANG  Yiyu LUO  Weiming YI  Xiang XIE  

    PAPER-Speech and Hearing

    E105-D No:4

    Speech separation is the task of extracting target speech while suppressing background interference components. In applications like video telephones, visual information about the target speaker is available, which can be leveraged for multi-speaker speech separation. Most previous multi-speaker separation methods are based on convolutional or recurrent neural networks. Recently, Transformer-based Seq2Seq models have achieved state-of-the-art performance in various tasks, such as neural machine translation (NMT) and automatic speech recognition (ASR). The Transformer has shown an advantage in modeling audio-visual temporal context by multi-head attention blocks through explicitly assigning attention weights. Besides, the Transformer doesn't have any recurrent sub-networks, thus supporting parallelization of sequence computation. In this paper, we propose a novel speaker-independent audio-visual speech separation method based on the Transformer, which can be flexibly applied to an unknown number and identity of speakers. The model receives both audio and visual streams, including a noisy spectrogram and speaker lip embeddings, and predicts a complex time-frequency mask for the corresponding target speaker. The model is made up of three main components: an audio encoder, a visual encoder and a Transformer-based mask generator. Two different encoder structures are investigated and compared: ResNet-based and Transformer-based. The performance of the proposed method is evaluated in terms of source separation and speech quality metrics. The experimental results on the benchmark GRID dataset show the effectiveness of the method on the speaker-independent separation task in multi-talker environments. The model generalizes well to unseen identities of speakers and noise types. Though only trained on 2-speaker mixtures, the model achieves reasonable performance when tested on 2-speaker and 3-speaker mixtures. Besides, the model still shows an advantage compared with previous audio-visual speech separation works.

  • Use of Cyclic-Delay Diversity (CDD) with Modified Channel Estimation for FER Improvement in OFDM Downlink

    Masafumi MORIYAMA  Kenichi TAKIZAWA  Hayato TEZUKA  Fumihide KOJIMA  

    PAPER-Wireless Communication Technologies

    E105-B No:3

    High reliability is required, even in Internet of things (IoT) communications, which are sometimes used for crucial control such as automatic driving devices. Hence, both the uplink (UL) and downlink (DL) communication quality must be improved in the physical layer. In this study, we focus on the communication quality of broadcast DL, which is configured using orthogonal frequency-division multiplexing (OFDM) as a multiplexing scheme and turbo code as forward error correction (FEC). To reduce the frame-error rate (FER) in the DL, we consider two transmit-diversity (TD) techniques that use space-time block code (STBC) or cyclic-delay diversity (CDD). The purpose of this paper is to evaluate the TD performance and to enhance the FER performance of CDD up to that of STBC. To achieve this goal, a channel estimation method is proposed to improve FER for CDD. We first evaluate the FER performance of STBC and CDD by performing computer simulations and conducting hardware tests using a fading emulator. Then, we conduct field experiments in the 2.5 GHz band. From the results of these evaluations, we confirm that STBC and CDD improved FER compared with single-antenna transmission. CDD with the proposed channel estimation method achieved almost the same performance as STBC by accurately estimating the channel frequency response (CFR) and appropriately adjusting the amount of cyclic shift (ACS). When moving a receiving device around Yokosuka Research Park, STBC and CDD, using spatial diversity with omni antennas for TD, improved the FER from 3.84×10⁻² to 1.42×10⁻² and 1.19×10⁻², respectively.

  • A Compact and High-Resolution CMOS Switch-Type Phase Shifter Achieving 0.4-dB RMS Gain Error for 5G n260 Band

    Jian PANG  Xueting LUO  Zheng LI  Atsushi SHIRANE  Kenichi OKADA  

    PAPER-Microwaves, Millimeter-Waves

    E105-C No:3

    This paper introduces a high-resolution and compact CMOS switch-type phase shifter (STPS) for the 5th generation mobile network (5G) n260 band. In this work, four coarse phase-shifting stages and a high-resolution tuning stage are included. The coarse stages, based on the bridged-T topology, are capable of providing 202.5° phase coverage with a 22.5° tuning step. To further improve the phase-shifting resolution, a compact fine-tuning stage covering 23° is also integrated with the coarse stages. Sub-degree phase-shifting resolution is realized for supporting fine beam-steering and high-accuracy phase calibration in the 5G new radio. A simplified phase control algorithm and suppressed insertion loss can also be maintained by the proposed fine-tuning stage. In the measurement, the achieved RMS gain errors at 39 GHz are 0.1 dB and 0.4 dB for the coarse stages and fine stage, respectively. The achieved RMS phase errors at 39 GHz are 3.1° for the coarse stages and 0.1° for the fine stage. Within 37 GHz to 40 GHz, the measured return loss within all phase-tuning states is always better than -14 dB. The proposed phase shifter consumes a core area of only 0.12 mm² in a 65-nm CMOS process, which is area-efficient.

  • An Efficient Secure Division Protocol Using Approximate Multi-Bit Product and New Constant-Round Building Blocks Open Access

    Keitaro HIWATASHI  Satsuya OHATA  Koji NUIDA  

    PAPER-Cryptography and Information Security

    E105-A No:3

    Integer division is one of the most fundamental arithmetic operators and is ubiquitously used. However, the existing division protocols in secure multi-party computation (MPC) are inefficient and very complex, and this has been a barrier to applications of MPC such as secure machine learning. Some secure division protocols working in Z_{2^n} already exist. However, these existing protocols have two drawbacks: they need many communication rounds, and they need to use integers larger than the inputs and outputs. In this paper, we improve a secure division protocol in two ways. First, we construct a new protocol using only integers of the same size as the inputs and outputs. Second, we build efficient constant-round building blocks used as subprotocols in the division protocol. With these two improvements, the communication rounds of our division protocol are reduced to about 36% (87 rounds → 31 rounds) for 64-bit integers in comparison with the most efficient previous one.

  • GPGPU Implementation of Variational Bayesian Gaussian Mixture Models

    Hiroki NISHIMOTO  Renyuan ZHANG  Yasuhiko NAKASHIMA  

    PAPER-Fundamentals of Information Systems

    E105-D No:3

    An efficient implementation strategy for speeding up high-quality clustering algorithms is developed on the basis of general-purpose graphics processing units (GPGPUs) in this work. Among various clustering algorithms, a sophisticated Gaussian mixture model (GMM) that estimates parameters through the variational Bayesian (VB) mechanism is adopted due to its superior performance. Since the VB-GMM methodology is computation-hungry, the GPGPU is employed to carry out massive matrix computations. To efficiently migrate the conventional CPU-oriented schemes of VB-GMM onto GPGPU platforms, an entire migration flow with thirteen stages is presented in detail. The CPU-GPGPU co-operation scheme, execution re-ordering, and memory access optimization are proposed for optimizing the GPGPU utilization and maximizing the clustering speed. Five types of real-world applications along with relevant data sets are introduced for the cross-validation. From the experimental results, the feasibility of implementing the VB-GMM algorithm on a GPGPU is verified with practical benefits. The proposed GPGPU migration achieves a maximum speedup of 192x. Furthermore, it succeeded in identifying the proper number of clusters, which is difficult to achieve with the EM algorithm.

  • Driver Status Monitoring System with Body Channel Communication Technique Using Conductive Thread Electrodes

    Beomjin YUK  Byeongseol KIM  Soohyun YOON  Seungbeom CHOI  Joonsung BAE  

    PAPER-Wireless Communication Technologies

    E105-B No:3

    This paper presents a driver status monitoring (DSM) system with body channel communication (BCC) technology to acquire the driver's physiological condition. Specifically, a conductive thread, the receiving electrode, is sewn to the surface of the seat so that the acquired signal can be continuously detected. As a signal transmission medium, body channel characteristics using the conductive thread electrode were investigated according to the driver's pose and the material of the driver's pants. Based on this, a BCC transceiver was implemented using an analog frequency modulation (FM) scheme to minimize the additional circuitry and system cost. We analyzed the heart rate variability (HRV) from the driver's electrocardiogram (ECG) and displayed the heart rate and Root Mean Square of Successive Differences (RMSSD) values together with the ECG waveform in real-time. A prototype of the DSM system with commercial-off-the-shelf (COTS) technology was implemented and tested. We verified that the proposed approach was robust to the driver's movements, showing the feasibility and validity of the DSM with BCC technology using a conductive thread electrode.
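The abstract above reports heart rate and RMSSD values derived from the ECG. As a rough illustration of those two quantities, the sketch below computes them from a list of RR intervals (the times between successive heartbeats) in milliseconds. The function names and the millisecond input format are assumptions; the paper's own signal-processing pipeline is not described in the abstract.

```python
import math


def rmssd(rr_ms):
    """Root Mean Square of Successive Differences of RR intervals (ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each interval and the one before it.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_heart_rate_bpm(rr_ms):
    """Mean heart rate in beats per minute from RR intervals (ms)."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    return 60000.0 / mean_rr


rr = [800, 810, 790, 800]  # hypothetical RR intervals in milliseconds
print(mean_heart_rate_bpm(rr), rmssd(rr))
```

A larger RMSSD indicates greater beat-to-beat variability over the analyzed window.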

  • Research on Dissections of a Net of a Cube into Nets of Cubes

    Tamami OKADA  Ryuhei UEHARA  


    E105-D No:3

    A rep-cube is a polyomino that is a net of a cube and can be divided into polyominoes, each of which can be folded into a cube. This notion, introduced in 2017, was inspired by the notions of polyomino and rep-tile, both of which were introduced by Solomon W. Golomb. A rep-cube is called regular if it can be divided into nets of the same area. A regular rep-cube is of order k if it is divided into k nets. Moreover, it is called uniform if it can be divided into congruent nets. In this paper, we focus on these special rep-cubes and solve several open problems.

  • User Identification and Channel Estimation by Iterative DNN-Based Decoder on Multiple-Access Fading Channel Open Access

    Lantian WEI  Shan LU  Hiroshi KAMABE  Jun CHENG  

    PAPER-Communication Theory and Signals

    E105-A No:3

    In the user identification (UI) scheme for a multiple-access fading channel based on a randomly generated (0, 1, -1)-signature code, previous studies used the signature code over a noisy multiple-access adder channel, and only the user state information (USI) was decoded by the signature decoder. However, by considering the communication model as a compressed sensing process, it is possible to estimate the channel coefficients while identifying users. In this study, to improve the efficiency of the decoding process, we propose an iterative deep neural network (DNN)-based decoder. Simulation results show that for the randomly generated (0, 1, -1)-signature code, the proposed DNN-based decoder requires less computing time than the classical signal recovery algorithm used in compressed sensing while achieving higher UI and channel estimation (CE) accuracies.

  • Spatial Vectors Effective for Nakagami-m Fading MIMO Channels Open Access

    Tatsumi KONISHI  Hiroyuki NAKANO  Yoshikazu YANO  Michihiro AOKI  

    LETTER-Communication Theory and Signals

    E105-A No:3

    This letter proposes a transmission scheme called spatial vector (SV), which is effective for Nakagami-m fading multiple-input multiple-output channels. First, the analytical error rate of SV is derived for Nakagami-m fading MIMO channels. Next, an example of SV called integer SV (ISV) is introduced. The error performance was evaluated over Nakagami-m fading from m = 1 to m = 50 and compared with spatial modulation (SM), enhanced SM, and quadrature SM. The results show that for m > 1, ISV outperforms the SM schemes and is robust to m variations.

  • Adversarial Scan Attack against Scan Matching Algorithm for Pose Estimation in LiDAR-Based SLAM Open Access

    Kota YOSHIDA  Masaya HOJO  Takeshi FUJINO  


    E105-A No:3

    Autonomous robots are controlled using physical information acquired by various sensors. The sensors are susceptible to physical attacks, which tamper with the observed values and interfere with control of the autonomous robots. Recently, sensor spoofing attacks targeting subsequent algorithms that use sensor data have become a major threat. In this paper, we introduce a new attack against the LiDAR-based simultaneous localization and mapping (SLAM) algorithm. The attack uses an adversarial LiDAR scan to fool a pose graph and a generated map. The adversary calculates a falsification amount for deceiving pose estimation and physically injects the spoofed distance against LiDAR. The falsification amount is calculated by a gradient method against a cost function of the scan matching algorithm. The SLAM algorithm generates the wrong map from the deceived movement path estimated by scan matching. We evaluated our attack on two typical scan matching algorithms, iterative closest point (ICP) and normal distribution transform (NDT). Our experimental results show that SLAM can be fooled by tampering with the scan. Simple odometry sensor fusion is not a sufficient countermeasure. We argue that it is important to detect or prevent tampering with LiDAR scans and to notice inconsistencies in sensors caused by physical attacks.

  • A Polynomial Delay Algorithm for Enumerating 2-Edge-Connected Induced Subgraphs

    Taishu ITO  Yusuke SANO  Katsuhisa YAMANAKA  Takashi HIRAYAMA  


    E105-D No:3

    The problem of enumerating connected induced subgraphs of a given graph is classical and well studied. It is known that connected induced subgraphs can be enumerated in constant time for each subgraph. In this paper, we focus on highly connected induced subgraphs. The most common concept of connectivity on graphs is vertex connectivity. For vertex connectivity, some enumeration problem settings and enumeration algorithms have been proposed, such as k-vertex connected spanning subgraphs. Here, we focus on another major concept of graph connectivity, edge-connectivity. This is motivated by the problem of finding evacuation routes in road networks. In evacuation routes, edge-connectivity is important, since highly edge-connected subgraphs ensure multiple routes between two vertices. We consider the problem of enumerating 2-edge-connected induced subgraphs of a given graph. We present an algorithm that enumerates 2-edge-connected induced subgraphs of an input graph G with n vertices and m edges. Our algorithm enumerates all the 2-edge-connected induced subgraphs in O(n³m|S_G|) time, where S_G is the set of the 2-edge-connected induced subgraphs of G. Moreover, by slightly modifying the algorithm, we obtain an O(n³m)-delay enumeration algorithm for 2-edge-connected induced subgraphs.
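To make the object being enumerated concrete, the following brute-force sketch lists the 2-edge-connected induced subgraphs of a small graph. It is not the polynomial-delay algorithm of the paper: it simply tries every vertex subset, so it takes exponential time. The function names are hypothetical, and the sketch assumes a simple undirected graph and ignores subsets with fewer than two vertices. A subgraph counts as 2-edge-connected here if it is connected and stays connected after any single edge is removed.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Depth-first search check that `edges` connect all of `vertices`."""
    vertices = set(vertices)
    if not vertices:
        return False
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == vertices


def is_two_edge_connected(vertices, edges):
    """Connected, and still connected after deleting any one edge."""
    edges = list(edges)
    if len(vertices) < 2 or not is_connected(vertices, edges):
        return False
    return all(
        is_connected(vertices, edges[:i] + edges[i + 1:])
        for i in range(len(edges))
    )


def two_edge_connected_induced_subgraphs(vertices, edges):
    """Brute-force enumeration over all vertex subsets of size >= 2."""
    result = []
    for k in range(2, len(vertices) + 1):
        for subset in combinations(sorted(vertices), k):
            chosen = set(subset)
            # Induced subgraph: keep exactly the edges inside the subset.
            induced = [(u, v) for u, v in edges if u in chosen and v in chosen]
            if is_two_edge_connected(chosen, induced):
                result.append(subset)
    return result


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
print(two_edge_connected_induced_subgraphs(
    [0, 1, 2, 3], [(0, 1), (1, 2), (0, 2), (2, 3)]))
```

In the example graph only the triangle qualifies, because the edge to the pendant vertex is a bridge.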

  • Android Malware Detection Based on Functional Classification

    Wenhao FAN  Dong LIU  Fan WU  Bihua TANG  Yuan'an LIU  

    PAPER-Artificial Intelligence, Data Mining

    E105-D No:3

    The Android operating system holds a high share of the mobile terminal market, which promotes the rapid development of Android applications (apps). However, the emergence of Android malware greatly endangers the security of Android smartphone users. Existing research has proposed many methods for Android malware detection, but these methods do not utilize the apps' functional category information, so the strong similarity between benign apps in the same functional category is ignored. In this paper, we propose an Android malware detection scheme based on functional classification. The benign apps in the same functional category are more similar to each other, so we can use fewer features to detect malware and improve the detection accuracy within the same functional category. The aim of our scheme is to provide an automatic application functional classification method with high accuracy. We design an Android application functional classification method inspired by the hyperlink induced topic search (HITS) algorithm. Using the results of automatic classification, we further design a malware detection method based on app similarity in the same functional category. We use benign apps from the Google Play Store and malware apps from the Drebin malware set to evaluate our scheme. The experimental results show that our method can effectively improve the accuracy of malware detection.

  • Three-Stage Padding Configuration for Sparse Arrays with Larger Continuous Virtual Aperture and Increased Degrees of Freedom

    Abdul Hayee SHAIKH  Xiaoyu DANG  Imran A. KHOSO  Daqing HUANG  

    PAPER-Analog Signal Processing

    E105-A No:3

    A three-stage padding configuration providing a larger continuous virtual aperture and achieving more degrees of freedom (DOFs) for direction-of-arrival (DOA) estimation is presented. The improvement is realized by appropriately cascading three stages of an identical inter-element spacing. Each stage advantageously exhibits a continuous virtual array, which subsequently produces a hole-free resulting uniform linear array. The geometrical approach remains applicable to any existing sparse array structure with a hole-free coarray, as well as to those designed in the future. In addition to enlarging the continuous virtual aperture and DOFs, the proposed design offers flexibility so that it can be realized for any given number of antennas. Moreover, a special padding configuration is demonstrated, which further increases the number of continuous virtual sensors. The precise antenna locations and the number of continuous virtual positions are given by closed-form expressions. Experiments are carried out to demonstrate the effectiveness of the proposed configuration.

  • Polarity Classification of Social Media Feeds Using Incremental Learning — A Deep Learning Approach


    PAPER-Neural Networks and Bioengineering

    E105-A No:3

    Online feeds are streamed continuously in batches with varied polarities at varying times. The system handling the online feeds must be trained to classify all the varying polarities occurring dynamically. The polarity classification system designed for the online feeds must address two significant challenges: i) stability-plasticity, ii) category-proliferation. These challenges can be addressed using the technique of incremental learning, which learns new classes dynamically while retaining the previously learned knowledge. This paper proposes a new incremental learning methodology, ILOF (Incremental Learning of Online Feeds), to classify the feeds by adopting deep learning techniques such as RNN (Recurrent Neural Networks) and LSTM (Long Short-Term Memory) as well as ELM (Extreme Learning Machine) to address the above-stated problems. The proposed method creates a separate model for each batch using ELM and incrementally learns from the trained batches. The training of each batch avoids the retraining of old feeds, thus saving training time and memory space. The trained feeds can be discarded when a new batch of feeds arrives. Experiments are carried out using standard datasets comprising long feeds (IMDB, Sentiment140) and short feeds (Twitter, WhatsApp, and Twitter airline sentiment), and the proposed method showed positive results in terms of performance and accuracy.

  • A Localization Method Based on Partial Correlation Analysis for Dynamic Wireless Network Open Access

    Yuki HORIGUCHI  Yusuke ITO  Aohan LI  Mikio HASEGAWA  

    LETTER-Nonlinear Problems

    E105-A No:3

    Recent localization methods for wireless networks cannot be applied to dynamic networks with unknown topology. To solve this problem, we propose a localization method based on partial correlation analysis in this paper. We evaluate our proposed localization method in terms of accuracy, which shows that our proposed method can achieve high accuracy localization for dynamic networks with unknown topology.

  • Link Availability Prediction Based on Machine Learning for Opportunistic Networks in Oceans

    Lige GE  Shengming JIANG  Xiaowei WANG  Yanli XU  Ruoyu FENG  Zhichao ZHENG  

    LETTER-Reliability, Maintainability and Safety Analysis

    E105-A No:3

    Along with the fast development of the blue economy, wireless communication in oceans has received extensive attention in recent years, and opportunistic networks without any aid from fixed infrastructure or centralized management are expected to play an important role in such highly dynamic environments. Here, link prediction can help nodes to select proper links for data forwarding to reduce transmission failure. The existing prediction schemes are mainly based on analytical models with no adaptability, and consider relatively simple and small terrestrial wireless networks. In this paper, we propose a new link prediction algorithm based on machine learning, which is composed of an extractor of convolutional layers and an estimator of long short-term memory to extract useful representations of time-series data and identify effective long-term dependencies. The experiments show that the proposed scheme is more effective and flexible than the other link prediction schemes.

  • Design of a Linear Layer for a Block Cipher Based on Type-2 Generalized Feistel Network with 32 Branches

    Kosei SAKAMOTO  Kazuhiko MINEMATSU  Nao SHIBATA  Maki SHIGERI  Hiroyasu KUBO  Takanori ISOBE  


    E105-A No:3

    Despite more than 10 years of research on the linear layer of the Type-2 Generalized Feistel Network (Type-2 GFN), finding a good 32-branch permutation for Type-2 GFN is still a very hard task due to a huge search space. In terms of the diffusion property, Suzaki and Minematsu investigated the required number of rounds to achieve full diffusion when the branch number is up to 16. After that, Derbez et al. presented a class of 32-branch permutations that achieves 9-round full diffusion, and they proved that this is optimal. However, this class is not suitable for use in Type-2 GFN because it requires a large number of rounds to ensure a sufficient number of active S-boxes. In this paper, we present how to find a good class of 32-branch permutations for Type-2 GFN. To achieve this goal, we convert Type-2 GFN into an LBlock-like structure, and then we evaluate the diffusion property and the resistance against major attacks, such as differential, linear, impossible differential and integral attacks, by an MILP. As a result, we present a good class of 32-branch permutations that achieves 10-round full diffusion, ensures 66 differentially/linearly active S-boxes at 19 rounds, and has 18/20-round impossible differential/integral distinguishers, respectively. The 32-branch permutation used in WARP was chosen from this class.

  • The Ratio of the Desired Parameters of Deep Neural Networks

    Yasushi ESAKI  Yuta NAKAHARA  Toshiyasu MATSUSHIMA  

    LETTER-Neural Networks and Bioengineering

    E105-A No:3

    Some researchers have investigated how accurately a deep neural network approximates a function that shows a generating pattern of data. However, they have confirmed only whether at least one function close to the function showing a generating pattern exists in function classes of deep neural networks whose parameter values are changing. Therefore, we propose a new criterion to infer the approximation accuracy. Our new criterion shows the existence ratio of functions close to the function showing a generating pattern in the function classes. Moreover, we show with numerical simulations that, under the proposed criterion, a deep neural network with a larger number of layers approximates the function showing a generating pattern more accurately than one with a smaller number of layers.

  • Competent Triple Identification for Knowledge Graph Completion under the Open-World Assumption

    Esrat FARJANA  Natthawut KERTKEIDKACHORN  Ryutaro ICHISE  

    PAPER-Data Engineering, Web Information Systems

    E105-D No:3

    The usefulness and usability of existing knowledge graphs (KGs) are mostly limited because of the incompleteness of knowledge compared to the growing number of facts about the real world. Most existing ontology-based KG completion methods are based on the closed-world assumption, where KGs are fixed. In these methods, entities and relations are defined, and new entity information cannot be easily added. In contrast, under the open-world assumption, entities and relations are not previously defined. Thus there is a vast scope to find new entity information. Despite this, knowledge acquisition under the open-world assumption is challenging because most available knowledge is in a noisy unstructured text format. Nevertheless, Open Information Extraction (OpenIE) systems can extract triples, namely (head text; relation text; tail text), from raw text without any prespecified vocabulary. Such triples contain noisy information that is not essential for KGs. Therefore, to use such triples for the KG completion task, it is necessary to identify competent triples for KGs from the extracted triple set. Here, competent triples are the triples that can contribute new information to the existing KGs. In this paper, we propose the Competent Triple Identification (CTID) model for KGs. We also propose two types of feature, namely syntax- and semantic-based features, to identify competent triples from a triple set extracted by a state-of-the-art OpenIE system. We investigate both types of feature and test their effectiveness. It is found that the performance of the proposed features is about 20% better than that of the ReVerb system in identifying competent triples.
