IEICE global.ieice.org Site

Keyword Search Result

[Keyword] (42807hit)

3601-3620hit(42807hit)

Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition
Ryo MASUMURA Taichi ASAMI Takanobu OBA Sumitaka SAKAUCHI Akinori ITO

PAPER-Speech and Hearing

Publicized:
2019/09/25
Vol:
E102-D No:12
Page(s):
2557-2567
This paper demonstrates latent word recurrent neural network language models (LW-RNN-LMs) for enhancing automatic speech recognition (ASR). LW-RNN-LMs are constructed so as to pick up advantages in both recurrent neural network language models (RNN-LMs) and latent word language models (LW-LMs). The RNN-LMs can capture long-range context information and offer strong performance, and the LW-LMs are robust for out-of-domain tasks based on the latent word space modeling. However, the RNN-LMs cannot explicitly capture hidden relationships behind observed words since a concept of a latent variable space is not present. In addition, the LW-LMs cannot take into account long-range relationships between latent words. Our idea is to combine RNN-LM and LW-LM so as to compensate individual disadvantages. The LW-RNN-LMs can support both a latent variable space modeling as well as LW-LMs and a long-range relationship modeling as well as RNN-LMs at the same time. From the viewpoint of RNN-LMs, LW-RNN-LM can be considered as a soft class RNN-LM with a vast latent variable space. In contrast, from the viewpoint of LW-LMs, LW-RNN-LM can be considered as an LW-LM that uses the RNN structure for latent variable modeling instead of an n-gram structure. This paper also details a parameter inference method and two kinds of implementation methods, an n-gram approximation and a Viterbi approximation, for introducing the LW-LM to ASR. Our experiments show effectiveness of LW-RNN-LMs on a perplexity evaluation for the Penn Treebank corpus and an ASR evaluation for Japanese spontaneous speech tasks.
Attentive Sequences Recurrent Network for Social Relation Recognition from Video Open Access
Jinna LV Bin WU Yunlei ZHANG Yunpeng XIAO

PAPER-Image Recognition, Computer Vision

Publicized:
2019/09/02
Vol:
E102-D No:12
Page(s):
2568-2576
Recently, social relation analysis has received increasing attention, on data ranging from text to images. However, social relation analysis from video is an important problem that is lacking in the current literature. There are still some challenges: 1) it is hard to learn a satisfactory mapping function from low-level pixels to high-level social relation space; 2) it is unclear how to efficiently select the most relevant information from noisy and unsegmented video. In this paper, we present an Attentive Sequences Recurrent Network model, called ASRN, to deal with the above challenges. First, in order to explore multiple clues, we design a Multiple Feature Attention (MFA) mechanism to fuse multiple visual features (i.e., image, motion, body, and face). In this manner, we can generate an appropriate mapping function from low-level video pixels to high-level social relation space. Second, we design a sequence recurrent network based on a Global and Local Attention (GLA) mechanism. Specifically, an attention mechanism is used in GLA to integrate the global feature with local sequence features to select more relevant sequences for the recognition task. Therefore, the GLA module can better deal with noisy and unsegmented video. Finally, extensive experiments on the SRIV dataset demonstrate the performance of our ASRN model.
Attention-Guided Spatial Transformer Networks for Fine-Grained Visual Recognition
Dichao LIU Yu WANG Jien KATO

PAPER-Image Recognition, Computer Vision

Publicized:
2019/09/04
Vol:
E102-D No:12
Page(s):
2577-2586
The aim of this paper is to propose effective attentional regions for fine-grained visual recognition. Based on the Spatial Transformers' capability of spatial manipulation within networks, we propose an extension model, the Attention-Guided Spatial Transformer Networks (AG-STNs). This model can guide the Spatial Transformers with hard-coded attentional regions at first. Then such guidance can be turned off, and the network model will adjust the region learning in terms of the location and scale. Such adjustment is conditioned to the classification loss so that it is actually optimized for better recognition results. With this model, we are able to successfully capture detailed attentional information. Also, the AG-STNs are able to capture attentional information in multiple levels, and different levels of attentional information are complementary to each other in our experiments. A fusion of them brings better results.
A Stackelberg Game-Theoretic Solution to Win-Win Situation: A Presale Mechanism in Spectrum Market
Wei BAI Yuli ZHANG Meng WANG Jin CHEN Han JIANG Zhan GAO Donglin JIAO

LETTER-Information Network

Publicized:
2019/08/28
Vol:
E102-D No:12
Page(s):
2607-2610
This paper investigates the spectrum allocation problem. Under the current spectrum management mode, a large amount of spectrum resource is wasted due to the uncertainty of user demand. To reduce the impact of this uncertainty, a presale mechanism is designed based on a spectrum pool. In this mechanism, the spectrum manager provides spectrum resource at a favorable price for presale, aiming to share with the user the risk caused by the uncertainty of demand. Because of the hierarchical characteristic, we build a spectrum market Stackelberg game, in which the manager acts as leader and the user as follower. A proof of the uniqueness and optimality of the Stackelberg Equilibrium is then given. Simulation results show that the presale mechanism can promote profits for both sides and reduce temporary scheduling.
An Evolutionary Approach Based on Symmetric Nonnegative Matrix Factorization for Community Detection in Dynamic Networks
Yu PAN Guyu HU Zhisong PAN Shuaihui WANG Dongsheng SHAO

LETTER-Artificial Intelligence, Data Mining

Publicized:
2019/09/02
Vol:
E102-D No:12
Page(s):
2619-2623
Detecting community structures and analyzing temporal evolution in dynamic networks are challenging tasks to explore the inherent characteristics of the complex networks. In this paper, we propose a semi-supervised evolutionary clustering model based on symmetric nonnegative matrix factorization to detect communities in dynamic networks, named sEC-SNMF. We use the results of community partition at the previous time step as the priori information to modify the current network topology, then smooth-out the evolution of the communities and reduce the impact of noise. Furthermore, we introduce a community transition probability matrix to track and analyze the temporal evolutions. Different from previous algorithms, our approach does not need to know the number of communities in advance and can deal with the situation in which the number of communities and nodes varies over time. Extensive experiments on synthetic datasets demonstrate that the proposed method is competitive and has a superior performance.
Acceleration Using Upper and Lower Smoothing Filters for Generating Oil-Film-Like Images
Toru HIRAOKA Kiichi URAHAMA

LETTER-Computer Graphics

Publicized:
2019/09/10
Vol:
E102-D No:12
Page(s):
2642-2645
A non-photorealistic rendering method has been proposed for generating oil-film-like images from photographic images using a bilateral infra-envelope filter. The conventional method has the disadvantage that it takes a long time to process. We propose a method for generating oil-film-like images that can be processed faster than the conventional method. The proposed method uses an iterative process with upper and lower smoothing filters. To verify the effectiveness of the proposed method, we conduct experiments using the Lenna image. The experiments show that the proposed method is faster than the conventional method.
Channel and Frequency Attention Module for Diverse Animal Sound Classification
Kyungdeuk KO Jaihyun PARK David K. HAN Hanseok KO

LETTER-Artificial Intelligence, Data Mining

Publicized:
2019/09/17
Vol:
E102-D No:12
Page(s):
2615-2618
In-class species classification based on animal sounds is a highly challenging task even with the latest deep learning techniques applied. The difficulty of distinguishing the species is further compounded when the number of species is large within the same class. This paper presents a novel approach for fine categorization of animal species based on their sounds by using pre-trained CNNs and a new self-attention module well-suited for acoustic signals. The proposed method is shown to be effective, as it achieves an average species accuracy of 98.37% and a minimum species accuracy of 94.38%, the highest among the competing baselines, which include CNNs without self-attention and CNNs with CBAM, FAM, and CFAM but without pre-training.
Two-Layer Near-Lossless HDR Coding Using Zero-Skip Quantization with Backward Compatibility to JPEG
Hiroyuki KOBAYASHI Osamu WATANABE Hitoshi KIYA

PAPER-Image

Vol:
E102-A No:12
Page(s):
1842-1848
We propose an efficient two-layer near-lossless coding method using an extended histogram packing technique with backward compatibility to the legacy JPEG standard. JPEG XT, which is the international standard for compressing HDR images, adopts a two-layer coding method for backward compatibility to the legacy JPEG standard. However, there are two problems with this two-layer coding method. One is that it does not exhibit better near-lossless performance than other methods for HDR image compression with a single-layer structure. The other is that appropriate values of the coding parameters may have to be determined for each input image to achieve good near-lossless compression performance with the two-layer coding method of JPEG XT. To solve these problems, we focus on a histogram-packing technique that takes into account the histogram sparseness of HDR images. We use zero-skip quantization, which is an extension of the histogram-packing technique proposed for lossless coding, to implement the proposed near-lossless coding method. The experimental results indicate that the proposed method not only exhibits better near-lossless compression performance than the two-layer coding method of JPEG XT, but also raises no issues regarding the combination of parameter values, without losing backward compatibility to the JPEG standard.
Fast Serial Iterative Decoding Algorithm for Zigzag Decodable Fountain Codes by Efficient Scheduling
Yoshihiro MURAYAMA Takayuki NOZAKI

PAPER-Erasure Correction

Vol:
E102-A No:12
Page(s):
1600-1610
Fountain codes are erasure correcting codes that realize reliable communication systems for multicast on the Internet. The zigzag decodable fountain (ZDF) codes are a generalization of the Raptor codes that applies a shift operation to generate the output packets. The ZDF codes are decoded by a two-stage iterative decoding algorithm, which combines the packet-wise peeling algorithm and the bit-wise peeling algorithm. By the bit-wise peeling algorithm and shift operation, ZDF codes outperform Raptor codes under iterative decoding in terms of decoding erasure rates and overheads. However, the bit-wise peeling algorithm requires a long decoding time. This paper proposes fast bit-wise decoding algorithms for the ZDF codes. Simulation results show that the proposed algorithm drastically reduces the decoding time compared with the previous algorithm.
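As a rough illustration of the packet-wise peeling idea mentioned in the abstract above, the following is a minimal sketch for a plain XOR-based fountain code. It is not the ZDF decoder from the paper: it has no shift operation and no bit-wise stage, and the function name `peel` and the packet representation (a set of source indices plus an integer XOR value) are assumptions made for this example only.

```python
def peel(num_source, packets):
    """Packet-wise peeling for a plain XOR fountain code (illustrative only).

    packets: list of (source_indices, value), where value is the XOR of the
    source symbols (ints) named by source_indices. Returns a list with the
    recovered symbols, or None where a symbol could not be recovered.
    """
    recovered = [None] * num_source
    work = [(set(idx), val) for idx, val in packets]
    progress = True
    while progress:
        progress = False
        # Step 1: every degree-1 packet directly reveals one source symbol.
        for idx, val in work:
            if len(idx) == 1:
                (i,) = idx
                if recovered[i] is None:
                    recovered[i] = val
                    progress = True
        # Step 2: XOR the recovered symbols out of the remaining packets.
        next_work = []
        for idx, val in work:
            for i in list(idx):
                if recovered[i] is not None:
                    idx = idx - {i}
                    val ^= recovered[i]
            if idx:
                next_work.append((idx, val))
        work = next_work
    return recovered


# Hypothetical example: three source symbols and three received packets.
source = [5, 9, 12]
received = [({0}, 5), ({0, 1}, 5 ^ 9), ({1, 2}, 9 ^ 12)]
decoded = peel(3, received)
```

Decoding stalls, leaving `None` entries, whenever no degree-1 packet remains; the abstract describes ZDF codes as using the shift operation and bit-wise peeling to outperform Raptor codes under iterative decoding.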
Density Optimization for Analog Layout Based on Transistor-Array
Chao GENG Bo LIU Shigetoshi NAKATAKE

PAPER

Vol:
E102-A No:12
Page(s):
1720-1730
In integrated circuit design at advanced technology nodes, layout density uniformity significantly influences manufacturability due to CMP variability. In analog design especially, designers struggle to pass density checks since there are few useful tools. To tackle this issue, we focus on a transistor-array (TA)-style analog layout and propose a density optimization algorithm consistent with complicated design rules. Based on the TA style, we introduce a density-aware layout format to explicitly control the layout pattern density, and provide a mathematical optimization approach. Hence, a design flow incorporating our density optimization can drastically reduce the design time with fewer iterations. In a design case of an OPAMP layout in a 65nm CMOS process, the result demonstrates that the proposed approach achieves more than 48× speed-up compared with conventional manual layout, while showing good circuit performance in the post-layout simulation.
Selective Use of Stitch-Induced Via for V0 Mask Reduction: Standard Cell Design and Placement Optimization
Daijoon HYUN Younggwang JUNG Youngsoo SHIN

PAPER

Vol:
E102-A No:12
Page(s):
1711-1719
Multiple patterning lithography allows fine patterns beyond lithography limit, but it suffers from a large process cost. In this paper, we address a method to reduce the number of V0 masks; it consists of two sub-problems. First, stitch-induced via (SIV) is introduced to reduce the number of V0 masks. It involves the redesign of standard cells to replace some vias in V0 layer with SIVs, such that the remaining vias can be assigned to the reduced masks. Since SIV formation requires metal stitches in different masks, SIV replacement and metal mask assignment should be solved simultaneously. This sub-problem is formulated as integer linear programming (ILP). In the second sub-problem, inter-row via conflict aware detailed placement is addressed. Single row placement optimization is performed for each row to remove metal and inter-row via conflicts, while minimizing cell displacements. Since it is time consuming to consider many cell operations at once, we apply a few operations iteratively, where different operations are applied to each iteration and to each cell depending on whether the cell has a conflict in the previous iteration. Remaining conflicts are then removed by mapping conflict cells to white spaces. To this end, we minimize the number of cells to move and maximize the number of large white spaces before mapping. Experimental results demonstrate that the cell placement with two V0 masks is completed by proposed methods, with 7 times speedup and 21% reduction in total cell displacement, compared to conventional detailed placement.
FOREWORD Open Access
Michihiro KOIBUCHI

FOREWORD

Vol:
E102-D No:12
Page(s):
2280-2280
Rhythm Tap Technique for Cross-Device Interaction Enabling Uniform Operation for Various Devices Open Access
Hirohito SHIBATA Junko ICHINO Shun'ichi TANO Tomonori HASHIYAMA

PAPER-Human-computer Interaction

Publicized:
2019/09/19
Vol:
E102-D No:12
Page(s):
2515-2523
This paper proposes a novel interaction technique to transfer data across various types of digital devices in a uniform manner and to allow specifying what kind of data should be sent. In our framework, when users tap multiple devices rhythmically, data corresponding to the rhythm (transfer type) are transferred from the device tapped first (source device) to the other (target device). It is easy to operate, applicable to a wide range of devices, and extensible in the sense that we can adopt new transfer types by adding new rhythms. A subjective evaluation and a simulation suggest that our approach is feasible. We also discuss suggestions and limitations for implementing the technique.
Image Regularization with Total Variation and Optimized Morphological Gradient Priors
Shoya OOHARA Mitsuji MUNEYASU Soh YOSHIDA Makoto NAKASHIZUKA

LETTER-Image

Vol:
E102-A No:12
Page(s):
1920-1924
For image restoration, an image prior that is obtained from the morphological gradient has been proposed. In the field of mathematical morphology, the optimization of the structuring element (SE) used for this morphological gradient using a genetic algorithm (GA) has also been proposed. In this paper, we introduce a new image prior that is the sum of the morphological gradients and total variation for an image restoration problem to improve the restoration accuracy. The proposed image prior makes it possible to almost match the fitness to a quantitative evaluation such as the mean square error. It also solves the problem of the artifact due to the unsuitability of the SE for the image. An experiment shows the effectiveness of the proposed image restoration method.
Maximizing Lifetime of Data-Gathering Sensor Trees in Wireless Sensor Networks
Hiroshi MATSUURA

PAPER-Network

Publicized:
2019/06/10
Vol:
E102-B No:12
Page(s):
2205-2217
Sensor-data gathering using multi-hop connections in a wireless sensor network is being widely used, and a tree topology for data gathering is considered promising because it eases data aggregation. Therefore, many sensor-tree-creation algorithms have been proposed. The sensors in a tree, however, generally run on batteries, so long tree lifetime is one of the most important factors in collecting sensor data from a tree over a long period. It has been proven that creating the longest-lifetime tree is a non-deterministic-polynomial complete problem; thus, all previously proposed sensor-tree-creation algorithms are heuristic. To evaluate a heuristic algorithm, the time complexity of the algorithm is very important, as well as the quantitative evaluation of the lifetimes of the created trees and algorithm speed. This paper proposes an algorithm called assured switching with accurate graph optimization (ASAGAO) that can create a sensor tree with a much longer lifetime much faster than other sensor-tree-creation algorithms. In addition, it has much smaller time complexity.
Hadamard-Type Matrices on Finite Fields and Complete Complementary Codes
Tetsuya KOJIMA

PAPER-Sequences

Vol:
E102-A No:12
Page(s):
1651-1658
A Hadamard matrix is defined as a square matrix in which every component is -1 or +1 and any pair of rows is mutually orthogonal. In this work, we consider a similar matrix over the finite field GF(p), where p is an odd prime. In such a matrix, every component is one of the integers in GF(p)\{0}, that is, {1,2,...,p-1}. All additions and multiplications are performed modulo p. In this paper, a method to generate such matrices is proposed. In addition, the paper includes applications to generating n-shift orthogonal sequences and complete complementary codes. The generated complete complementary code is a family of multi-valued sequences over GF(p)\{0}, where the number of sequence sets, the number of sequences in each sequence set, and the sequence length depend on the various divisors of p-1. Such complete complementary codes with various parameters have not been proposed in previous studies.
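The definition in the abstract above can be checked on small examples. The sketch below builds a classical ±1 Hadamard matrix with the well-known Sylvester construction and tests row orthogonality. The optional `modulus` argument assumes that orthogonality over GF(p) means an inner product congruent to 0 modulo p; that reading, the helper names, and the 2×2 example are assumptions for illustration and are not the paper's generation method.

```python
def sylvester_hadamard(k):
    """Return the 2^k x 2^k Sylvester Hadamard matrix as a list of rows."""
    h = [[1]]
    for _ in range(k):
        # Block construction [[H, H], [H, -H]] doubles the order each step.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def rows_orthogonal(m, modulus=None):
    """True if every pair of distinct rows has inner product 0.

    If modulus is given, the inner product is reduced modulo it first
    (assumed meaning of orthogonality over GF(p)).
    """
    for i in range(len(m)):
        for j in range(i + 1, len(m)):
            dot = sum(a * b for a, b in zip(m[i], m[j]))
            if modulus is not None:
                dot %= modulus
            if dot != 0:
                return False
    return True


h4 = sylvester_hadamard(2)   # classical 4x4 matrix with entries in {-1, +1}
m3 = [[1, 1], [1, 2]]        # entries in {1, 2}, i.e. GF(3) without 0
```

Here `rows_orthogonal(h4)` holds over the integers, while `m3` is orthogonal only when the inner product 1·1 + 1·2 = 3 is reduced modulo 3.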
A New Combiner for Key Encapsulation Mechanisms
Goichiro HANAOKA Takahiro MATSUDA Jacob C. N. SCHULDT

PAPER-Cryptography

Vol:
E102-A No:12
Page(s):
1668-1675
Key encapsulation mechanism (KEM) combiners, recently formalized by Giacon, Heuer, and Poettering (PKC'18), enable hedging against insecure KEMs or weak parameter choices by combining ingredient KEMs into a single KEM that remains secure assuming just one of the underlying ingredient KEMs is secure. This seems particularly relevant when considering quantum-resistant KEMs, which are often based on arguably less well-understood hardness assumptions and parameter choices. We propose a new simple KEM combiner based on a one-time secure message authentication code (MAC) and a two-time correlated input secure hash. Instantiating the correlated input secure hash with a t-wise independent hash for an appropriate value of t yields a KEM combiner based on a strictly weaker additional primitive than the standard model construction of Giacon et al. and furthermore removes the need to do n full passes over the encapsulation, where n is the number of ingredient KEMs, which Giacon et al. highlight as a disadvantage of their scheme. However, unlike Giacon et al., our construction requires the public key of the combined KEM to include a hash key, and furthermore requires a MAC tag to be added to the encapsulation of the combined KEM.
On the Distribution of p-Error Linear Complexity of p-Ary Sequences with Period pⁿ
Miao TANG Juxiang WANG Minjia SHI Jing LIANG

LETTER-Fundamentals of Information Systems

Publicized:
2019/09/02
Vol:
E102-D No:12
Page(s):
2595-2598
Linear complexity and the k-error linear complexity of periodic sequences are important security indices of stream cipher systems. This paper focuses on the distribution of the p-error linear complexity of p-ary sequences with period pⁿ. For p-ary sequences of period pⁿ with linear complexity pⁿ-p+1, n≥1, we present all possible values of the p-error linear complexity, and derive exact formulas to count the number of sequences with any given p-error linear complexity.
On the Performance Analysis of SPHINCS⁺ Verification
Tae Gu KANG Jinwoo LEE Junyeng KIM Dae Hyun YUM

LETTER-Information Network

Publicized:
2019/09/20
Vol:
E102-D No:12
Page(s):
2603-2606
SPHINCS+, an updated version of SPHINCS, is a post-quantum hash-based signature scheme submitted to the NIST post-quantum cryptography standardization project. To evaluate its performance, SPHINCS+ gives the theoretical number of function calls and the actual runtime of a reference implementation. We show that the theoretical number of function calls for SPHINCS+ verification is inconsistent with the runtime and then present the correct number of function calls.
Implementation and Area Optimization of LUT6 Based Convolution Structure on FPGA
Huangtao WU Wenjin HUANG Rui CHEN Yihua HUANG

LETTER

Vol:
E102-A No:12
Page(s):
1813-1815
To implement parallel acceleration of the convolution operation of Convolutional Neural Networks (CNNs) on a field programmable gate array (FPGA), large quantities of logic resources are consumed, especially DSP cores. Many previous studies fail to strike a good balance between DSP and LUT6 usage. For better resource efficiency, a typical convolution structure is implemented with LUT6s in this paper. In addition, a novel convolution structure is proposed to further reduce the LUT6 resource consumption by modifying the typical convolution structure. The equations to evaluate the LUT6 resource consumption of both structures are presented and validated. The theoretical evaluation and experimental results show that the novel structure can save 3.5-8% of LUT6s compared with the typical structure.
