The search functionality is under construction.

Keyword Search Result

[Keyword] SI(16314hit)

1441-1460hit(16314hit)

  • Multi-Scale Chroma n-Gram Indexing for Cover Song Identification

    Jin S. SEO  

     
    LETTER

      Publicized:
    2019/10/23
      Vol:
    E103-D No:1
      Page(s):
    59-62

    To enhance cover song identification accuracy on a large-size music archive, a song-level feature summarization method is proposed by using multi-scale representation. The chroma n-grams are extracted in multiple scales to cope with both global and local tempo changes. We derive an index from the extracted n-grams by clustering to reduce storage and computation for DB search. Experiments on widely used music datasets confirmed that the proposed method achieves state-of-the-art accuracy while reducing the cost of cover song search.

  • Measuring Semantic Similarity between Words Based on Multiple Relational Information

    Jianyong DUAN  Yuwei WU  Mingli WU  Hao WANG  

     
    PAPER-Natural Language Processing

      Publicized:
    2019/09/27
      Vol:
    E103-D No:1
      Page(s):
    163-169

    The similarity of words extracted from a rich text relation network is the main way to calculate semantic similarity. Complex relational information and text content in Wikipedia, Community Question Answering sites, and social networks provide an abundant corpus for semantic similarity calculation. However, most typical research has focused on only a single relationship. In this paper, we propose a semantic similarity calculation model that integrates multiple kinds of relational information and maps the multiple relationships to the same semantic space by learning a representing matrix and a semantic matrix, in order to improve the accuracy of semantic similarity calculation. In experiments, we confirm that the semantic calculation method that integrates many kinds of relationships improves the accuracy of semantic calculation compared with other semantic calculation methods.

  • Image Identification of Encrypted JPEG Images for Privacy-Preserving Photo Sharing Services

    Kenta IIDA  Hitoshi KIYA  

     
    PAPER

      Publicized:
    2019/10/25
      Vol:
    E103-D No:1
      Page(s):
    25-32

    We propose an image identification scheme for double-compressed encrypted JPEG images that aims to identify encrypted JPEG images that are generated from an original JPEG image. To store images without any visually sensitive information on photo sharing services, encrypted JPEG images are generated by using a block-scrambling-based encryption method that has been proposed for Encryption-then-Compression systems with JPEG compression. In addition, feature vectors robust against JPEG compression are extracted from encrypted JPEG images. The use of the image encryption and feature vectors allows us to identify encrypted images recompressed multiple times. Moreover, the proposed scheme is designed to identify images re-encrypted with different keys. The results of a simulation show that the identification performance of the scheme is high even when images are recompressed and re-encrypted.

  • On the Complementary Role of DNN Multi-Level Enhancement for Noisy Robust Speaker Recognition in an I-Vector Framework

    Xingyu ZHANG  Xia ZOU  Meng SUN  Penglong WU  Yimin WANG  Jun HE  

     
    LETTER-Speech and Hearing

      Vol:
    E103-A No:1
      Page(s):
    356-360

    In order to improve the noise robustness of automatic speaker recognition, many techniques for speech/feature enhancement have been explored by using deep neural networks (DNN). In this work, a DNN multi-level enhancement (DNN-ME), which consists of the stages of signal enhancement, cepstrum enhancement and i-vector enhancement, is proposed for text-independent speaker recognition. Given the fact that these enhancement methods are applied in different stages of the speaker recognition pipeline, it is worth exploring the complementary role of these methods, which benefits the understanding of the pros and cons of the enhancements of different stages. In order to use the capabilities of DNN-ME as much as possible, two kinds of methods called Cascaded DNN-ME and joint input of DNNs are studied. Weighted Gaussian mixture models (WGMMs) proposed in our previous work are also applied to further improve the model's performance. Experiments conducted on the Speakers in the Wild (SITW) database have shown that DNN-ME demonstrated significant superiority over the systems with only a single enhancement for noise robust speaker recognition. Compared with the i-vector baseline, the equal error rate (EER) was reduced from 5.75 to 4.01.

  • IoT Malware Analysis and New Pattern Discovery Through Sequence Analysis Using Meta-Feature Information

    Chun-Jung WU  Shin-Ying HUANG  Katsunari YOSHIOKA  Tsutomu MATSUMOTO  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2019/08/05
      Vol:
    E103-B No:1
      Page(s):
    32-42

    A drastic increase in cyberattacks targeting Internet of Things (IoT) devices using telnet protocols has been observed. IoT malware continues to evolve, and the diversity of OS and environments increases the difficulty of executing malware samples in an observation setting. To address this problem, we sought to develop an alternative means of investigation by using the telnet logs of IoT honeypots and analyzing malware without executing it. In this paper, we present a malware classification method based on malware binaries, command sequences, and meta-features. We employ both unsupervised and supervised learning algorithms and text-mining algorithms for handling unstructured data. Clustering analysis is applied for finding malware family members and revealing their inherent features for better explanation. First, the malware binaries are grouped using similarity analysis. Then, we extract key patterns of interaction behavior using an N-gram model. We also train a multiclass classifier to identify IoT malware categories based on common infection behavior. For misclassified subclasses, second-stage sub-training is performed using a file meta-feature. Our results demonstrate 96.70% accuracy, with high precision and recall. The clustering results reveal variant attack vectors and one denial of service (DoS) attack that used pure Linux commands.

  • Non-Blind Speech Watermarking Method Based on Spread-Spectrum Using Linear Prediction Residue

    Reiya NAMIKAWA  Masashi UNOKI  

     
    LETTER

      Publicized:
    2019/10/23
      Vol:
    E103-D No:1
      Page(s):
    63-66

    We propose a method of non-blind speech watermarking based on direct spread spectrum (DSS) using a linear prediction scheme to solve sound distortion due to spread spectrum. Results of evaluation simulations revealed that the proposed method had much lower sound-quality distortion than the DSS method while having almost the same bit error ratios (BERs) against various attacks as the DSS method.

  • Energy-Efficient Full-Duplex Enabled Cloud Radio Access Networks

    Tung Thanh VU  Duy Trong NGO  Minh N. DAO  Quang-Thang DUONG  Minoru OKADA  Hung NGUYEN-LE  Richard H. MIDDLETON  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2019/07/18
      Vol:
    E103-B No:1
      Page(s):
    71-78

    This paper studies the joint optimization of precoding, transmit power and data rate allocation for energy-efficient full-duplex (FD) cloud radio access networks (C-RANs). A new nonconvex problem is formulated, where the ratio of total sum rate to total power consumption is maximized, subject to the maximum transmit powers of remote radio heads and uplink users. An iterative algorithm based on successive convex programming is proposed with guaranteed convergence to the Karush-Kuhn-Tucker solutions of the formulated problem. Numerical examples confirm the effectiveness of the proposed algorithm and show that the FD C-RANs can achieve a large gain over half-duplex C-RANs in terms of energy efficiency at low self-interference power levels.

  • Fully Homomorphic Encryption Scheme Based on Decomposition Ring Open Access

    Seiko ARITA  Sari HANDA  

     
    PAPER

      Vol:
    E103-A No:1
      Page(s):
    195-211

    In this paper, we propose the decomposition ring homomorphic encryption scheme, which is a homomorphic encryption scheme built on the decomposition ring, a subring of the cyclotomic ring. By using the decomposition ring, the structure of a plaintext slot becomes ℤ_{p^l}, instead of GF(p^d) as in conventional schemes on the cyclotomic ring. For homomorphic multiplication of integers, one can use the full ℤ_{p^l} slots with the proposed scheme, whereas in conventional schemes one can use only the one-dimensional subspace GF(p) in each GF(p^d) slot. This allows us to realize fast and compact homomorphic encryption for integer plaintexts. In fact, our benchmark results indicate that our decomposition ring homomorphic encryption schemes are several times faster than HElib for integer plaintexts due to their higher parallel computation.

  • Free Space Optical Turbo Coded Communication System with Hybrid PPM-OOK Signaling

    Ran SUN  Hiromasa HABUCHI  Yusuke KOZAWA  

     
    PAPER

      Vol:
    E103-A No:1
      Page(s):
    287-294

    For high transmission efficiency, good modulation schemes are expected. This paper focuses on the enhancement of the modulation scheme of a free space optical turbo coded system. A free space optical turbo coded system using a new signaling scheme called hybrid PPM-OOK signaling (HPOS) is proposed and investigated. The theoretical formula of the bit error rate of the uncoded HPOS system is derived. The effective information rate performances (i.e., channel capacity) of the proposed HPOS turbo coded system are evaluated through computer simulation in a free space optical channel with weak, moderate, and strong scintillation. The performance of the proposed HPOS turbo coded system is compared with those of the conventional OOK (On-Off Keying) turbo coded system and BPPM (Binary Pulse Position Modulation) turbo coded system. As a result, the proposed HPOS turbo coded system shows the same tolerance to background noise and atmospheric turbulence as the conventional BPPM turbo coded system, and it has 1.5 times larger capacity.

  • An Open Multi-Sensor Fusion Toolbox for Autonomous Vehicles

    Abraham MONRROY CANO  Eijiro TAKEUCHI  Shinpei KATO  Masato EDAHIRO  

     
    PAPER

      Vol:
    E103-A No:1
      Page(s):
    252-264

    We present an accurate and easy-to-use multi-sensor fusion toolbox for autonomous vehicles. It includes ‘target-less’ multi-LiDAR (Light Detection and Ranging) and Camera-LiDAR calibration, sensor fusion, and a fast and accurate point cloud ground classifier. Our calibration methods do not require complex setup procedures, and once the sensors are calibrated, our framework eases the fusion of multiple point clouds and cameras. In addition, we present an original real-time ground-obstacle classifier, which runs on the CPU and is designed to be used with any type and number of LiDARs. Evaluation results on the KITTI dataset confirm that our calibration method has comparable accuracy with other state-of-the-art contenders in the benchmark.

  • Computationally Efficient DOA Estimation for Massive Uniform Linear Array

    Wei JHANG  Shiaw-Wu CHEN  Ann-Chen CHANG  

     
    LETTER-Digital Signal Processing

      Vol:
    E103-A No:1
      Page(s):
    361-365

    This letter presents an improved hybrid direction of arrival (DOA) estimation scheme with computational efficiency for massive uniform linear array. In order to enhance the resolution of DOA estimation, the initial estimator based on the discrete Fourier transform is applied to obtain coarse DOA estimates by a virtual array extension for one snapshot. Then, by means of a first-order Taylor series approximation to the direction vector with the one initially estimated in a very small region, the iterative fine estimator can find a new direction vector which raises the searching efficiency. Simulation results are provided to demonstrate the effectiveness of the proposed scheme.

  • Low-Complexity Time-Invariant Angle-Range Dependent DM Based on Time-Modulated FDA Using Vector Synthesis Method

    Qian CHENG  Jiang ZHU  Tao XIE  Junshan LUO  Zuohong XU  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2019/07/18
      Vol:
    E103-B No:1
      Page(s):
    79-90

    A low-complexity time-invariant angle-range dependent directional modulation (DM) based on time-modulated frequency diverse array (TM-FDA-DM) is proposed to achieve point-to-point physical layer security communications. The principle of TM-FDA is elaborated and the vector synthesis method is utilized to realize the proposal, TM-FDA-DM, where normalization and orthogonal matrices are designed to modulate the useful baseband symbols and inserted artificial noise, respectively. Since the two designed matrices are time-invariant fixed values, which avoid real-time calculation, the proposed TM-FDA-DM is much easier to implement than time-invariant DMs based on conventional linear FDA or logarithmical FDA, and it also outperforms the time-invariant angle-range dependent DM that utilizes a genetic algorithm (GA) to optimize phase shifters on the radio frequency (RF) frontend. Additionally, a robust synthesis method for TM-FDA-DM with imperfect angle and range estimations is proposed by optimizing the normalization matrix. Simulations demonstrate that the proposed TM-FDA-DM exhibits time-invariant and angle-range dependent characteristics, and the proposed robust TM-FDA-DM can achieve better BER performance than the non-robust method when the maximum range error is larger than 7 km and the maximum angle error is larger than 4°.

  • Spectra Restoration of Bone-Conducted Speech via Attention-Based Contextual Information and Spectro-Temporal Structure Constraint Open Access

    Changyan ZHENG  Tieyong CAO  Jibin YANG  Xiongwei ZHANG  Meng SUN  

     
    LETTER-Digital Signal Processing

      Vol:
    E102-A No:12
      Page(s):
    2001-2007

    Compared with acoustic microphone (AM) speech, bone-conducted microphone (BCM) speech is much more immune to background noise, but suffers from severe loss of information due to the characteristics of the human-body transmission channel. In this letter, a new method for speaker-dependent BCM speech enhancement is proposed, in which we focus our attention on the spectra restoration of the distorted speech. In order to better infer the missing components, an attention-based bidirectional Long Short-Term Memory (AB-BLSTM) is designed to optimize the use of contextual information to model the relationship between the spectra of BCM speech and its corresponding clean AM speech. Meanwhile, a structural error metric originating from image processing, the Structural SIMilarity (SSIM) metric, is proposed as the loss function, which provides the constraint of the spectro-temporal structures in recovering the spectra. Experiments demonstrate that compared with approaches based on conventional DNN and mean square error (MSE), the proposed method can better recover the missing phonemes and obtain spectra with spectro-temporal structure more similar to the target one, which leads to great improvement on objective metrics.

  • Signal Selection Methods for Debugging Gate-Level Sequential Circuits

    Yusuke KIMURA  Amir Masoud GHAREHBAGHI  Masahiro FUJITA  

     
    PAPER

      Vol:
    E102-A No:12
      Page(s):
    1770-1780

    This paper introduces methods to modify a buggy sequential gate-level circuit to conform to the specification. In order to preserve the optimization efforts, the modifications should be as small as possible. Assuming that the locations to be modified are given, our proposed method finds an appropriate set of fan-in signals for the patch function of those locations by iteratively calculating the state correspondence between the specification and the buggy circuit and applying a method for debugging combinational circuits. The experiments are conducted on ITC99 benchmark circuits, and it is shown that our proposed method can work when there are at most 30,000 corresponding reachable state pairs between two circuits. Moreover, a heuristic method using the information of data-path FFs is proposed, which can find a correct set of fan-ins for all the benchmark circuits within practical time.

  • 16-QAM Sequences with Good Periodic Autocorrelation Function

    Fanxin ZENG  Yue ZENG  Lisheng ZHANG  Xiping HE  Guixin XUAN  Zhenyu ZHANG  Yanni PENG  Linjie QIAN  Li YAN  

     
    LETTER-Sequences

      Vol:
    E102-A No:12
      Page(s):
    1697-1700

    Sequences that attain the smallest possible absolute sidelobes (SPASs) of the periodic autocorrelation function (PACF) play fairly important roles in synchronization of communication systems, large-scale integrated circuit testing, and so on. This letter presents an approach to construct 16-QAM sequences of even periods, based on the known quaternary sequences. A relationship between the PACFs of 16-QAM and quaternary sequences is established, by which, when quaternary sequences that attain the SPASs of PACF are employed, the proposed 16-QAM sequences have good PACF.

  • A Deep Neural Network for Real-Time Driver Drowsiness Detection

    Toan H. VU  An DANG  Jia-Ching WANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/09/25
      Vol:
    E102-D No:12
      Page(s):
    2637-2641

    We develop a deep neural network (DNN) for detecting driver drowsiness in videos. The proposed DNN model, which receives drivers' faces extracted from video frames as inputs, consists of three components: a convolutional neural network (CNN), a convolutional control gate-based recurrent neural network (ConvCGRNN), and a voting layer. The CNN learns facial representations from global faces, which are then fed to the ConvCGRNN to learn their temporal dependencies. The voting layer works like an ensemble of many sub-classifiers to predict the drowsiness state. Experimental results on the NTHU-DDD dataset show that our model not only achieves a competitive accuracy of 84.81% without any post-processing but also works in real time at a high speed of about 100 fps.

  • FPGA-Based Annealing Processor with Time-Division Multiplexing

    Kasho YAMAMOTO  Masayuki IKEBE  Tetsuya ASAI  Masato MOTOMURA  Shinya TAKAMAEDA-YAMAZAKI  

     
    PAPER-Computer System

      Publicized:
    2019/09/20
      Vol:
    E102-D No:12
      Page(s):
    2295-2305

    An annealing processor based on the Ising model is a remarkable candidate for combinatorial optimization problems and it is superior to general von Neumann computers. CMOS-based implementations of the annealing processor are efficient and feasible based on current semiconductor technology. However, critical problems with annealing processors remain: the number of simulated spins is small, and the implementable graph topology is inflexible due to hardware constraints. A prior approach to overcoming these problems is to emulate a complicated graph on a simple and high-density spin array with so-called minor embedding, a spin duplication method based on graph theory. When a complicated graph is embedded on such hardware, numerous spins are consumed to represent high-degree spins by combining multiple low-degree spins. In addition to the number of spins, the quality of solutions decreases as a result of dummy strong connections between the duplicated spins. Thus, the approach cannot handle large-scale practical problems. This paper proposes a flexible and scalable hardware architecture with time-division multiplexing for massive spins and high-degree topologies. A target graph is separated and mapped onto multiple virtual planes, and each plane is subject to interleaved simulation with time-division processing. Therefore, the behavior of high-degree spins is efficiently emulated over time, so that no dummy strong connections are required, and the solution quality is accordingly improved. We implemented a prototype hardware design for FPGAs, and we evaluated the proposed method in a software-based annealing processor simulator. The results indicate that the method increased the number of spins that can be deployed. In addition, our time-division multiplexing architecture improved the solution quality and convergence time with reasonable resource consumption.

  • Dual-Band Dual-Rectangular-Loop Circular Polarization Antenna for Global Navigation Satellite System Open Access

    Makoto SUMI  Jun-ichi TAKADA  

     
    PAPER-Antennas and Propagation

      Publicized:
    2019/06/25
      Vol:
    E102-B No:12
      Page(s):
    2243-2252

    This paper proposes a dual-band dual-rectangular-loop circular polarization antenna for Global Navigation Satellite Systems (GNSSs). The proposed antenna combines two large outer rectangular loops with two small inner loops. Each large outer loop is connected to its corresponding small inner rectangular loop. Each loop has gaps located symmetrically with respect to a feed point to produce Right Handed Circular Polarization (RHCP). The gap position and the shape of the rectangular loops are very important to adjust both the impedance matching and circular polarization characteristics. The proposed antenna offers dual-band Voltage Standing Wave Ratio (VSWR) and Axial Ratio (AR) frequency characteristics that include the L1 (1575.42 MHz) and L2 (1227.60 MHz) bands. The antenna gains exceed 8.7 dBi. Broad AR elevation patterns are obtained. These antenna characteristics are well suited to precise positioning.

  • 3D Global and Multi-View Local Features Combination Based Qualitative Action Recognition for Volleyball Game Analysis

    Xina CHENG  Yang LIU  Takeshi IKENAGA  

     
    PAPER-Image

      Vol:
    E102-A No:12
      Page(s):
    1891-1899

    Volleyball video analysis plays important roles in providing data for TV contents and developing strategies. Among all the topics of volleyball analysis, qualitative player action recognition is essential because it potentially provides not only the action that is being performed but also the quality, which means how well the action is performed. However, most action recognition research focuses on the discrimination between different actions. The quality of an action, which is helpful for evaluation and training of player skill, has received little attention so far. The vital problems in qualitative action recognition include occlusion, small inter-class difference and various kinds of appearance caused by player changes. This paper proposes a recognition framework based on the combination of 3D global and multi-view local features, with a global team formation feature, a ball state feature and abrupt pose features. The above problems are solved by the combination of 3D global features (which hide the unstable and incomplete 2D motion feature caused by occlusion) and the multi-view local features (which capture detailed local motion features of body parts from multiple viewpoints). Firstly, the team formation feature extracts the 3D trajectories of all team members rather than a single target player. This proposal focuses more on the entire feature while eliminating the personal effect. Secondly, the ball motion state feature extracts features from the 3D ball trajectory. The ball motion is not affected by personal appearance, so this proposal ignores the influence of the players' appearance and is more robust to target player changes. Lastly, the abrupt pose feature consists of two parts: the abrupt hit frame pose (which extracts the contour shape of the player's pose at the hit time) and the abrupt pose variation (which extracts the pose variation between the preparation pose and ending pose during the action). These two features make the differences in action quality more distinguishable by focusing on the motion standard and stability between actions of different quality. Experiments are conducted on game videos from the Semifinal and Final Game of 2014 Japan Inter High School Games of Men's Volleyball in Tokyo Metropolitan Gymnasium. The experimental results show that the accuracy reaches 97.26% for action discrimination, an improvement of 11.33%, and 91.76% for action quality evaluation, an improvement of 13.72%.

  • Methods for Reducing Power and Area of BDD-Based Optical Logic Circuits

    Ryosuke MATSUO  Jun SHIOMI  Tohru ISHIHARA  Hidetoshi ONODERA  Akihiko SHINYA  Masaya NOTOMI  

     
    PAPER

      Vol:
    E102-A No:12
      Page(s):
    1751-1759

    Optical circuits using nanophotonic devices attract significant interest due to their ultra-high speed operation. As a consequence, the synthesis methods for the optical circuits also attract increasing attention. However, existing methods for synthesizing optical circuits mostly rely on straightforward mappings from established data structures such as the Binary Decision Diagram (BDD). The strategy of simply mapping a BDD to an optical circuit sometimes results in an explosion of size and involves significant power losses in branches and optical devices. To address these issues, this paper proposes a method for reducing the size of BDD-based optical logic circuits by exploiting wavelength division multiplexing (WDM). The paper also proposes a method for reducing the number of branches in a BDD-based circuit, which reduces the power dissipation in laser sources. Experimental results obtained using a partial product accumulation circuit used in a 4-bit parallel multiplier demonstrate significant advantages of our method over existing approaches in terms of area and power consumption.
