IEICE TRANSACTIONS on Information

Impact Factor

0.72
Eigenfactor

0.002
article influence

0.1
Cite Score

1.4

To the Advance publication
To the Archives

Advance publication (published online immediately after acceptance)

Vision Transformer with Key-select Routing Attention for Single Image Dehazing
Lihan TONG Weijia LI Qingxia YANG Liyuan CHEN Peng CHEN

Pubricized:
2024/07/01
- Summary
- Free PDF (2MB)
Towards Superior Pruning Performance in Federated Learning with Discriminative Data
Yinan YANG

Pubricized:
2024/06/27
- Summary
- Free PDF (7.9MB)
CLEAR & RETURN: Stopping Run-time Countermeasures in Cryptographic Primitives
Myung-Hyun KIM Seungkwang LEE

Pubricized:
2024/06/26
- Summary
- Free PDF (1.7MB)
SH-YOLO: Small Target High Performance YOLO for abnormal behavior detection in escalator scene
Shuoyan LIU Chao LI Yuxin LIU Yanqiu WANG

Pubricized:
2024/06/26
- Summary
- Free PDF (708.4KB)
Design and implementation of opto-electrical hybrid floating-point multipliers
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI

Pubricized:
2024/06/26
- Summary
- Free PDF (2.5MB)
Geometric Refactoring of Quantum and Reversible Circuits using Graph Algorithms
Martin LUKAC Saadat NURSULTAN Georgiy KRYLOV Oliver KESZOCZE Abilmansur RAKHMETTULAYEV Michitaka KAMEYAMA

Pubricized:
2024/06/24
- Summary
- Free PDF (1010KB)
IAD-Net: Single-Image Dehazing Network Based on Image Attention
Zheqing ZHANG Hao ZHOU Chuan LI Weiwei JIANG

Pubricized:
2024/06/20
- Summary
- Free PDF (9.8MB)
Improving the Accuracy of Differential-Neural Distinguisher For DES, Chaskey, and PRESENT
Liu ZHANG Zilong WANG Yindong CHEN

Pubricized:
2024/06/20
- Summary
- Free PDF (355.3KB)
Multi-Scale Contrastive Learning for Human Pose Estimation
Wenxia Bao An Lin Hua Huang Xianjun Yang Hemu Chen

Pubricized:
2024/06/17
- Summary
- Free PDF (1MB)
HDR-VDA: A Full Stage Data Augmentation Method for HDR Video Reconstruction
Fengshan ZHAO Qin LIU Takeshi IKENAGA

Pubricized:
2024/06/17
- Summary
- Free PDF (1.2MB)
Evaluating Introduction of Systems by Goal Dependency Modeling
Haruhiko KAIYA Shinpei OGATA Shinpei HAYASHI

Pubricized:
2024/06/11
- Summary
- Free PDF (1.3MB)
MISpeller: Multimodal Information Enhancement for Chinese Spelling Correction
Jiakai LI Jianyong DUAN Hao WANG Li HE Qing ZHANG

Pubricized:
2024/06/07
- Summary
- Free PDF (3.3MB)
Integrating Event Elements for Chinese-Vietnamese Cross-lingual Event Retrieval
Yuxin HUANG Yuanlin YANG Enchang ZHU Yin LIANG Yantuan XIAN

Pubricized:
2024/06/04
- Summary
- Free PDF (3.9MB)
Space-efficient FPT Algorithms for Degeneracy
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI

Pubricized:
2024/05/31
- Summary
- Free PDF (101.3KB)
Learning Fast Deployment for UAV-Assisted Disaster System
Na XING Lu LI Ye ZHANG Shiyi YANG

Pubricized:
2024/05/30
- Summary
- Free PDF (10.3MB)
TDEM: Table data extraction model based on cell segmentation
Zhe Wang Zhe-Ming Lu Hao Luo Yang-Ming Zheng

Pubricized:
2024/05/30
- Summary
- Free PDF (838KB)
Reliable image matching using optimal combination of color and intensity information based on relationship with surrounding objects
Rina TAGAMI Hiroki KOBAYASHI Shuichi AKIZUKI Manabu HASHIMOTO

Pubricized:
2024/05/30
- Summary
- Free PDF (4.2MB)
The Least Core of Routing Game Without Triangle Inequality
Tomohiro KOBAYASHI Tomomi MATSUI

Pubricized:
2024/05/30
- Summary
- Free PDF (232.9KB)
Enumerating floorplans with Aligned Columns
Shin-ichi NAKANO

Pubricized:
2024/05/30
- Summary
- Free PDF (365.4KB)
A Two-Phase Algorithm for Reliable and Energy-Efficient Heterogeneous Embedded Systems
Hongzhi XU Binlian ZHANG

Pubricized:
2024/05/27
- Summary
- Free PDF (902.4KB)
Smart Contract Timestamp Vulnerability Detection Based on Code Homogeneity
Weizhi WANG Lei XIA Zhuo ZHANG Xiankai MENG

Pubricized:
2024/05/27
- Summary
- Free PDF (706KB)
Neural End-to-end Speech Translation Leveraged by ASR Posterior Distribution
Yuka KO Katsuhito SUDOH Sakriani SAKTI Satoshi NAKAMURA

Pubricized:
2024/05/24
- Summary
- Free PDF (1.5MB)
Watermarking Method with Scaling Rate Estimation Using Pilot Signal
Rinka KAWANO Masaki KAWAMURA

Pubricized:
2024/05/22
- Summary
- Free PDF (1MB)
Type-enhanced Ensemble Triple Representation via Triple-aware Attention for Cross-lingual Entity Alignment
Zhishuo ZHANG Chengxiang TAN Xueyan ZHAO Min YANG

Pubricized:
2024/05/22
- Summary
- Free PDF (5.1MB)
Joint Optimization of Task Offloading and Resource Allocation for UAV-Assisted Edge Computing: A Stackelberg Bilayer Game Approach
Peng WANG Guifen CHEN Zhiyao SUN

Pubricized:
2024/05/21
- Summary
- Free PDF (678.2KB)
EfficientNet Empowered by Dendritic Learning for Diabetic Retinopathy
Zeyuan JU Zhipeng LIU Yu GAO Haotian LI Qianhang DU Kota YOSHIKAWA Shangce GAO

Pubricized:
2024/05/20
- Summary
- Free PDF (517.4KB)
6T-8T hybrid SRAM for lower-power neural-network processing by lowering operating voltage
Ji WU Ruoxi YU Kazuteru NAMBA

Pubricized:
2024/05/20
- Summary
- Free PDF (495.1KB)
Chinese Spelling Correction Based on Knowledge Enhancement and Contrastive Learning
Hao WANG Yao Ma Jianyong Duan Li HE Xin Li

Pubricized:
2024/05/17
- Summary
- Free PDF (1.2MB)
TIG: A Multitask Temporal Interval Guided Framework for Key Frame Detection
Shijie WANG Xuejiao HU Sheng LIU Ming LI Yang LI Sidan DU

Pubricized:
2024/05/17
- Summary
- Free PDF (10.6MB)
Node-to-node and Node-to-set Disjoint Paths Problems in Bicubes
Arata KANEKO Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO

Pubricized:
2024/05/17
- Summary
- Free PDF (1MB)
Remote Sensing Image Dehazing Using Multi-Scale Gated Attention For Flight Simulator
Qi LIU Bo WANG Shihan TAN Shurong ZOU Wenyi GE

Pubricized:
2024/05/14
- Summary
- Free PDF (4.2MB)
Large Class Detection using GNNs: A graph based deep learning approach utilizing three typical GNN model architectures
HanYu Zhang Tomoji Kishi

Pubricized:
2024/05/14
- Summary
- Free PDF (1.4MB)
Functional Decomposition of Symmetric Multiple-Valued Functions and Their Compact Representation in Decision Diagrams
Shinobu NAGAYAMA Tsutomu SASAO Jon T. BUTLER

Pubricized:
2024/05/14
- Summary
- Free PDF (303.5KB)
Greedy selection of sensors for linear Bayesian estimation under correlated noise
Yoon Hak KIM

Pubricized:
2024/05/14
- Summary
- Free PDF (115KB)
New Bounds for Quick Computation of the Lower Bound on the Gate Count of Toffoli-Based Reversible Logic Circuits
Takashi HIRAYAMA Rin SUZUKI Katsuhisa YAMANAKA Yasuaki NISHITANI

Pubricized:
2024/05/10
- Summary
- Free PDF (222.1KB)
Evaluation of Multi-valued Data Transmission in Two-Dimensional Symbol Mapping using Linear Mixture Model
Yosuke IIJIMA Atsunori OKADA Yasushi YUMINAKA

Pubricized:
2024/05/09
- Summary
- Free PDF (8.2MB)
Using Genetic Algorithm and Mathematical Programming Model for Ambulance Location Problem in Emergency Medical Service
Batnasan Luvaanjalba Elaine Yi-Ling Wu

Pubricized:
2024/05/08
- Summary
- Free PDF (909.3KB)
Enhanced Data Transfer Cooperating with Artificial Triplets for Scene Graph Generation
KuanChao CHU Satoshi YAMAZAKI Hideki NAKAYAMA

Pubricized:
2024/04/30
- Summary
- Free PDF (9.2MB)
A mmWave sensor and camera fusion system for indoor occupancy detection and tracking
Shenglei LI Haoran LUO Tengfei SHAO Reiko HISHIYAMA

Pubricized:
2024/04/26
- Summary
- Free PDF (4.1MB)
Evaluating PAM-4 Data Transmission Quality using Multi-Dimensional Mapping of Received Symbols
Yasushi YUMINAKA Kazuharu NAKAJIMA Yosuke IIJIMA

Pubricized:
2024/04/25
- Summary
- Free PDF (7.5MB)
Unsupervised Intrusion Detection Based on Asymmetric Auto-Encoder Feature Extraction
Chunbo Liu Liyin Wang Zhikai Zhang Chunmiao Xiang Zhaojun Gu Zhi Wang Shuang Wang

Pubricized:
2024/04/25
- Summary
- Free PDF (1.2MB)
Reinforced Voxel-RCNN:An efficient 3D Object Detection Method Based on Feature Aggregation
Jia-ji JIANG Hai-bin WAN Hong-min SUN Tuan-fa QIN Zheng-qiang WANG

Pubricized:
2024/04/24
- Summary
- Free PDF (4.6MB)
A Channel Contrastive Attention-based Local-Nonlocal Mutual block on Super-Resolution
Yuhao LIU Zhenzhong CHU Lifei WEI

Pubricized:
2024/04/23
- Summary
- Free PDF (1.5MB)
Error-Tolerance-Aware Write-Energy Reduction of MTJ-Based Quantized Neural Network Hardware
Ken ASANO Masanori NATSUI Takahiro HANYU

Pubricized:
2024/04/22
- Summary
- Free PDF (2.1MB)
Skin diagnostic method using Fontana-Masson stained images of stratum corneum cells
Shuto HASEGAWA Koichiro ENOMOTO Taeko MIZUTANI Yuri OKANO Takenori TANAKA Osamu SAKAI

Pubricized:
2024/04/19
- Summary
- Free PDF (6.6MB)
Confidence-Driven Contrastive Learning for Document Classification without Annotated Data
Zhewei XU Mizuho IWAIHARA

Pubricized:
2024/04/19
- Summary
- Free PDF (2.2MB)
Delta-Sigma Domain Signal Processing Revisited with Related Topics in Stochastic Computing
Takao WAHO Akihisa KOYAMA Hitoshi HAYASHI

Pubricized:
2024/04/17
- Summary
- Free PDF (1.7MB)
Extending Binary Neural Networks to Bayesian Neural Networks with Probabilistic Interpretation of Binary Weights
Taisei SAITO Kota ANDO Tetsuya ASAI

Pubricized:
2024/04/17
- Summary
- Free PDF (1.1MB)
Unveiling Python Version Compatibility Challenges in Code Snippets on Stack Overflow
Shiyu YANG Tetsuya KANDA Daniel M. GERMAN Yoshiki HIGO

Pubricized:
2024/04/16
- Summary
- Free PDF (488.3KB)
On Easily Reconstructable Logic Functions
Tsutomu SASAO

Pubricized:
2024/04/16
- Summary
- Free PDF (240.7KB)
Tracking WebVR User Activities through Hand Motions: An Attack Perspective
Jiyeon LEE

Pubricized:
2024/04/16
- Summary
- Free PDF (3MB)
Permissionless Blockchain-Based Sybil-Resistant Self-Sovereign Identity Utilizing Attested Execution Secure Processors
Koichi MORIYAMA Akira OTSUKA

Pubricized:
2024/04/15
- Summary
- Free PDF (4.1MB)
Cross-Corpus Speech Emotion Recognition Based on Causal Emotion Information Representation
Hongliang FU Qianqian LI Huawei TAO Chunhua ZHU Yue XIE Ruxue GUO

Pubricized:
2024/04/12
- Summary
- Free PDF (8.1MB)
Investigating and Enhancing the Neural Distinguisher for Differential Cryptanalysis
Gao WANG Gaoli WANG Siwei SUN

Pubricized:
2024/04/12
- Summary
- Free PDF (1.7MB)
Nuclear Norm Minus Frobenius Norm Minimization with Rank Residual Constraint for Image Denoising
Hua HUANG Yiwen SHAN Chuan LI Zhi WANG

Pubricized:
2024/04/09
- Summary
- Free PDF (6.9MB)
Improved Just Noticeable Difference Model Based Algorithm for Fast CU Partition in V-PCC
Zhi LIU Heng WANG Yuan LI Hongyun LU Hongyuan JING Mengmeng ZHANG

Pubricized:
2024/04/05
- Summary
- Free PDF (1.1MB)
MDX-Mixer: Music Demixing by Leveraging Source Signals Separated by Existing Demixing Models
Tomoyasu NAKANO Masataka GOTO

Pubricized:
2024/04/05
- Summary
- Free PDF (2.4MB)
Machine Learning-based System for Heat-Resistant Analysis of Car Lamp Design
Hyebong CHOI Joel SHIN Jeongho KIM Samuel YOON Hyeonmin PARK Hyejin CHO Jiyoung JUNG

Pubricized:
2024/04/03
- Summary
- Free PDF (1.4MB)
Agent Allocation-Action Learning with Dynamic Heterogeneous Graph in Multi-task Games
Xianglong LI Yuan LI Jieyuan ZHANG Xinhai XU Donghong LIU

Pubricized:
2024/04/03
- Summary
- Free PDF (3.9MB)
FSAMT : Face Shape Adaptive Makeup Transfer
Haoran LUO Tengfei SHAO Shenglei LI Reiko HISHIYAMA

Pubricized:
2024/04/02
- Summary
- Free PDF (984.6KB)
Artifact Removal Using Attention Guided Local-Global Dual-Stream Network for Sparse-View CT Reconstruction
Chang SUN Yitong LIU Hongwen YANG

Pubricized:
2024/03/29
- Summary
A CNN-based feature pyramid segmentation Strategy for acoustic；scene classification
Ji XI Yue XIE Pengxu JIANG Wei JIANG

Pubricized:
2024/03/26
- Summary
An IP Core Protection Scheme Based on Hybrid Lightweight Encryption for Neuromorphic Computing System
Ming PAN

The aritcle processing charge of this paper has not been paid.

Pubricized:
2022/09/14
- Summary

Whole issue

Volume E96-D No.12 (Publication Date:2013/12/01)

Special Section on Parallel and Distributed Computing and Networking

FOREWORD Open Access
HIDEHARU AMANO

FOREWORD

Page(s):
2513-2513
- HTML
- Free PDF (70.5KB)
Improving Cache Partitioning Algorithms for Pseudo-LRU Policies
Xi ZHANG Chuanyi LIU Zhenyu LIU Dongsheng WANG

PAPER

Page(s):
2514-2523
As the number of concurrently running applications on the chip multiprocessors (CMPs) is increasing, efficient management of the shared last-level cache (LLC) is crucial to guarantee overall performance. Recent studies have shown that cache partitioning can provide benefits in throughput, fairness and quality of service. Most prior arts apply true Least Recently Used (LRU) as the underlying cache replacement policy and rely on its stack property to work properly. However, in commodity processors, pseudo-LRU policies without stack property are commonly used instead of LRU for their simplicity and low storage overhead. Therefore, this study sets out to understand whether LRU-based cache partitioning techniques can be applied to commodity processors. In this work, we instead propose a cache partitioning mechanism for two popular pseudo-LRU policies: Not Recently Used (NRU) and Binary Tree (BT). Without the help of true LRU's stack property, we propose a profiling logic that applies curve approximation methods to derive the hit curve (hit counts under varied way allocations) for an application. We then propose a hybrid partitioning mechanism, which mitigates the gap between the predicted hit curve and the actual statistics. Simulation results demonstrate that our proposal can improve throughput by 15.3% on average and outperforms the stack-estimate proposal by 12.6% on average. Similar results can be achieved in weighted speedup. For the cache configurations under study, it requires less than 0.5% storage overhead compared to the last-level cache. In addition, we also show that profiling mechanism with only one true LRU ATD achieves comparable performance and can further reduce the hardware cost by nearly two thirds compared with the hybrid mechanism.
Battery-Aware Task Mapping for Coarse-Grained Reconfigurable Architecture
Shouyi YIN Rui SHI Leibo LIU Shaojun WEI

PAPER

Page(s):
2524-2535
Coarse-grained Reconfigurable Architecture (CGRA) is a parallel computing platform that provides both high performance of hardware and high flexibility of software. It is becoming a promising platform for embedded and mobile applications. Since the embedded and mobile devices are usually battery-powered, improving battery lifetime becomes one of the primary design issues in using CGRAs. In this paper, we propose a battery-aware task-mapping method to optimize energy consumption and improve battery lifetime. The proposed method mainly addresses two problems: task partitioning and task scheduling when mapping applications onto CGRA. The task partitioning and scheduling are formulated as a joint optimization problem of minimizing the energy consumption. The nonlinear effects of real battery are taken into account in problem formulation. Using the insights from the problem formulation, we design the task-mapping algorithm. We have used several real-world benchmarks to test the effectiveness of the proposed method. Experiment results show that our method can dramatically lower the energy consumption and prolong the battery-life.
Network Interface Architecture with Scalable Low-Latency Message Receiving Mechanism
Noboru TANABE Atsushi OHTA

PAPER

Page(s):
2536-2544
Most of scientists except computer scientists do not want to make efforts for performance tuning with rewriting their MPI applications. In addition, the number of processing elements which can be used by them is increasing year by year. On large-scale parallel systems, the number of accumulated messages on a message buffer tends to increase in some of their applications. Since searching message queue in MPI is time-consuming, system side scalable acceleration is needed for those systems. In this paper, a support function named LHS (Limited-length Head Separation) is proposed. Its performance in searching message buffer and hardware cost are evaluated. LHS accelerates searching message buffer by means of switching location to store limited-length heads of messages. It uses the effects such as increasing hit rate of cache on host with partial off-loading to hardware. Searching speed of message buffer when the order of message reception is different from the receiver's expectation is accelerated 14.3 times with LHS on FPGA-based network interface card (NIC) named DIMMnet-2. This absolute performance is 38.5 times higher than that of IBM BlueGene/P although the frequency is 8.5times slower than BlueGene/P. LHS has higher scalability than ALPU in the performance per frequency. Since these results are obtained with partially on loaded linear searching on old Pentium®4, performance gap will increase using state of art CPU. Therefore, LHS is more suitable for larger parallel systems. The discussions for adopting proposed method to state of art processors and systems are also presented.
A Fully Optical Ring Network-on-Chip with Static and Dynamic Wavelength Allocation
Ahmadou Dit Adi CISSE Michihiro KOIBUCHI Masato YOSHIMI Hidetsugu IRIE Tsutomu YOSHINAGA

PAPER

Page(s):
2545-2554
Silicon photonics Network-on-Chips (NoCs) have emerged as an attractive solution to alleviate the high power consumption of traditional electronic interconnects. In this paper, we propose a fully optical ring NoC that combines static and dynamic wavelength allocation communication mechanisms. A different wavelength-channel is statically allocated to each destination node for light weight communication. Contention of simultaneous communication requests from multiple source nodes to the destination is solved by a token based arbitration for the particular wavelength-channel. For heavy load communication, a multiwavelength-channel is available by requesting it in execution time from source node to a special node that manages dynamic allocation of the shared multiwavelength-channel among all nodes. We combine these static and dynamic communication mechanisms in a same network that introduces selection techniques based on message size and congestion information. Using a photonic NoC simulator based on Phoenixsim, we evaluate our architecture under uniform random, neighbor, and hotspot traffic patterns. Simulation results show that our proposed fully optical ring NoC presents a good performance by utilizing adequate static and dynamic channels based on the selection techniques. We also show that our architecture can reduce by more than half, the energy consumption necessary for arbitration compared to hybrid photonic ring and mesh NoCs. A comparison with several previous works in term of architecture hardware cost shows that our architecture can be an attractive cost-performance efficient interconnection infrastructure for future SoCs and CMPs.
Automated Route Planning for Milk-Run Transport Logistics with the NuSMV Model Checker
Takashi KITAMURA Keishi OKAMOTO

PAPER

Page(s):
2555-2564
In this paper, we propose and implement an automated route planning framework for milk-run transport logistics by applying model checking techniques. First, we develop a formal specification framework for milk-run transport logistics. The framework adopts LTL (Linear Temporal Logic), a language based on temporal logics, as a specification language for users to be able to flexibly and formally specify complex delivery requirements for trucks. Then by applying the bounded semantics of LTL, the framework then defines the notion of “optimal truck routes”, which mean truck routes on a given route map that satisfy given delivery requirements (specified by LTL) with the minimum cost. We implement the framework as an automated route planner using the NuSMV model checker, a state-of-the-art bounded model checker. The automated route planner, given route map and delivery requirements, automatically finds optimal trucks routes on the route map satisfying the given delivery requirements. The feasibility of the implementation design is investigated by analysing its computational complexity and by showing experimental results.
Location-Based Routing Scheme with Adaptive Request Zone in Mobile Ad Hoc Networks
Putthiphong KIRDPIPAT Sakchai THIPCHAKSURAT

PAPER

Page(s):
2565-2574
Route discovery process is a major mechanism in the most routing protocols in Mobile Ad Hoc Network (MANET). Routing overhead is one of the problems caused by broadcasting the route discovery packet. To reduce the routing overhead, the location-based routing schemes have been proposed. In this paper, we propose our scheme called Location-based Routing scheme with Adaptive Request Zone (LoRAReZ). In LoRAReZ scheme, the size of expected zone is set adaptively depending on the distance between source and destination nodes. Computer simulation has been conducted to show the effectiveness of our propose scheme. We evaluate the performances of LoRAReZ scheme in the terms of packet delivery fraction (PDF), routing overhead, average end-to-end delay, throughput, packet collision, average hop count, average route setup time, and power consumption. We compare those performance metrics with those of Location Aided Routing (LAR) and Location Aware Routing Protocol with Dynamic Adaptation of Request Zone (LARDAR) protocols. The simulation results show that LoRAReZ can provide all the better performances among those of LAR and LARDAR schemes.
HiCrypt: A Specialized Translator for Symmetric Block Cipher and GPGPU
Keisuke IWAI Naoki NISHIKAWA Takakazu KUROKAWA

PAPER

Page(s):
2575-2586
Many-core computer systems with GPUs are coming into mainstream use from high-end computing, including supercomputers, to embedded processors. Consequently, the implementation of cryptographic methods on GPGPU is also becoming popular because of such systems' performance. However, many factors affect the performance of GPUs. To cope with this problem, we developed a new translator, HiCrypt, which can generate an optimized GPGPU program written in both of CUDA and OpenCL from a cipher program written in standard C language with directives. Users must annotate only variables and an encoding/decoding function, which are characteristics of cipher programs, with directives. To evaluate the translator, five representative cipher programs are translated into CUDA and OpenCL programs by the translator. Generated programs perform high throughput almost identical to hand optimized programs for all five cipher programs. HiCrypt will contribute to development and evaluate of new and various symmetric block ciphers using GPGPU.
Simulating Cardiac Electrophysiology in the Era of GPU-Cluster Computing
Jun CHAI Mei WEN Nan WU Dafei HUANG Jing YANG Xing CAI Chunyuan ZHANG Qianming YANG

PAPER

Page(s):
2587-2595
This paper presents a study of the applicability of clusters of GPUs to high-resolution 3D simulations of cardiac electrophysiology. By experimenting with representative cardiac cell models and ODE solvers, in association with solving the monodomain equation, we quantitatively analyze the obtainable computational capacity of GPU clusters. It is found that for a 501×501×101 3D mesh, which entails a 0.1mm spatial resolution, a 128-GPU cluster only needs a few minutes to carry out a 100,000-time-step cardiac excitation simulation that involves a four-variable cell model. Even higher spatial and temporal resolutions are achievable for such simplified mathematical models. On the other hand, our experiments also show that a dramatically larger cluster of GPUs is needed to handle a very detailed cardiac cell model.
A GPU Implementation of Dynamic Programming for the Optimal Polygon Triangulation
Yasuaki ITO Koji NAKANO

PAPER

Page(s):
2596-2603
This paper presents a GPU (Graphics Processing Units) implementation of dynamic programming for the optimal polygon triangulation. Recently, GPUs can be used for general purpose parallel computation. Users can develop parallel programs running on GPUs using programming architecture called CUDA (Compute Unified Device Architecture) provided by NVIDIA. The optimal polygon triangulation problem for a convex polygon is an optimization problem to find a triangulation with minimum total weight. It is known that this problem for a convex n-gon can be solved using the dynamic programming technique in O(n³) time using a work space of size O(n²). In this paper, we propose an efficient parallel implementation of this O(n³)-time algorithm on the GPU. In our implementation, we have used two new ideas to accelerate the dynamic programming. The first idea (adaptive granularity) is to partition the dynamic programming algorithm into many sequential kernel calls of CUDA, and to select the best parameters for the size and the number of blocks for each kernel call. The second idea (sliding and mirroring arrangements) is to arrange the working data for coalesced access of the global memory in the GPU to minimize the memory access overhead. Our implementation using these two ideas solves the optimal polygon triangulation problem for a convex 8192-gon in 5.57 seconds on the NVIDIA GeForce GTX 680, while a conventional CPU implementation runs in 1939.02 seconds. Thus, our GPU implementation attains a speedup factor of 348.02.
GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems
Fumihiko INO Shinta NAKAGAWA Kenichi HAGIHARA

PAPER

Page(s):
2604-2616
This paper presents a stream programming framework, named GPU-chariot, for accelerating stream applications running on graphics processing units (GPUs). The main contribution of our framework is that it realizes efficient software pipelines on multi-GPU systems by enabling out-of-order execution of CPU functions, kernels, and data transfers. To achieve this out-of-order execution, we apply a runtime scheduler that not only maximizes the utilization of system resources but also encapsulates the number of GPUs available in the system. In addition, we implement a load-balancing capability to flow data efficiently through multiple GPUs. Furthermore, a callback interface enables overlapping execution of functions in third-party libraries. By using kernels with different performance bottlenecks, we show that our out-of-order execution is up to 20% faster than in-order execution. Finally, we conduct several case studies on a 4-GPU system and demonstrate the advantages of GPU-chariot over a manually pipelined code. We conclude that GPU-chariot can be useful when developing stream applications with software pipelines on multiple GPUs and CPUs.
Offline Permutation Algorithms on the Discrete Memory Machine with Performance Evaluation on the GPU
Akihiko KASAGI Koji NAKANO Yasuaki ITO

PAPER

Page(s):
2617-2625
The Discrete Memory Machine (DMM) is a theoretical parallel computing model that captures the essence of the shared memory access of GPUs. Bank conflicts should be avoided for maximizing the bandwidth of the shared memory access. Offline permutation of an array is a task to copy all elements in array a into array b along a permutation given in advance. The main contribution of this paper is to implement a conflict-free permutation algorithm on the DMM in a GPU. We have also implemented straightforward permutation algorithms on the GPU. The experimental results for 1024 double (64-bit) numbers on NVIDIA GeForce GTX-680 show that the straightforward permutation algorithm takes 247.8 ns for the random permutation and 1684ns for the worst permutation that involves the maximum bank conflicts. Our conflict-free permutation algorithm runs in 167ns for any permutation including the random permutation and the worst permutation, although it performs more memory accesses. It follows that our conflict-free permutation is 1.48 times faster for the random permutation and 10.0 times faster for the worst permutation.
Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models
Koji NAKANO

PAPER

Page(s):
2626-2634
The main contribution of this paper is to show optimal parallel algorithms to compute the sum, the prefix-sums, and the summed area table on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters, the number p of threads, and the width w of the memory, and the memory access latency l. We first show that the sum of n numbers can be computed in $O({nover w}+{nlover p}+llog n)$ time units on the DMM and the UMM. We then go on to show that $Omega({nover w}+{nlover p}+llog n)$ time units are necessary to compute the sum. We also present a parallel algorithm that computes the prefix-sums of n numbers in $O({nover w}+{nlover p}+llog n)$ time units on the DMM and the UMM. Finally, we show that the summed area table of size $sqrt{n} imessqrt{n}$ can be computed in $O({nover w}+{nlover p}+llog n)$ time units on the DMM and the UMM. Since the computation of the prefix-sums and the summed area table is at least as hard as the sum computation, these parallel algorithms are also optimal.
An Improved Rete Algorithm Based on Double Hash Filter and Node Indexing for Distributed Rule Engine
Tianyang DONG Jianwei SHI Jing FAN Ling ZHANG

PAPER

Page(s):
2635-2644
Rule engine technologies have been widely used in the development of enterprise information systems. However, these rule-based systems may suffer the problem of low performance, when there is a large amount of facts data to be matched with the rules. The way of cluster or grid to construct rule engines can flexibly expand system processing capability by increasing cluster scale, and acquire shorter response time. In order to speed up pattern matching in rule engine, a double hash filter approach for alpha network, combined with beta node indexing, is proposed to improve Rete algorithm in this paper. By using fact type node in Rete network, a hash map about ‘fact type - fact type node’ is built in root node, and hash maps about ‘attribute constraint - alpha node’ are constructed in fact type nodes. This kind of double hash mechanism can speed up the filtration of facts in alpha network. Meanwhile, hash tables with the indexes calculated through fact objects, are built in memories of beta nodes, to avoid unnecessary iteration in the join operations of beta nodes. In addition, rule engine based on this improved Rete algorithm is applied in the enterprise information systems. The experimental results show that this method can effectively speed up the pattern matching, and significantly decrease the response time of the application systems.
A Meta-Heuristic Approach for Dynamic Data Allocation on a Multiple Web Server System
Masaki KOHANA Shusuke OKAMOTO Atsuko IKEGAMI

PAPER

Page(s):
2645-2653
This paper describes a near-optimal allocation method for web-based multi-player online role-playing games (MORPGs), which must be able to cope with a large number of users and high frequency of user requests. Our previous work introduced a dynamic data reallocation method. It uses multiple web servers and divides the entire game world into small blocks. Each ownership of block is allocated to a web server. Additionally, the ownership is reallocated to the other web server according to the user's requests. Furthermore, this block allocation was formulated as a combinational optimization problem. And a simulation based experiment with an exact algorithm showed that our system could achieve 31% better than an ad-hoc approach. However, the exact algorithm takes too much time to solve a problem when the problem size is large. This paper proposes a meta-heuristic approach based on a tabu search to solve a problem quickly. A simulation result shows that our tabu search algorithm can generate solutions, whose average correctness is only 1% different from that of the exact algorithm. In addition, the average calculation time for 50 users on a system with five web servers is about 25.67 msec while the exact algorithm takes about 162 msec. An evaluation for a web-based MORPG system with our tabu search shows that it could achieve 420 users capacity while 320 for our previous system.
An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters
Hui ZHAO Shuqiang YANG Hua FAN Zhikun CHEN Jinghu XU

PAPER

Page(s):
2654-2662
Scheduling plays a key role in MapReduce systems. In this paper, we explore the efficiency of an MapReduce cluster running lots of independent and continuously arriving MapReduce jobs. Data locality and load balancing are two important factors to improve computation efficiency in MapReduce systems for data-intensive computations. Traditional cluster scheduling technologies are not well suitable for MapReduce environment, there are some in-used schedulers for the popular open-source Hadoop MapReduce implementation, however, they can not well optimize both factors. Our main objective is to minimize total flowtime of all jobs, given it's a strong NP-hard problem, we adopt some effective heuristics to seek satisfied solution. In this paper, we formalize the scheduling problem as job selection problem, a load balance aware job selection algorithm is proposed, in task level we design a strict data locality tasks scheduling algorithm for map tasks on map machines and a load balance aware scheduling algorithm for reduce tasks on reduce machines. Comprehensive experiments have been conducted to compare our scheduling strategy with well-known Hadoop scheduling strategies. The experimental results validate the efficiency of our proposed scheduling strategy.
A WAN-Optimized Live Storage Migration Mechanism toward Virtual Machine Evacuation upon Severe Disasters
Takahiro HIROFUCHI Mauricio TSUGAWA Hidemoto NAKADA Tomohiro KUDOH Satoshi ITOH

PAPER

Page(s):
2663-2674
Wide-area VM migration is a technology with potential to aid IT services recovery since it can be used to evacuate virtualized servers to safe locations upon a critical disaster. However, the amount of data involved in a wide-area VM migration is substantially larger compared to VM migrations within LAN due to the need to transfer virtualized storage in addition to memory and CPU states. This increase of data makes it challenging to relocate VMs under a limited time window with electrical power. In this paper, we propose a mechanism to improve live storage migration across WAN. The key idea is to reduce the amount of data to be transferred by proactively caching virtual disk blocks to a backup site during regular VM operation. As a result of pre-cached disk blocks, the proposed mechanism can dramatically reduce the amount of data and consequently the time required to live migrate the entire VM state. The mechanism was evaluated using a prototype implementation under different workloads and network conditions, and we confirmed that it dramatically reduces the time to complete a VM live migration. By using the proposed mechanism, it is possible to relocate a VM from Japan to the United States in just under 40 seconds. This relocation would otherwise take over 1500 seconds, demonstrating that the proposed mechanism was able to reduce the migration time by 97.5%.
Cooperative VM Migration: A Symbiotic Virtualization Mechanism by Leveraging the Guest OS Knowledge
Ryousei TAKANO Hidemoto NAKADA Takahiro HIROFUCHI Yoshio TANAKA Tomohiro KUDOH

PAPER

Page(s):
2675-2683
A virtual machine (VM) migration is useful for improving flexibility and maintainability in cloud computing environments. However, VM monitor (VMM)-bypass I/O technologies, including PCI passthrough and SR-IOV, in which the overhead of I/O virtualization can be significantly reduced, make VM migration impossible. This paper proposes a novel and practical mechanism, called Symbiotic Virtualization (SymVirt), for enabling migration and checkpoint/restart on a virtualized cluster with VMM-bypass I/O devices, without the virtualization overhead during normal operations. SymVirt allows a VMM to cooperate with a message passing layer on the guest OS, then it realizes VM-level migration and checkpoint/restart by using a combination of a user-level dynamic device configuration and coordination of distributed VMMs. We have implemented the proposed mechanism on top of QEMU/KVM and the Open MPI system. All PCI devices, including Infiniband, Ethernet, and Myrinet, are supported without implementing specific para-virtualized drivers; and it is not necessary to modify either of the MPI runtime and applications. Using the proposed mechanism, we demonstrate reactive and proactive FT mechanisms on a virtualized Infiniband cluster. We have confirmed the effectiveness using both a memory intensive micro benchmark and the NAS parallel benchmark.
An Approximated Selection Algorithm for Combinations of Content with Virtual Local Server for Traffic Localization in Peer-Assisted Content Delivery Networks
Naoya MAKI Ryoichi SHINKUMA Tatsuro TAKAHASHI

PAPER

Page(s):
2684-2695
Our prior papers proposed a traffic engineering scheme to further localize traffic in peer-assisted content delivery networks (CDNs). This scheme periodically combines the content files and allows them to obtain the combined content files while keeping the price unchanged from the single-content price in order to induce altruistic clients to download content files that are most likely to contribute to localizing network traffic. However, the selection algorithm in our prior work determined which and when content files should be combined according to the cache states of all clients, which is a kind of unrealistic assumption in terms of computational complexity. This paper proposes a new concept of virtual local server to reduce the computational complexity. We could say that the source server in our mechanism has a virtual caching network inside that reflects the cache states of all clients in the ‘actual’ caching network and combines content files based on the virtual caching network. In this paper, without determining virtual caching network according to the cache states of all clients, we approximately estimated the virtual caching network from the cache states of the virtual local server of the local domain, which is the aggregated cache state of only altruistic clients in a local domain. Furthermore, we proposed a content selection algorithm based on a virtual caching network. In this paper, we used news life-cycle model as a content model that had the severe changes in cache states, which was a striking instance of dynamic content models. Computer simulations confirmed that our proposed algorithm successfully localized network traffic.
Reputation-Based Colluder Detection Schemes for Peer-to-Peer Content Delivery Networks
Ervianto ABDULLAH Satoshi FUJITA

PAPER

Page(s):
2696-2703
Recently Peer-to-Peer Content Delivery Networks (P2P CDNs) have attracted considerable attention as a cost-effective way to disseminate digital contents to paid users in a scalable and dependable manner. However, due to its peer-to-peer nature, it faces threat from “colluders” who paid for the contents but illegally share them with unauthorized peers. This means that the detection of colluders is a crucial task for P2P CDNs to preserve the right of contents holders and paid users. In this paper, we propose two colluder detection schemes for P2P CDNs. The first scheme is based on the reputation collected from all peers participating in the network and the second scheme improves the quality of colluder identification by using a technique which is well known in the field of system level diagnosis. The performance of the schemes is evaluated by simulation. The simulation results indicate that even when 10% of authorized peers are colluders, our schemes identify all colluders without causing misidentifications.
An Auction Based Distribute Mechanism for P2P Adaptive Bandwidth Allocation
Fang ZUO Wei ZHANG

PAPER

Page(s):
2704-2712
In P2P applications, networks are formed by devices belonging to independent users. Therefore, routing hotspots or routing congestions are typically created by an unanticipated new event that triggers an unanticipated surge of users to request streaming service from some particular nodes; and a challenging problem is how to provide incentive mechanisms to allocation bandwidth more fairly in order to avoid congestion and other short backs for P2P QoS. In this paper, we study P2P bandwidth game — the bandwidth allocation in P2P networks. Unlike previous works which focus either on routing or on forwarding, this paper investigates the game theoretic mechanism to incentivize node's real bandwidth demands and propose novel method that avoid congestion proactively, that is, prior to a congestion event. More specifically, we define an incentive-compatible pricing vector explicitly and give theoretical proofs to demonstrate that our mechanism can provide incentives for nodes to tell the true bandwidth demand. In order to apply this mechanism to the P2P distribution applications, we evaluate our mechanism by NS-2 simulations. The simulation results show that the incentive pricing mechanism can distribute the bandwidth fairly and effectively and can also avoid the routing hotspot and congestion effectively.
A Cost-Effective Buffer Map Notification Scheme for P2P VoDs Supporting VCR Operations
Ryusuke UEDERA Satoshi FUJITA

PAPER

Page(s):
2713-2719
In this paper, we propose a new buffer map notification scheme for Peer-to-Peer Video-on-Demand systems (P2P VoDs) which support VCR operations such as fast-forward, fast-backward, and seek. To enhance the fluidity of such VCR operations, we need to refine the size of each piece as small as possible. However, such a refinement significantly degrades the performance of buffer map notification schemes with respect to the overhead, piece availability and the efficiency of resource utilizations. The basic idea behind our proposed scheme is to use a piece-based buffer map with a segment-based buffer map in a complementary manner. The result of simulations indicates that the proposed scheme certainly increases the accuracy of the information on the piece availability in the neighborhood with a sufficiently low cost, which reduces the intermittent waiting time of each peer by more than 40% even under a situation in which 50% of peers conduct the fast-forward operation over a range of 30% of the entire video.
Synchronization-Aware Virtual Machine Scheduling for Parallel Applications in Xen
Cheol-Ho HONG Chuck YOO

LETTER

Page(s):
2720-2723
In this paper, we propose a synchronization-aware VM scheduler for parallel applications in Xen. The proposed scheduler prevents threads from waiting for a significant amount of time during synchronization. For this purpose, we propose an identification scheme that can identify the threads that have awaited other threads for a long time. In this scheme, a detection module that can infer the internal status of guest OSs was developed. We also present a scheduling policy that can accelerate bottlenecks of concurrent VMs. We implemented our VM scheduler in the recent Xen hypervisor with para-virtualized Linux-based operating systems. We show that our approach can improve the performance of concurrent VMs by up to 43% as compared to the credit scheduler.
An Efficient O(1) Contrast Enhancement Algorithm Using Parallel Column Histograms
Yan-Tsung PENG Fan-Chieh CHENG Shanq-Jang RUAN

LETTER

Page(s):
2724-2725
Display devices play image files, of which contrast enhancement methods are usually employed to bring out visual details to achieve better visual quality. However, applied to high resolution images, the contrast enhancement method entails high computation costs mostly due to histogram computations. Therefore, this letter proposes a parallel histogram calculation algorithm using the column histograms and difference histograms to reduce histogram computations. Experimental results show that the proposed algorithm is effective for histogram-based image contrast enhancement.

Regular Section

Lower-Energy Structure Optimization of (C₆₀)_N Clusters Using an Improved Genetic Algorithm
Guifang SHAO Wupeng HONG Tingna WANG Yuhua WEN

PAPER-Fundamentals of Information Systems

Page(s):
2726-2732
An improved genetic algorithm is employed to optimize the structure of (C₆₀)_N (N≤25) fullerene clusters with the lowest energy. First, crossover with variable precision, realized by introducing the hamming distance, is developed to provide a faster search mechanism. Second, the bit string mutation and feedback mutation are incorporated to maintain the diversity in the population. The interaction between C₆₀ molecules is described by the Pacheco and Ramalho potential derived from first-principles calculations. We compare the performance of the Improved GA (IGA) with that of the Standard GA (SGA). The numerical and graphical results verify that the proposed approach is faster and more robust than the SGA. The second finite differential of the total energy shows that the (C₆₀)_N clusters with N=7, 13, 22 are particularly stable. Performance with the lowest energy is achieved in this work.
Teachability of a Subclass of Simple Deterministic Languages
Yasuhiro TAJIMA

PAPER-Fundamentals of Information Systems

Page(s):
2733-2742
We show teachability of a subclass of simple deterministic languages. The subclass we define is called stack uniform simple deterministic languages. Teachability is derived by showing the query learning algorithm for this language class. Our learning algorithm uses membership, equivalence and superset queries. Then, it terminates in polynomial time. It is already known that simple deterministic languages are polynomial time query learnable by context-free grammars. In contrast, our algorithm guesses a hypothesis by a stack uniform simple deterministic grammar, thus our result is strict teachability of the subclass of simple deterministic languages. In addition, we discuss parameters of the polynomial for teachability. The “thickness” is an important parameter for parsing and it should be one of parameters to evaluate the time complexity.
Improvement of Steiner Tree Algorithm: Branch-Based Multi-Cast
Hiroshi MATSUURA

PAPER-Fundamentals of Information Systems

Page(s):
2743-2752
There is a well known Steiner tree algorithm called minimum-cost paths heuristic (MPH), which is used for many multicast network operations and is considered a benchmark for other Steiner tree algorithms. MPH's average case time complexity is O(m(l+nlog n)), where m is the number of end nodes, n is the number of nodes, and l is the number of links in the network, because MPH has to run Dijkstra's algorithm as many times as the number of end nodes. The author recently proposed a Steiner tree algorithm called branch-based multi-cast (BBMC), which produces exactly the same multicast tree as MPH in a constant processing time irrespective of the number of multicast end nodes. However, the theoretical result for the average case time complexity of BBMC was expressed as O(log m(l+nlog n)) and could not accurately reflect the above experimental result. This paper proves that the average case time complexity of BBMC can be shortened to O(l+nlog n), which is independent of the number of end nodes, when there is an upper limit of the node degree, which is the number of links connected to a node. In addition, a new parameter β is applied to BBMC, so that the multicast tree created by BBMC has less links on it. Even though the tree costs increase due to this parameter, the tree cost increase rates are much smaller than the link decrease rates.
Vertical Link On/Off Regulations for Inductive-Coupling Based Wireless 3-D NoCs
Hao ZHANG Hiroki MATSUTANI Yasuhiro TAKE Tadahiro KURODA Hideharu AMANO

PAPER-Computer System

Page(s):
2753-2764
We propose low-power techniques for wireless three-dimensional Network-on-Chips (wireless 3-D NoCs), in which the connections among routers on the same chip are wired while the routers on different chips are connected wirelessly using inductive-coupling. The proposed low-power techniques stop the clock and power supplies to the transmitter of the wireless vertical links only when their utilizations are higher than the threshold. Meanwhile, the whole wireless vertical link will be shut down when the utilization is lower than the threshold in order to reduce the power consumption of wireless 3-D NoCs. This paper uses an on-demand method, in which the dormant data transmitter or the whole vertical link will be activated as long as a flit comes. Full-system many-core simulations using power parameters derived from a real chip implementation show that the proposed low-power techniques reduce the power consumption by 23.4%-29.3%, while the performance overhead is less than 2.4%.
Window Memory Layout Scheme for Alternate Row-Wise/Column-Wise Matrix Access
Lei GUO Yuhua TANG Yong DOU Yuanwu LEI Meng MA Jie ZHOU

PAPER-Computer System

Page(s):
2765-2775
The effective bandwidth of the dynamic random-access memory (DRAM) for the alternate row-wise/column-wise matrix access (AR/CMA) mode, which is a basic characteristic in scientific and engineering applications, is very low. Therefore, we propose the window memory layout scheme (WMLS), which is a matrix layout scheme that does not require transposition, for AR/CMA applications. This scheme maps one row of a logical matrix into a rectangular memory window of the DRAM to balance the bandwidth of the row- and column-wise matrix access and to increase the DRAM IO bandwidth. The optimal window configuration is theoretically analyzed to minimize the total number of no-data-visit operations of the DRAM. Different WMLS implementationsare presented according to the memory structure of field-programmable gata array (FPGA), CPU, and GPU platforms. Experimental results show that the proposed WMLS can significantly improve DRAM bandwidth for AR/CMA applications. achieved speedup factors of 1.6× and 2.0× are achieved for the general-purpose CPU and GPU platforms, respectively. For the FPGA platform, the WMLS DRAM controller is custom. The maximum bandwidth for the AR/CMA mode reaches 5.94 GB/s, which is a 73.6% improvement compared with that of the traditional row-wise access mode. Finally, we apply WMLS scheme for Chirp Scaling SAR application, comparing with the traditional access approach, the maximum speedup factors of 4.73X, 1.33X and 1.56X can be achieved for FPGA, CPU and GPU platform, respectively.
Accelerating Range Query Processing on R-Tree Using Graphics Processing Units
Boseon YU Hyunduk KIM Wonik CHOI Dongseop KWON

PAPER-Data Engineering, Web Information Systems

Page(s):
2776-2785
Recently, various research efforts have been conducted to develop strategies for accelerating multi-dimensional query processing using the graphics processing units (GPUs). However, well-known multi-dimensional access methods such as the R-tree, B-tree, and their variants are hardly applicable to GPUs in practice, mainly due to the characteristics of a hierarchical index structure. More specifically, the hierarchical structure not only causes frequent transfers of small volumes of data but also provides limited opportunity to exploit the advanced data parallelism of GPUs. To address these problems, we propose an approach that uses GPUs as a buffer. The main idea is that object entries in recently visited leaf nodes are buffered in the global memory of GPUs and processed by massive parallel threads of the GPUs. Through extensive performance studies, we observed that the proposed approach achieved query performance up to five times higher than that of the original R-tree.
Improving Text Categorization with Semantic Knowledge in Wikipedia
Xiang WANG Yan JIA Ruhua CHEN Hua FAN Bin ZHOU

PAPER-Artificial Intelligence, Data Mining

Page(s):
2786-2794
Text categorization, especially short text categorization, is a difficult and challenging task since the text data is sparse and multidimensional. In traditional text classification methods, document texts are represented with “Bag of Words (BOW)” text representation schema, which is based on word co-occurrence and has many limitations. In this paper, we mapped document texts to Wikipedia concepts and used the Wikipedia-concept-based document representation method to take the place of traditional BOW model for text classification. In order to overcome the weakness of ignoring the semantic relationships among terms in document representation model and utilize rich semantic knowledge in Wikipedia, we constructed a semantic matrix to enrich Wikipedia-concept-based document representation. Experimental evaluation on five real datasets of long and short text shows that our approach outperforms the traditional BOW method.
A Practical and Optimal Path Planning for Autonomous Parking Using Fast Marching Algorithm and Support Vector Machine
Quoc Huy DO Seiichi MITA Keisuke YONEDA

PAPER-Artificial Intelligence, Data Mining

Page(s):
2795-2804
This paper proposes a novel practical path planning framework for autonomous parking in cluttered environments with narrow passages. The proposed global path planning method is based on an improved Fast Marching algorithm to generate a path while considering the moving forward and backward maneuver. In addition, the Support Vector Machine is utilized to provide the maximum clearance from obstacles considering the vehicle dynamics to provide a safe and feasible path. The algorithm considers the most critical points in the map and the complexity of the algorithm is not affected by the shape of the obstacles. We also propose an autonomous parking scheme for different parking situation. The method is implemented on autonomous vehicle platform and validated in the real environment with narrow passages.
Unsupervised Sentiment-Bearing Feature Selection for Document-Level Sentiment Classification
Yan LI Zhen QIN Weiran XU Heng JI Jun GUO

PAPER-Pattern Recognition

Page(s):
2805-2813
Text sentiment classification aims to automatically classify subjective documents into different sentiment-oriented categories (e.g. positive/negative). Given the high dimensionality of features describing documents, how to effectively select the most useful ones, referred to as sentiment-bearing features, with a lack of sentiment class labels is crucial for improving the classification performance. This paper proposes an unsupervised sentiment-bearing feature selection method (USFS), which incorporates sentiment discriminant analysis (SDA) into sentiment strength calculation (SSC). SDA applies traditional linear discriminant analysis (LDA) in an unsupervised manner without losing local sentiment information between documents. We use SSC to calculate the overall sentiment strength for each single feature based on its affinities with some sentiment priors. Experiments, performed using benchmark movie reviews, demonstrated the superior performance of USFS.
A Robust Signal Recognition Method for Communication System under Time-Varying SNR Environment
Jing-Chao LI Yi-Bing LI Shouhei KIDERA Tetsuo KIRIMOTO

PAPER-Pattern Recognition

Page(s):
2814-2819
As a consequence of recent developments in communications, the parameters of communication signals, such as the modulation parameter values, are becoming unstable because of time-varying SNR under electromagnetic conditions. In general, it is difficult to classify target signals that have time-varying parameters using traditional signal recognition methods. To overcome this problem, this study proposes a novel recognition method that works well even for such time-dependent communication signals. This method is mainly composed of feature extraction and classification processes. In the feature extraction stage, we adopt Shannon entropy and index entropy to obtain the stable features of modulated signals. In the classification stage, the interval gray relation theory is employed as suitable for signals with time-varying parameter spaces. The advantage of our method is that it can deal with time-varying SNR situations, which cannot be handled by existing methods. The results from numerical simulation show that the proposed feature extraction algorithm, based on entropy characteristics in time-varying SNR situations,offers accurate clustering performance, and the classifier, based on interval gray relation theory, can achieve a recognition rate of up to 82.9%, even when the SNR varies from -10 to -6 dB.
Robust Multi-Bit Watermarking for Free-View Television Using Light Field Rendering
Huawei TIAN Yao ZHAO Zheng WANG Rongrong NI Lunming QIN

PAPER-Image Processing and Video Processing

Page(s):
2820-2829
With the rapid development of multi-view video coding (MVC) and light field rendering (LFR), Free-View Television (FTV) has emerged as new entrainment equipment, which can bring more immersive and realistic feelings for TV viewers. In FTV broadcasting system, the TV-viewer can freely watch a realistic arbitrary view of a scene generated from a number of original views. In such a scenario, the ownership of the multi-view video should be verified not only on the original views, but also on any virtual view. However, capacities of existing watermarking schemes as copyright protection methods for LFR-based FTV are only one bit, i.e., presence or absence of the watermark, which seriously impacts its usage in practical scenarios. In this paper, we propose a robust multi-bit watermarking scheme for LFR-based free-view video. The direct-sequence code division multiple access (DS-CDMA) watermark is constructed according to the multi-bit message and embedded into DCT domain of each view frame. The message can be extracted bit-by-bit from a virtual frame generated at an arbitrary view-point with a correlation detector. Furthermore, we mathematically prove that the watermark can be detected from any virtual view. Experimental results also show that the watermark in FTV can be successfully detected from a virtual view. Moreover, the proposed watermark method is robust against common signal processing attacks, such as Gaussian filtering, salt & peppers noising, JPEG compression, and center cropping.
Nonlinear Metric Learning with Deep Independent Subspace Analysis Network for Face Verification
Xinyuan CAI Chunheng WANG Baihua XIAO Yunxue SHAO

PAPER-Image Recognition, Computer Vision

Page(s):
2830-2838
Face verification is the task of determining whether two given face images represent the same person or not. It is a very challenging task, as the face images, captured in the uncontrolled environments, may have large variations in illumination, expression, pose, background, etc. The crucial problem is how to compute the similarity of two face images. Metric learning has provided a viable solution to this problem. Until now, many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation. In this paper, we propose a nonlinear metric learning method, which learns an explicit mapping from the original space to an optimal subspace using deep Independent Subspace Analysis (ISA) network. Compared to the linear or kernel based metric learning methods, the proposed deep ISA network is a deep and local learning architecture, and therefore exhibits more powerful ability to learn the nature of highly variable dataset. We evaluate our method on the Labeled Faces in the Wild dataset, and results show superior performance over some state-of-the-art methods.
Improved Color Barycenter Model and Its Separation for Road Sign Detection
Qieshi ZHANG Sei-ichiro KAMATA

PAPER-Image Recognition, Computer Vision

Page(s):
2839-2849
This paper proposes an improved color barycenter model (CBM) and its separation for automatic road sign (RS) detection. The previous version of CBM can find out the colors of RS, but the accuracy is not high enough for separating the magenta and blue regions and the influence of number with the same color are not considered. In this paper, the improved CBM expands the barycenter distribution to cylindrical coordinate system (CCS) and takes the number of colors at each position into account for clustering. Under this distribution, the color information can be represented more clearly for analyzing. Then aim to the characteristic of barycenter distribution in CBM (CBM-BD), a constrained clustering method is presented to cluster the CBM-BD in CCS. Although the proposed clustering method looks like conventional K-means in some part, it can solve some limitations of K-means in our research. The experimental results show that the proposed method is able to detect RS with high robustness.
Depth Perception Control during Car Vibration by Hidden Images on Monocular Head-Up Display
Tsuyoshi TASAKI Akihisa MORIYA Aira HOTTA Takashi SASAKI Haruhiko OKUMURA

PAPER-Multimedia Pattern Processing

Page(s):
2850-2856
A novel depth perception control method for a monocular head-up display (HUD) in a car has been developed, which is called the dynamic perspective method. The method changes a size and a position of the HUD image such as arrow for depth perception and achieves a depth perception position of 120 [m] within an error of 30% in a simulation. However, it is difficult to achieve an accurate depth perception in the real world because of car vibration. To solve this problem, we focus on a property, namely, that people complement hidden images by previous continuously observed images. We hide the image on the HUD when the car is vibrated very much. We aim to point at the accurate depth position by using see-through HUD images while having users complement the hidden image positions based on the continuous images before car vibration. We developed a car that detects big vibration by an acceleration sensor and is equipped with our monocular HUD. Our new method pointed at the depth position more accurately than the previous method, which was confirmed by t-test.
Semi-Automatically Extracting Features from Source Code of Android Applications
Tetsuya KANDA Yuki MANABE Takashi ISHIO Makoto MATSUSHITA Katsuro INOUE

LETTER-Software Engineering

Page(s):
2857-2859
It is not always easy for an Android user to choose the most suitable application for a particular task from the great number of applications available. In this paper, we propose a semi-automatic approach to extract feature names from Android applications. The case study verifies that we can associate common sequences of Android API calls with feature names.
Apps at Hand: Personalized Live Homescreen Based on Mobile App Usage Prediction
Xiao XIA Xinye LIN Xiaodong WANG Xingming ZHOU Deke GUO

LETTER-Information Network

Page(s):
2860-2864
To facilitate the discovery of mobile apps in personal devices, we present the personalized live homescreen system. The system mines the usage patterns of mobile apps, generates personalized predictions, and then makes apps available at users' hands whenever they want them. Evaluations have verified the promising effectiveness of our system.
A Trusted Network Access Protocol for WLAN Mesh Networks
Yuelei XIAO Yumin WANG Liaojun PANG Shichong TAN

LETTER-Information Network

Page(s):
2865-2869
To solve the problems of the existing trusted network access protocols for Wireless Local Area Network (WLAN) mesh networks, we propose a new trusted network access protocol for WLAN mesh networks, which is abbreviated as WMN-TNAP. This protocol implements mutual user authentication and Platform-Authentication between the supplicant and Mesh Authenticator (MA), and between the supplicant and Authentication Server (AS) of a WLAN mesh network, establishes the key management system for the WLAN mesh network, and effectively prevents the platform configuration information of the supplicant, MA and AS from leaking out. Moreover, this protocol is proved secure based on the extended Strand Space Model (SSM) for trusted network access protocols.
Distributed One-Time Keyboard Systems
YoungLok PARK MyungKeun YOON

LETTER-Dependable Computing

Page(s):
2870-2872
When attackers compromise a client system, they can steal user input. We propose a distributed one-time keyboard system to prevent information leakage via keyboard typing. We define the problem of secure keyboard arrangement over distributed multi-devices and channels. An analytical model is proposed for the optimal keyboard layout.
Personal Information Extraction from Korean Obituaries
Kyoung-Soo HAN

LETTER-Artificial Intelligence, Data Mining

Page(s):
2873-2876
Pieces of personal information, such as personal names and relationships, are crucial in text mining applications. Obituaries are good sources for this kind of information. This study proposes an effective method for extracting various facts about people from obituary Web pages. Experiments show that the proposed method achieves high performance in terms of recall and precision.
Robot Exploration in a Dynamic Environment Using Hexagonal Grid Coverage
Kihong KIM SeongOun HWANG

LETTER-Artificial Intelligence, Data Mining

Page(s):
2877-2881
Robot covering problem has gained attention as having the most promising applications in our real life. Previous spanning tree coverage algorithm addressed this problem well in a static environment, but not in a dynamic one. In this paper, we present and analyze our algorithm workable in a dynamic environment with less shadow areas.
A Novel Pedestrian Detector on Low-Resolution Images: Gradient LBP Using Patterns of Oriented Edges
Ahmed BOUDISSA Joo Kooi TAN Hyoungseop KIM Takashi SHINOMIYA Seiji ISHIKAWA

LETTER-Pattern Recognition

Page(s):
2882-2887
This paper introduces a simple algorithm for pedestrian detection on low resolution images. The main objective is to create a successful means for real-time pedestrian detection. While the framework of the system consists of edge orientations combined with the local binary patterns (LBP) feature extractor, a novel way of selecting the threshold is introduced. Using the mean-variance of the background examples this threshold improves significantly the detection rate as well as the processing time. Furthermore, it makes the system robust to uniformly cluttered backgrounds, noise and light variations. The test data is the INRIA pedestrian dataset and for the classification, a support vector machine with a radial basis function (RBF) kernel is used. The system performs at state-of-the-art detection rates while being intuitive as well as very fast which leaves sufficient processing time for further operations such as tracking and danger estimation.
Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP
Ji-Hyun SONG Sangmin LEE

LETTER-Speech and Hearing

Page(s):
2888-2891
In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.
Pixel and Patch Reordering for Fast Patch Selection in Exemplar-Based Image Inpainting
Baeksop KIM Jiseong KIM Jungmin SO

LETTER-Image Processing and Video Processing

Page(s):
2892-2895
This letter presents a scheme to improve the running time of exemplar-based image inpainting, first proposed by Criminisi et al. In the exemplar-based image inpainting, a patch that contains unknown pixels is compared to all the patches in the known region in order to find the best match. This is very time-consuming and hinders the practicality of Criminisi's method to be used in real time. We show that a simple bounding algorithm can significantly reduce number of distance calculations, and thus the running time. Performance of the bounding algorithm is affected by the order of patches that are compared, as well as the order of pixels in a patch. We present pixel and patch ordering schemes that improve the performance of bounding algorithms. Experiments with well-known images used in inpainting literature show that the proposed reordering scheme can reduce running time of the bounding algorithm up to 50%.
Modeling Interactions between Low-Level and High-Level Features for Human Action Recognition
Wen ZHOU Chunheng WANG Baihua XIAO Zhong ZHANG Yunxue SHAO

LETTER-Image Recognition, Computer Vision

Page(s):
2896-2899
Recognizing human action in complex scenes is a challenging problem in computer vision. Some action-unrelated concepts, such as camera position features, could significantly affect the appearance of local spatio-temporal features, and therefore the performance of low-level features based methods degrades. In this letter, we define the action-unrelated concept: the position of camera as high-level features. We observe that they can serve as a prior to local spatio-temporal features for human action recognition. We encode this prior by modeling interactions between spatio-temporal features and camera position features. We infer camera position features from local spatio-temporal features via these interactions. The parameters of this model are estimated by a new max-margin algorithm. We evaluate the proposed method on KTH, IXMAS and Youtube actions datasets. Experimental results show the effectiveness of the proposed method.
Multiple-Shot Person Re-Identification by Pairwise Multiple Instance Learning
Chunxiao LIU Guijin WANG Xinggang LIN

LETTER-Image Recognition, Computer Vision

Page(s):
2900-2903
Learning an appearance model for person re-identification from multiple images is challenging due to the corrupted images caused by occlusion or false detection. Furthermore, different persons may wear similar clothes, making appearance feature less discriminative. In this paper, we first introduce the concept of multiple instance to handle corrupted images. Then a novel pairwise comparison based multiple instance learning framework is proposed to deal with visual ambiguity, by selecting robust features through pairwise comparison. We demonstrate the effectiveness of our method on two public datasets.
A New Face Relighting Method Based on Edge-Preserving Filter
Lingyu LIANG Lianwen JIN

LETTER-Computer Graphics

Page(s):
2904-2907
We propose a new face relighting method using an illuminance template generated from a single reference portrait. First, the reference is wrapped according to the shape of the target. Second, we employ a new spatially variant edge-preserving smoothing filter to remove the facial identity and texture details of the wrapped reference, and obtain the illumination template. Finally, we relight the target with the template in CIELAB color space. Experiments show the effectiveness of our method for both grayscale and color faces taken from different databases, and the comparisons with previous works demonstrate a better relighting effect produced by our method.

IEICE TRANSACTIONS on Information

Advance publication (published online immediately after acceptance)

Volume E96-D No.12 (Publication Date:2013/12/01)

FOREWORD Open Access

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles