Lihan TONG Weijia LI Qingxia YANG Liyuan CHEN Peng CHEN
Yinan YANG
Myung-Hyun KIM Seungkwang LEE
Shuoyan LIU Chao LI Yuxin LIU Yanqiu WANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Martin LUKAC Saadat NURSULTAN Georgiy KRYLOV Oliver KESZOCZE Abilmansur RAKHMETTULAYEV Michitaka KAMEYAMA
Zheqing ZHANG Hao ZHOU Chuan LI Weiwei JIANG
Liu ZHANG Zilong WANG Yindong CHEN
Wenxia BAO An LIN Hua HUANG Xianjun YANG Hemu CHEN
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Haruhiko KAIYA Shinpei OGATA Shinpei HAYASHI
Jiakai LI Jianyong DUAN Hao WANG Li HE Qing ZHANG
Yuxin HUANG Yuanlin YANG Enchang ZHU Yin LIANG Yantuan XIAN
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Na XING Lu LI Ye ZHANG Shiyi YANG
Zhe WANG Zhe-Ming LU Hao LUO Yang-Ming ZHENG
Rina TAGAMI Hiroki KOBAYASHI Shuichi AKIZUKI Manabu HASHIMOTO
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Hongzhi XU Binlian ZHANG
Weizhi WANG Lei XIA Zhuo ZHANG Xiankai MENG
Yuka KO Katsuhito SUDOH Sakriani SAKTI Satoshi NAKAMURA
Rinka KAWANO Masaki KAWAMURA
Zhishuo ZHANG Chengxiang TAN Xueyan ZHAO Min YANG
Peng WANG Guifen CHEN Zhiyao SUN
Zeyuan JU Zhipeng LIU Yu GAO Haotian LI Qianhang DU Kota YOSHIKAWA Shangce GAO
Ji WU Ruoxi YU Kazuteru NAMBA
Hao WANG Yao MA Jianyong DUAN Li HE Xin LI
Shijie WANG Xuejiao HU Sheng LIU Ming LI Yang LI Sidan DU
Arata KANEKO Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Qi LIU Bo WANG Shihan TAN Shurong ZOU Wenyi GE
HanYu ZHANG Tomoji KISHI
Shinobu NAGAYAMA Tsutomu SASAO Jon T. BUTLER
Yoon Hak KIM
Takashi HIRAYAMA Rin SUZUKI Katsuhisa YAMANAKA Yasuaki NISHITANI
Yosuke IIJIMA Atsunori OKADA Yasushi YUMINAKA
Batnasan LUVAANJALBA Elaine Yi-Ling WU
KuanChao CHU Satoshi YAMAZAKI Hideki NAKAYAMA
Shenglei LI Haoran LUO Tengfei SHAO Reiko HISHIYAMA
Yasushi YUMINAKA Kazuharu NAKAJIMA Yosuke IIJIMA
Chunbo LIU Liyin WANG Zhikai ZHANG Chunmiao XIANG Zhaojun GU Zhi WANG Shuang WANG
Jia-ji JIANG Hai-bin WAN Hong-min SUN Tuan-fa QIN Zheng-qiang WANG
Yuhao LIU Zhenzhong CHU Lifei WEI
Ken ASANO Masanori NATSUI Takahiro HANYU
Shuto HASEGAWA Koichiro ENOMOTO Taeko MIZUTANI Yuri OKANO Takenori TANAKA Osamu SAKAI
Zhewei XU Mizuho IWAIHARA
Takao WAHO Akihisa KOYAMA Hitoshi HAYASHI
Taisei SAITO Kota ANDO Tetsuya ASAI
Shiyu YANG Tetsuya KANDA Daniel M. GERMAN Yoshiki HIGO
Tsutomu SASAO
Jiyeon LEE
Koichi MORIYAMA Akira OTSUKA
Hongliang FU Qianqian LI Huawei TAO Chunhua ZHU Yue XIE Ruxue GUO
Gao WANG Gaoli WANG Siwei SUN
Hua HUANG Yiwen SHAN Chuan LI Zhi WANG
Zhi LIU Heng WANG Yuan LI Hongyun LU Hongyuan JING Mengmeng ZHANG
Tomoyasu NAKANO Masataka GOTO
Hyebong CHOI Joel SHIN Jeongho KIM Samuel YOON Hyeonmin PARK Hyejin CHO Jiyoung JUNG
Xianglong LI Yuan LI Jieyuan ZHANG Xinhai XU Donghong LIU
Haoran LUO Tengfei SHAO Shenglei LI Reiko HISHIYAMA
Chang SUN Yitong LIU Hongwen YANG
Ji XI Yue XIE Pengxu JIANG Wei JIANG
Ming PAN
Takuma KINUGAWA Toshimitsu USHIO
In spatially distributed systems such as smart buildings and intelligent transportation systems, the control of spatio-temporal patterns is an important issue. In this paper, we consider a finite-horizon optimal spatio-temporal pattern control problem in which the pattern is specified by a signal spatio-temporal logic formula over finite traces, called an SSTLf formula. We give the syntax and Boolean semantics of SSTLf. We then show linear encodings of the temporal and spatial operators used in SSTLf and convert the problem into a mixed-integer programming problem. We illustrate the effectiveness of the proposed approach through an example of a heating system in a room.
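The encoding itself is solver-facing, but the finite-trace Boolean semantics it linearizes can be sketched directly. Below is a minimal Python illustration (our own sketch, not the paper's code) of bounded "eventually" and "always" operators over a finite trace; the temperature trace and the comfort predicate are invented for the example.

```python
def eventually(trace, interval, pred):
    """F_[a,b] pred: pred holds at some step t in [a, b] of the finite trace."""
    a, b = interval
    return any(pred(trace[t]) for t in range(a, min(b, len(trace) - 1) + 1))

def always(trace, interval, pred):
    """G_[a,b] pred: pred holds at every step t in [a, b] of the finite trace."""
    a, b = interval
    return all(pred(trace[t]) for t in range(a, min(b, len(trace) - 1) + 1))

# Toy example: room temperatures over a finite horizon.
temps = [18.0, 19.5, 21.0, 22.5, 22.0, 21.5]
comfortable = lambda x: 20.0 <= x <= 23.0
print(eventually(temps, (0, 5), comfortable))  # True: comfort reached by step 2
print(always(temps, (2, 5), comfortable))      # True: comfort then maintained
```

In the paper's setting, each `pred(trace[t])` becomes a binary variable tied to the signal by linear (big-M style) constraints, and the `any`/`all` above become linear inequalities over those binaries.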
Takashi TOMITA Shigeki HAGIHARA Masaya SHIMAKAWA Naoki YONEZAKI
This paper focuses on verification of reactive system specifications. A reactive system is an open system that continuously interacts with an uncontrollable external environment, and it must often be highly safe and reliable. However, realizability checking for a given specification is very costly, so effective methods are needed to detect and analyze defects in unrealizable specifications in order to refine them efficiently. We introduce a systematic characterization of necessary conditions for realizability. The characterization is based on quantifications over inputs and outputs in early and late behaviors, and it reveals four essential aspects of realizability: exhaustivity, strategizability, preservability, and stability. Additionally, the characterization yields new necessary conditions, which enable us to classify unrealizable specifications systematically and hierarchically.
Kohei TATEISHI Chihiro TSUTAKE Keita TAKAHASHI Toshiaki FUJII
A light field (LF), which is represented as a set of dense multi-view images, has been used in various 3D applications. To make LF acquisition more efficient, researchers have investigated compressive sensing methods that incorporate coding functionality into a camera. In this paper, we focus on a challenging case called snapshot compressive LF imaging, in which an entire LF is reconstructed from only a single acquired image. To embed a large amount of LF information in a single image, we consider two promising methods based on rapid optical control during a single exposure, proposed individually in previous works: time-multiplexed coded aperture (TMCA) and coded focal stack (CFS). TMCA and CFS can be interpreted in a unified manner as extensions of the coded aperture (CA) and focal stack (FS) methods, respectively. By developing a unified algorithm pipeline for TMCA and CFS based on deep neural networks, we evaluated their performance against other possible imaging methods. We found that both TMCA and CFS achieve better reconstruction quality than the other snapshot methods, and they also perform reasonably well compared to methods using multiple acquired images. To our knowledge, we are the first to present an overall discussion of TMCA and CFS and to compare and validate their effectiveness in the context of compressive LF imaging.
Yoshitaka KIDANI Haruhisa KATO Kei KAWAMURA Hiroshi WATANABE
Geometric partitioning mode (GPM) is a new inter prediction tool adopted in Versatile Video Coding (VVC), the latest international video coding standard, developed by the Joint Video Experts Team in 2020. Unlike regular inter prediction, which is performed on rectangular blocks, GPM separates a coding block into two regions along one of 64 pre-defined straight lines, generates inter-predicted samples for each region, and then blends them to obtain the final inter-predicted samples. With this feature, GPM improves prediction accuracy at the boundary between foreground and background regions with different motions. However, GPM leaves room to improve prediction accuracy further if the final predicted samples can be generated using not only inter prediction but also intra prediction. In this paper, we propose a GPM with combined inter and intra prediction to achieve compression capability beyond VVC. To maximize the coding performance of the proposed method, we also propose restricting the number of applicable intra prediction modes and prohibiting the application of intra prediction to both GPM-separated regions. The experimental results show that the proposed method improves the coding gain of the conventional GPM method in VVC by a factor of 1.3, and provides an additional coding gain of 1% bitrate savings in one of the coding structures for low-latency video transmission, where the conventional GPM method cannot be utilized.
Seung-Tak NOH Hiroki HARADA Xi YANG Tsukasa FUKUSATO Takeo IGARASHI
It is important to consider curvature properties around the control points to produce natural-looking results in vector illustration. C2 interpolating splines satisfy point interpolation with local support. Unfortunately, they cannot control the sharpness of a segment because they use a trigonometric blending function that has no degrees of freedom. In this paper, we alter the definition of C2 interpolating splines in both the interpolation curve and the blending function. For the interpolation curve, we adopt a rational Bézier curve that enables the user to tune the shape of the curve around a control point. For the blending function, we generalize the weighting scheme of C2 interpolating splines and replace the trigonometric weight with our novel hyperbolic blending function. By extending this basic definition, we can also handle exact non-C2 features, such as cusps and fillets, without losing generality. In our experiments, we provide both quantitative and qualitative comparisons to existing parametric curve models and discuss the differences among them.
Wenhao HUANG Akira TSUGE Yin CHEN Tadashi OKOSHI Jin NAKAZAWA
The crowdedness of buses plays an increasingly important role in the control of diseases such as COVID-19, yet the lack of a practical approach to sensing bus crowdedness remains a major problem. This paper proposes a bus crowdedness sensing system that exploits deep learning-based object detection to count the numbers of passengers getting on and off a bus and thus estimate the crowdedness of buses in real time. In our prototype system, we combine the YOLOv5s object detection model with a Kalman filter object tracking algorithm to implement a sensing algorithm running on a Jetson Nano-based vehicular device mounted on a bus. Using driving recorder video data taken from a real bus, we experimentally evaluate the performance of the proposed sensing system and verify that it improves counting accuracy and achieves real-time processing on the Jetson Nano platform.
Kotaro MATSUURA Chihiro TSUTAKE Keita TAKAHASHI Toshiaki FUJII
Inspired by the framework of algorithm unrolling, we propose a scalable network architecture that computes layer patterns for light field displays, enabling control of the trade-off between the display quality and the computational cost on a single pre-trained network.
Ana GUASQUE Patricia BALBASTRE
In order to obtain a feasible schedule for a hard real-time system, heuristic-based techniques have been the solution of choice. In recent years, optimization solvers have gained attention from research communities due to their capability of handling large numbers of constraints, and some works have used integer linear programming (ILP) to solve monoprocessor scheduling of real-time systems; indeed, ILP is commonly used for static scheduling of multiprocessor systems. However, the two main solvers are used interchangeably, with no clear answer as to which one is best for obtaining a schedulable system under hard real-time constraints. This paper compares two well-known optimization software packages (CPLEX and GUROBI) on the problem of finding a feasible schedule for monoprocessor hard real-time systems.
Kenya TAJIMA Takahiko HENMI Tsuyoshi KATO
Domain knowledge is useful for improving the generalization performance of learning machines, and sign constraints are a handy representation for combining domain knowledge with a learning machine. In this paper, we consider constraining the signs of the weight coefficients when learning a linear support vector machine, and we develop an optimization algorithm for minimizing the empirical risk under the sign constraints. The algorithm is based on the Frank-Wolfe method, which converges sublinearly and possesses a clear termination criterion. We show that each Frank-Wolfe iteration requires O(nd + d^2) computational cost. Furthermore, we derive an explicit expression for the minimal number of iterations needed to ensure an ε-accurate solution by analyzing the curvature of the objective function. Finally, we empirically demonstrate that sign constraints are a promising technique when similarities to the training examples compose the feature vector.
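The Frank-Wolfe iteration under sign constraints can be sketched in a few lines. The following is our own illustration, not the paper's algorithm: we pair the non-negativity (sign) constraint with an invented upper bound B to obtain the compact domain Frank-Wolfe requires, and we minimize a plain least-squares objective in place of the paper's SVM empirical risk.

```python
import numpy as np

def frank_wolfe_sign_constrained(X, y, B=3.0, n_iter=500):
    """Minimize 0.5*||Xw - y||^2 over the box 0 <= w_i <= B.
    The box combines the non-negativity constraint with an upper bound B,
    giving the compact feasible set Frank-Wolfe needs."""
    n, d = X.shape
    w = np.zeros(d)                      # feasible starting vertex
    for k in range(n_iter):
        grad = X.T @ (X @ w - y)         # gradient: O(nd) per iteration
        s = np.where(grad < 0, B, 0.0)   # linear oracle: best box vertex
        gamma = 2.0 / (k + 2)            # standard step size, O(1/k) rate
        w = (1 - gamma) * w + gamma * s  # convex combination stays feasible
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, 0.0, 2.0])       # non-negative ground truth
y = X @ w_true
w = frank_wolfe_sign_constrained(X, y)
print(np.all(w >= 0))  # True: iterates never leave the constraint set
```

Because every iterate is a convex combination of box vertices, feasibility holds by construction, with no projection step needed; this is the practical appeal of Frank-Wolfe under such constraints.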
Jinyan LU Quanzhen HUANG Shoubing LIU
For intelligent vision measurement, geometric image feature extraction is an essential issue. A contour primitive of interest (CPI) is a regular-shaped contour feature lying on a target object, widely used for geometric calculation in vision measurement and servoing. So that the CPI extraction model can be flexibly applied to novel objects, one-shot learning-based CPI extraction can be implemented with a deep convolutional neural network, using only one annotated support image to guide the CPI extraction process. In this paper, we first propose a multi-stage contour primitives of interest extraction network (MS-CPieNet), which uses a multi-stage strategy to improve discrimination between CPIs and complex background. Second, a spatial non-local attention module is utilized to enhance the deep features by globally fusing image features over both short and long ranges. Moreover, a dense 4-direction classification is designed to obtain the normal direction of the contour, which can be further used in a contour-thinning post-process. The effectiveness of the proposed methods is validated by experiments on the OCP and ROCM datasets. 2-D measurement experiments are also conducted to demonstrate the convenient application of the proposed MS-CPieNet.
Quan XIU HO Takao JINNO Yusuke UCHIMI Shigeru KURIYAMA
The colors of objects in natural images are affected by the color of the lighting, and accurately estimating an illuminant's color is indispensable in analyzing scenes lit by colored lighting. Recent lighting environments have become more colorful with the spread of light-emitting diode (LED) lighting, whose color can be flexibly controlled across the full visible spectrum. However, existing color estimation methods mainly focus on a single illuminant within normal color ranges; the estimation of multiple illuminants with unusual color settings, such as high-chroma blue or red, has not yet been studied. Therefore, new color estimation methods are needed for multiple illuminants of various colors. In this article, we propose a color estimation method for LED lighting using Color Line features, which regard the color distribution in a local area as a straight line. This local estimate is suitable for estimating the various colors of multiple illuminants. The features are sampled at many small regions in an image and aggregated to estimate a few global colors using supervised learning with a convolutional neural network. We demonstrate the higher accuracy of our method over existing ones for such colorful lighting environments by producing an image dataset lit by multiple LED lights in a full color range.
In recent years, deep neural networks (DNNs) have made a significant impact on a variety of research fields and applications. One drawback of DNNs is that they require a huge amount of data for training. Since it is very expensive to ask experts to label the data, many non-expert data collection methods, such as web crawling, have been proposed. However, datasets created by non-experts often contain corrupted labels, and DNNs trained on such datasets are unreliable. Because DNNs have an enormous number of parameters, they tend to overfit to noisy labels, resulting in poor generalization performance. This problem is called Learning with Noisy Labels (LNL). Recent studies showed that DNNs are robust to noisy labels in the early stage of learning, before overfitting to them, because DNNs learn simple patterns first. Therefore, DNNs tend to output the true labels for noisily labeled samples in the early stage of learning, and the number of false predictions is higher for noisily labeled samples than for cleanly labeled ones. Based on these observations, we propose a new sample selection approach for LNL that uses the number of false predictions. Our method periodically collects the records of false predictions during training and selects samples with a low number of false predictions from the recent records. It then iteratively alternates between sample selection and training a DNN model on the updated dataset. Since the model is trained with more clean samples and records more accurate false predictions for sample selection, its generalization performance gradually increases. We evaluated our method on two benchmark datasets, CIFAR-10 and CIFAR-100, with synthetically generated noisy labels, and the obtained results are better than or comparable to state-of-the-art approaches.
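The bookkeeping behind this kind of selection is simple to sketch. The following minimal Python fragment is our own reading of the idea, not the paper's implementation; the class name, `keep_ratio`, and the toy labels are all invented for illustration.

```python
# Sketch of sample selection by false-prediction counts (illustrative only).
from collections import defaultdict

class FalsePredictionSelector:
    def __init__(self):
        self.false_counts = defaultdict(int)

    def record(self, sample_id, predicted_label, given_label):
        """Call once per sample per epoch: count disagreements between the
        model's prediction and the (possibly noisy) given label."""
        if predicted_label != given_label:
            self.false_counts[sample_id] += 1

    def select(self, sample_ids, keep_ratio=0.7):
        """Keep the fraction of samples with the fewest false predictions;
        these are the samples the network fits early, i.e. likely clean."""
        ranked = sorted(sample_ids, key=lambda i: self.false_counts[i])
        return ranked[:int(len(ranked) * keep_ratio)]

# Toy run: sample 2's given label disagrees with the model's output in
# every epoch, so it is ranked last and dropped from the training set.
sel = FalsePredictionSelector()
for epoch in range(3):
    sel.record(0, "cat", "cat")
    sel.record(1, "dog", "dog")
    sel.record(2, "cat", "dog")   # noisy label: model predicts "cat"
print(sel.select([0, 1, 2], keep_ratio=0.7))  # [0, 1]
```

In the full method this selection would be re-run periodically, with training continuing on the reduced, cleaner subset.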
Nenghuan ZHANG Yongbin WANG Xiaoguang WANG Peng YU
Recently, multi-modal fusion methods based on remote sensing data and social sensing data have been widely used in urban region function recognition. However, due to the highly complex noise problem, most existing methods are not robust enough in real-world scenes, which seriously limits their value in urban planning and management. In addition, how to extract valuable periodic features from social sensing data still needs further study. To this end, we propose a multi-modal fusion network guided by feature co-occurrence for urban region function recognition, which leverages the co-occurrence relationships between multi-modal features to identify abnormal noise features, guiding the fusion network to suppress noise features and focus on clean ones. Furthermore, we employ a graph convolutional network incorporating a node weighting layer and an interactive update layer to effectively extract valuable periodic features from social sensing data. Finally, experimental results on publicly available datasets indicate that our proposed method yields promising improvements in both accuracy and robustness over several state-of-the-art methods.
Manaya TOMIOKA Tsuneo KATO Akihiro TAMURA
A neural conversational model (NCM) based on an encoder-decoder recurrent neural network (RNN) with an attention mechanism learns different sequence-to-sequence mappings from those learned by neural machine translation (NMT), even when based on the same technique. In the NCM, we confirmed that the target-word-to-source-word mappings captured by the attention mechanism are not as clear and stationary as those for NMT. Considering that vector norms indicate the magnitude of information in the processing, we analyzed the inner workings of an encoder-decoder GRU-based NCM, focusing on the norms of the word embedding vectors and hidden vectors. First, we conducted correlation analyses of the norms of the word embedding vectors with their frequencies in the training set and with the conditional entropies of a bi-gram language model, to understand what correlates with the norms in the encoder and decoder. Second, we conducted correlation analyses of the norms of change in the hidden vector of the recurrent layer with its input vectors, for the encoder and decoder respectively. These analyses were done to understand how the magnitude of information propagates through the network. The analytical results suggested that the norms of the word embedding vectors are associated with their semantic information in the encoder, while in the decoder they are associated with predictability as a language model. The analytical results further revealed how the norms propagate through the recurrent layer in the encoder and decoder.
Stance prediction on social media aims to infer users' stances towards a specific topic or event when they are not expressed explicitly. Extracting and determining users' stances from user-generated content on social media is of great significance for public opinion analysis. Existing research makes use of various signals, ranging from text content to users' online network connections on these platforms, but it lacks joint modeling of this heterogeneous information for stance prediction. In this paper, we propose a self-supervised heterogeneous graph contrastive learning framework for stance prediction in online debate forums. First, we perform data augmentation on the original heterogeneous information network to generate an augmented view. The original and augmented views are each encoded by a meta-path-based graph encoder. Then, contrastive learning between the two views is conducted to obtain high-quality representations of users and issues. Finally, stance prediction is accomplished by matrix factorization between users and issues. Experimental results on an online debate forum dataset show that our model significantly outperforms other competitive baseline methods.
In this paper, we propose a scheme to strengthen network-based moving target defense with disposable identifiers. The main idea is to change the disposable identifiers for each packet, maximizing unpredictability through a large hopping space and a substantially high hopping frequency. This allows network-based moving target defense to defeat active scanning, passive scanning, and passive host profiling attacks. Experimental results show that the proposed scheme changes the disposable identifiers for each packet while incurring low overhead.
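One standard way to realize per-packet identifiers that are unpredictable to an observer yet reproducible by both endpoints is to derive them from a shared key and a packet counter with a keyed hash. The sketch below is illustrative only and is not the scheme's actual construction; the key, counter width, and identifier length are invented for the example.

```python
# Illustrative per-packet disposable identifiers via HMAC (not the
# paper's construction): endpoints sharing a key can each derive the
# identifier for packet n, while an observer sees a fresh,
# unpredictable value on every packet.
import hmac
import hashlib

def disposable_id(shared_key: bytes, packet_counter: int, nbytes: int = 8) -> bytes:
    """Derive the identifier for the packet with the given counter."""
    msg = packet_counter.to_bytes(8, "big")
    return hmac.new(shared_key, msg, hashlib.sha256).digest()[:nbytes]

key = b"pre-shared secret"
ids = [disposable_id(key, n) for n in range(4)]
print(len(set(ids)))  # 4 distinct identifiers, one per packet
```

The hopping space here is 2^64 (8 bytes) and the hopping frequency is once per packet, matching the large-space, high-frequency design goal described above.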
Yang WANG Hongliang FU Huawei TAO Jing YANG Hongyi GE Yue XIE
This letter focuses on the cross-corpus speech emotion recognition (SER) task, in which the training and testing speech signals belong to different speech corpora. Existing algorithms are incapable of effectively extracting common sentiment information between different corpora to facilitate knowledge transfer. To address this challenging problem, a novel convolutional auto-encoder and adversarial domain adaptation (CAEADA) framework for cross-corpus SER is proposed. The framework first constructs a one-dimensional convolutional auto-encoder (1D-CAE) for feature processing, which can explore the correlation among adjacent one-dimensional statistical features, and the feature representation can be enhanced by its encoder-decoder-style architecture. Subsequently, the adversarial domain adaptation (ADA) module alleviates the feature distribution discrepancy between the source and target domains by confusing the domain discriminator, and specifically employs maximum mean discrepancy (MMD) to better accomplish the feature transformation. To evaluate the proposed CAEADA, extensive experiments were conducted on the EmoDB, eNTERFACE, and CASIA speech corpora, and the results show that the proposed method outperforms other approaches.
Joanna Kazzandra DUMAGPI Yong-Jin JEONG
Fine-grained image analysis, such as pixel-level approaches, improves threat detection in x-ray security images. In the practical setting, the cost of obtaining complete pixel-level annotations increases significantly, which can be reduced by partially labeling the dataset. However, handling partially labeled datasets can lead to training complicated multi-stage networks. In this paper, we propose a new end-to-end object separation framework that trains a single network on a partially labeled dataset while also alleviating the inherent class imbalance at the data and object proposal level. Empirical results demonstrate significant improvement over existing approaches.
The purpose of graph embedding is to learn a lower-dimensional embedding function for graph data. Existing methods usually rely on maximum likelihood estimation (MLE) and often learn an embedding function through conditional mean estimation (CME). However, MLE is well known to be vulnerable to contamination by outliers, and CME might restrict the applicability of graph embedding methods to a limited range of graph data. To cope with these problems, this paper proposes a novel method for graph embedding called robust ratio graph embedding (RRGE). RRGE is based on the ratio estimation between the conditional and marginal probability distributions of link weights given data vectors, and is applicable to a wider range of graph data than CME-based methods. Moreover, to achieve outlier-robust estimation, the ratio is estimated with the γ-cross entropy, a robust alternative to the standard cross entropy. Numerical experiments on artificial data show that RRGE is robust against outliers and performs well even when CME-based methods do not work at all. Finally, the performance of the proposed method is demonstrated on real-world datasets using neural networks.
Privacy violations via spy cameras are becoming increasingly serious. With the recent advent of various smart home IoT devices, such as smart TVs and robot vacuum cleaners, spycam attacks that steal users' information are being carried out in ever more unpredictable ways. In this letter, we introduce a new spycam attack on the mobile WebVR environment. It is performed by a web attacker who maliciously accesses the back-facing cameras of victims' mobile devices while they are browsing the attacker's WebVR site. Through sophisticated content placement in VR scenes, the attacker can capture victims' surroundings at the desired field of view, resulting in serious privacy breaches for mobile VR users. We show that this new threat practically works with major browsers in a stealthy manner.
Zhi LIU Jia CAO Xiaohan GUAN Mengmeng ZHANG
Inter-channel correlation is one of the redundancies that needs to be eliminated in video coding. In the latest video coding standard, H.266/VVC, the DM (Direct Mode) and CCLM (Cross-Component Linear Model) modes have been introduced to reduce the similarity between luma and chroma, but inter-channel correlation is still observed. In this paper, a new inter-channel prediction algorithm is proposed which utilizes a coloring principle to predict chroma pixels. From the coloring perspective, for most natural content video frames, the three components Y, U, and V demonstrate similar coloring patterns; therefore, the U and V components can be predicted using the coloring pattern of the Y component. In the proposed algorithm, correlation coefficients describing the coloring relationship between the current pixel and reference pixels are obtained in a lightweight way in the Y component and then used to predict the chroma pixels. The optimal position for the reference samples is also designed, and based on the selected position, two new chroma prediction modes are defined. Experimental results show that, compared with VTM 12.1, the proposed algorithm achieves average BD-rate improvements of -0.92% and -0.96% for the U and V components under the All Intra (AI) configuration, while the increases in encoding and decoding time are negligible.
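The coloring idea can be illustrated on a few pixels. The sketch below is a toy illustration only, not the VTM-integrated algorithm: weights derived from luma (Y) similarity between the current pixel and reference pixels are reused to predict the chroma (U) value, and the weighting formula and sample values are invented for the example.

```python
def predict_chroma(y_cur, y_refs, u_refs, sigma=4.0):
    """Predict a chroma sample from reference chroma samples, weighted by
    how similar each reference's luma is to the current pixel's luma."""
    # Lightweight "coloring" weights: closer luma -> larger weight.
    weights = [1.0 / (1.0 + abs(y_cur - yr) / sigma) for yr in y_refs]
    total = sum(weights)
    return sum(w * u for w, u in zip(weights, u_refs)) / total

# References whose luma is close to the current pixel dominate the
# prediction; the dissimilar third reference contributes little.
y_refs = [100, 102, 180]
u_refs = [60, 62, 120]
u_pred = predict_chroma(101, y_refs, u_refs)
print(55 < u_pred < 75)  # True: prediction tracks the similar references
```

In the actual codec setting, the reference samples would come from the already-reconstructed neighborhood at the designed positions, and the same Y-derived coefficients would be applied to both the U and V components.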
Zhi LIU Fangyuan ZHAO Mengmeng ZHANG
In the video-text retrieval task, the mainstream framework consists of three parts: a video encoder, a text encoder, and similarity calculation. MMT (Multi-modal Transformer) achieves remarkable performance on this task; however, it suffers from insufficient training data. In this paper, an efficient multi-modal aggregation network for video-text retrieval is proposed. Different from prior work using MMT to fuse video features, NetVLAD is introduced in the proposed network; it has fewer parameters and is feasible to train with small datasets. In addition, since CLIP (Contrastive Language-Image Pre-training) can be considered to learn language models from visual supervision, it is introduced as the text encoder in the proposed network to avoid overfitting. Meanwhile, in order to make full use of the pre-trained model, a two-step training scheme is designed. Experiments show that the proposed model achieves competitive results compared with the latest work.
Koki TSUBOTA Hiroaki AKUTSU Kiyoharu AIZAWA
Image quality assessment (IQA) is a fundamental metric for image processing tasks (e.g., compression). Among full-reference IQAs, traditional metrics such as PSNR and SSIM have long been used; recently, IQAs based on deep neural networks (deep IQAs), such as LPIPS and DISTS, have also been adopted. It is known that image scaling is inconsistent among deep IQAs: some perform down-scaling as pre-processing, whereas others use the original image size. In this paper, we show that image scale is an influential factor that affects deep IQA performance. We comprehensively evaluate four deep IQAs on the same five datasets, and the experimental results show that image scale significantly influences IQA performance. We found that the most appropriate image scale is often neither the default nor the original size, and that the choice differs depending on the methods and datasets used. We also visualized the stability of the metrics and found that PieAPP is the most stable among the four deep IQAs.