IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Al(20498hit)

201-220hit(20498hit)

On a Spectral Lower Bound of Treewidth
Tatsuya GIMA Tesshu HANAKA Kohei NORO Hirotaka ONO Yota OTACHI

LETTER

Publicized:
2023/06/16
Vol:
E107-D No:3
Page(s):
328-330
In this letter, we present a new lower bound for the treewidth of a graph in terms of the second smallest eigenvalue of its Laplacian matrix. Our bound slightly improves the lower bound given by Chandran and Subramanian [Inf. Process. Lett., 87 (2003)].
Dynamic Attentive Convolution for Facial Beauty Prediction
Zhishu SUN Zilong XIAO Yuanlong YU Luojun LIN

LETTER-Image Recognition, Computer Vision

Publicized:
2023/11/07
Vol:
E107-D No:2
Page(s):
239-243
Facial Beauty Prediction (FBP) is a significant pattern recognition task that aims to assess facial attractiveness consistently with human perception. Currently, Convolutional Neural Networks (CNNs) are the mainstream method for FBP. Most conventional CNNs are trained to learn static convolution kernels, which makes it difficult for the network to capture global attentive information, so it often ignores key facial regions, e.g., the eyes and nose. To tackle this problem, we devise a new convolution method, Dynamic Attentive Convolution (DyAttenConv), which integrates dynamic and attention mechanisms into convolution at the kernel level, with the aim of adapting the convolution kernels to each face dynamically. DyAttenConv is a plug-and-play module that can be flexibly combined with existing CNN architectures, so that beauty-related features are acquired more globally and attentively. Extensive ablation studies show that our method is superior to other fusion and attention mechanisms, and a comparison with other state-of-the-art methods also demonstrates the effectiveness of DyAttenConv on the facial beauty prediction task.
A Data Augmentation Method for Fault Localization with Fault Propagation Context and VAE
Zhuo ZHANG Donghui LI Lei XIA Ya LI Xiankai MENG

LETTER-Software Engineering

Publicized:
2023/10/25
Vol:
E107-D No:2
Page(s):
234-238
With the growing complexity and scale of software, detecting and repairing errant behaviors at an early stage is critical to reducing the cost of software development. In practice, fault localization typically involves three steps: execution of input domain test cases, construction of model domain test vectors, and suspiciousness evaluation. The effectiveness of model domain test vectors is significant for locating the faulty code. However, test vectors with failing labels usually account for only a small portion, which inevitably degrades the effectiveness of fault localization. In this paper, we propose PVaug, a data augmentation method that uses fault propagation context and a variational autoencoder (VAE). Our empirical results on 14 programs show that PVaug improves the effectiveness of fault localization.
Understanding File System Operations of a Secure Container Runtime Using System Call Tracing Technique
Sunwoo JANG Young-Kyoon SUH Byungchul TAK

LETTER-Software System

Publicized:
2023/11/01
Vol:
E107-D No:2
Page(s):
229-233
This letter presents a technique that observes the system call mapping behavior of the proxy kernel layer of secure container runtimes. We applied it to the file system operations of a secure container runtime, gVisor. We found that gVisor's operations can be more expensive than native ones, issuing 48× more syscalls for open and 6× more for read and write.
Rotation-Invariant Convolution Networks with Hexagon-Based Kernels
Yiping TANG Kohei HATANO Eiji TAKIMOTO

PAPER-Biocybernetics, Neurocomputing

Publicized:
2023/11/15
Vol:
E107-D No:2
Page(s):
220-228
We introduce the Hexagonal Convolutional Neural Network (HCNN), a modified version of CNN that is robust against rotation. HCNN utilizes a hexagonal kernel and a multi-block structure that enjoys more degrees of rotation information sharing than standard convolution layers. Our structure is easy to use and does not affect the original structure of the network. We achieve complete rotational invariance on the recognition task of simple pattern images and, where robustness to rotation matters, demonstrate better performance than past methods on the recognition of rotated MNIST images, synthetic biomarker images, and microscopic cell images.
BRsyn-Caps: Chinese Text Classification Using Capsule Network Based on Bert and Dependency Syntax
Jie LUO Chengwan HE Hongwei LUO

PAPER-Natural Language Processing

Publicized:
2023/11/06
Vol:
E107-D No:2
Page(s):
212-219
Text classification is a fundamental task in natural language processing with extensive applications in various domains, such as spam detection and sentiment analysis. Syntactic information can be effectively utilized to improve the performance of neural network models in understanding the semantics of text. Chinese text exhibits a high degree of syntactic complexity, with individual words often possessing multiple parts of speech. In this paper, we propose BRsyn-caps, a capsule network-based Chinese text classification model that leverages both Bert and dependency syntax. Our proposed approach integrates semantic information through the Bert pre-trained model to obtain word representations, extracts contextual information through a long short-term memory (LSTM) neural network, encodes syntactic dependency trees through a graph attention neural network, and utilizes a capsule network to effectively integrate features for text classification. Additionally, we propose a character-level algorithm for constructing the adjacency matrix of a syntactic dependency tree, which can introduce syntactic information into character-level representations. Experiments on five datasets demonstrate that BRsyn-caps can effectively integrate semantic, sequential, and syntactic information in text, proving the effectiveness of our proposed method for Chinese text classification.
Content-Adaptive Optimization Framework for Universal Deep Image Compression
Koki TSUBOTA Kiyoharu AIZAWA

PAPER-Image Processing and Video Processing

Publicized:
2023/10/24
Vol:
E107-D No:2
Page(s):
201-211
While deep image compression performs better than traditional codecs like JPEG on natural images, it faces a challenge as a learning-based approach: compression performance drastically decreases for out-of-domain images. To investigate this problem, we introduce a novel task that we call universal deep image compression, which involves compressing images in arbitrary domains, such as natural images, line drawings, and comics. Furthermore, we propose a content-adaptive optimization framework to tackle this task. This framework adapts a pre-trained compression model to each target image during testing to address the domain gap between pre-training and testing. For each input image, we insert adapters into the decoder of the model and optimize the latent representation extracted by the encoder and the adapter parameters in terms of rate-distortion, with the adapter parameters transmitted per image. To evaluate the proposed universal deep image compression, we constructed a benchmark dataset containing uncompressed images of four domains: natural images, line drawings, comics, and vector arts. We compare our proposed method with non-adaptive and existing adaptive compression methods, and the results show that our method outperforms them. Our code and dataset are publicly available at https://github.com/kktsubota/universal-dic.
RR-Row: Redirect-on-Write Based Virtual Machine Disk for Record/Replay
Ying ZHAO Youquan XIAN Yongnan LI Peng LIU Dongcheng LI

PAPER-Data Engineering, Web Information Systems

Publicized:
2023/11/06
Vol:
E107-D No:2
Page(s):
169-179
Record/replay is an essential tool in clouds that provides capabilities such as fault tolerance, software debugging, and security analysis by recording the execution into a log and replaying it deterministically later on. However, in virtualized environments, the log file grows rapidly because a considerable amount of I/O data is saved, which introduces significant storage costs. To mitigate this problem, this paper proposes RR-Row, a redirect-on-write based virtual machine disk for record/replay scenarios. RR-Row appends written data to new blocks rather than overwriting the original blocks during normal execution, so that all written data are preserved on the disk. In this way, the record system saves only the block id instead of the full content, and the replay system can fetch the data directly from the disk rather than the log, thereby substantially reducing the log size. In addition, we propose several optimizations for improving I/O performance so that it is also suitable for normal execution. We implement RR-Row for QEMU and conduct a set of experiments. The results show that RR-Row reduces the log size by 68% compared to the currently used Raw/QCow2 disk without compromising I/O performance.
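The redirect-on-write idea in this abstract can be illustrated with a toy in-memory model. This is only a sketch under assumptions: the class and function names are hypothetical, the logged events are assumed to be disk reads, and nothing here reflects the actual QEMU implementation of RR-Row.

```python
# Illustrative sketch only: all names are hypothetical, not RR-Row's real code.

class RedirectOnWriteDisk:
    """Toy disk that appends every write to a new physical block."""

    def __init__(self):
        self.blocks = []   # append-only physical blocks; list index = block id
        self.mapping = {}  # logical block number -> current physical block id

    def write(self, logical, data):
        """Append data as a new block and redirect the logical block to it."""
        block_id = len(self.blocks)
        self.blocks.append(data)
        self.mapping[logical] = block_id
        return block_id

    def read(self, logical):
        """Read the current version of a logical block."""
        return self.blocks[self.mapping[logical]]

    def fetch(self, block_id):
        """Read a physical block by id; old versions are never overwritten."""
        return self.blocks[block_id]


def record_read(disk, log, logical):
    """Record phase (assumed): log only the block id the read resolved to."""
    block_id = disk.mapping[logical]
    log.append(block_id)
    return disk.blocks[block_id]


def replay_reads(disk, log):
    """Replay phase (assumed): recover the read data from the disk itself."""
    return [disk.fetch(block_id) for block_id in log]


disk, log = RedirectOnWriteDisk(), []
disk.write(0, b"v1")
record_read(disk, log, 0)
disk.write(0, b"v2")          # redirected to a new block; b"v1" survives
record_read(disk, log, 0)
replayed = replay_reads(disk, log)
```

In this toy model the log holds two small integers rather than two data payloads, which is the source of the log-size saving the abstract describes; the cost is that the disk retains every old block version.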
Invisible Digital Image by Thin-Film Interference of Niobium Oxide Using Its Periodic Repeatability Open Access
Shuichi MAEDA Akihiro FUKAMI Kaiki YAMAZAKI

INVITED PAPER

Publicized:
2023/08/22
Vol:
E107-C No:2
Page(s):
42-46
Information that is invisible to the human eye has several benefits. “Invisible” here means that the information can be visualized or quantified only with instruments. For example, such information can improve security without compromising product design. We have succeeded in making an invisible digital image on a metal substrate using the periodic repeatability of thin-film interference in niobium oxides. This digital information is invisible in the visible wavelength range of 400-800 nm but detectable in the infrared range of 800-1150 nm. This technology has the potential to be applied to anti-counterfeiting and traceability.
Electrically Controllable Light Scattering Properties of Nematic Liquid Crystal/Polyfluorene Gel Devices Open Access
Asuka YAGI Michinori HONMA Ryota ITO Toshiaki NOSE

INVITED PAPER

Publicized:
2023/08/10
Vol:
E107-C No:2
Page(s):
29-33
In recent years, demand has been increasing for smart windows with dimming and other functions, e.g., those based on polymer dispersed liquid crystals. Liquid crystal (LC) gels also have the potential for smart glass applications owing to their light-scattering properties. In this study, LC gels were prepared by mixing nematic LC (E7) with poly(9,9-di-n-octylfluorenyl-2,7-diyl) (PFO) as a gelator. The LC gel formed a dense PFO network as the concentration increased. The PFO network structure changed in response to the change in the cooling rate. A high contrast ratio of light scattering was obtained for the LC gel device that was fabricated via 2-wt% doping of PFO and natural cooling. Furthermore, the PFO concentration and cooling rate were found to affect the response time of the LC gel device.
Interdigital and Multi-Via Structures for Mushroom-Type Metasurface Reflectors
Taisei URAKAMI Tamami MARUYAMA Shimpei NISHIYAMA Manato KUSAMIZU Akira ONO Takahiro SHIOZAWA

PAPER-Antennas and Propagation

Vol:
E107-B No:2
Page(s):
309-320
Novel patch element shapes with interdigital and multi-via structures are proposed for mushroom-type metasurface reflectors to control the reflection phases. The interdigital structure provides a wide reflection phase range by changing the depth of the interdigital fingers. In addition, the multi-via structure provides higher positive reflection phases, such as near +180°. A sufficient reflection phase range of 360° and low polarization dependence were confirmed by electromagnetic field simulation. A metasurface reflector for a normally incident plane wave was designed, and the desired reflection angles and sharp far-field patterns of the reflected beams were confirmed in the simulation results. The prototype reflectors for the experiments should be designed in the same way as the primary reflector design of the reflector antenna. Specifically, a reflector design method based on ray tracing using the incident wave phase was proposed for the prototype. The experimental radiation pattern of the reflector antenna composed of the transmitting antenna (TX) and the prototype metasurface reflector was similar to the simulated radiation pattern. The effectiveness of the proposed structures and their design methods was confirmed by these simulation and experimental results.
An Adaptive Energy-Efficient Uneven Clustering Routing Protocol for WSNs
Mingyu LI Jihang YIN Yonggang XU Gang HUA Nian XU

PAPER-Network

Vol:
E107-B No:2
Page(s):
296-308
Aiming at the problem of “energy hole” caused by random distribution of nodes in large-scale wireless sensor networks (WSNs), this paper proposes an adaptive energy-efficient balanced uneven clustering routing protocol (AEBUC) for WSNs. The competition radius is adaptively adjusted based on the node density and the distance from candidate cluster head (CH) to base station (BS) to achieve scale-controlled adaptive optimal clustering; in candidate CHs, the energy relative density and candidate CH relative density are comprehensively considered to achieve dynamic CH selection. In the inter-cluster communication, based on the principle of energy balance, the relay communication cost function is established and combined with the minimum spanning tree method to realize the optimized inter-cluster multi-hop routing, forming an efficient communication routing tree. The experimental results show that the protocol effectively saves network energy, significantly extends network lifetime, and better solves the “energy hole” problem.
Development and Photoluminescence Properties of Dinuclear Eu(III)-β-Diketonates with a Branched Tetraphosphine Tetraoxide Ligand for Potential Use in LEDs as Red Phosphors Open Access
Hiroki IWANAGA Fumihiko AIGA Shin-ichi SASAOKA Takahiro WAZAKI

INVITED PAPER

Publicized:
2023/08/03
Vol:
E107-C No:2
Page(s):
34-41
In the field of micro-LED displays consisting of UV or blue LED arrays and phosphors, where the chips used are very small, the particle size of the phosphors must be small to suppress variation in hue between pixels. In particular, there is a strong demand for a red phosphor with a small particle size. However, the quantum yields of inorganic phosphors decrease as the particle size gets smaller. In contrast, the quantum yields of organic phosphors and complexes do not decrease as the particle size gets smaller, because each molecule absorbs and emits light. We focus on Eu(III) complexes as candidate red phosphors for micro-LED displays because the color purity of their photoluminescence spectra is high, and we have tried to enhance their photoluminescence intensity by coordinating non-ionic ligands, specifically newly designed phosphine oxide ligands. Non-ionic ligands generally have less influence on the properties of complexes than ionic ligands, but offer a high degree of flexibility in molecular design. We found a novel molecular design concept for phosphine oxide ligands that enhances the photoluminescence properties of Eu(III) complexes. In this work, novel dinuclear Eu(III)-β-diketonates with branched tetraphosphine tetraoxide ligands, TDPBPO and TDPPPO, were developed. The ligands are designed to have two different phosphine oxide portions: one has aromatic substituents and the other has none. The TDPBPO and TDPPPO ligands increase the absolute quantum yields of Eu(III)-β-diketonates. Eu(III)-β-diketonates with branched tetraphosphine tetraoxide ligands show sharp red emission and excellent quantum yields, and are promising candidates for micro-LED displays, security media, and sensing owing to their pure and strong photoluminescence.
Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention
Peng GAO Xin-Yue ZHANG Xiao-Li YANG Jian-Cheng NI Fei WANG

LETTER-Image Recognition, Computer Vision

Publicized:
2023/10/20
Vol:
E107-D No:1
Page(s):
161-164
Although Siamese trackers have attracted much attention in recent years owing to their scalability and efficiency, researchers have ignored the background appearance, which makes the trackers inapplicable to recognizing arbitrary target objects with various variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker in which shifted windows multi-head self-attention is used to learn the characteristics of a specific given target object for visual tracking. To validate the effectiveness of our proposed tracker, we use the Swin Transformer as the backbone network and introduce an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.
A CNN-Based Multi-Scale Pooling Strategy for Acoustic Scene Classification
Rong HUANG Yue XIE

LETTER-Speech and Hearing

Publicized:
2023/10/17
Vol:
E107-D No:1
Page(s):
153-156
Acoustic scene classification (ASC) is a fundamental artificial intelligence classification task. ASC commonly employs models based on convolutional neural networks (CNNs) that take log-Mel spectrograms as input for gathering acoustic features. In this paper, we designed a CNN-based multi-scale pooling (MSP) strategy for ASC. The log-Mel spectrogram used as the CNN input is partitioned into four segments along the frequency axis. Furthermore, we devised four CNN channels to acquire inputs from distinct frequency ranges. The high-level features extracted from outputs in various frequency bands are integrated through frequency pyramid average pooling layers at multiple levels. Subsequently, a softmax classifier is employed to classify different scenes. Tests on two acoustic datasets show that the designed model significantly improves performance.
Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection
Shinji UCHINOURA Takio KURITA

PAPER-Image Recognition, Computer Vision

Publicized:
2023/10/23
Vol:
E107-D No:1
Page(s):
115-124
We investigated the influence of horizontal shifts of the input images on one-stage object detection methods. We found that the object detector class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by grid-level shifts and used in training. The effectiveness of the proposed approaches is demonstrated using the COCO validation set after applying the proposed method to the FCOS architecture.
Efficient Action Spotting Using Saliency Feature Weighting
Yuzhi SHI Takayoshi YAMASHITA Tsubasa HIRAKAWA Hironobu FUJIYOSHI Mitsuru NAKAZAWA Yeongnam CHAE Björn STENGER

PAPER-Image Processing and Video Processing

Publicized:
2023/10/17
Vol:
E107-D No:1
Page(s):
105-114
Action spotting is a key component in high-level video understanding. The large number of similar frames poses a challenge for recognizing actions in videos. In this paper, we use frame saliency to represent the importance of frames and guide the model to focus on keyframes. We propose the frame saliency weighting module to improve frame saliency and video representation at the same time. Our proposed model contains two encoders, for pre-action and post-action time windows, to encode video context. We validate our design choices and the generality of the proposed method in extensive experiments. On the public SoccerNet-v2 dataset, the method achieves an average mAP of 57.3%, improving over the state of the art. Using embedding features obtained from multiple feature extractors, the average mAP further increases to 75%. We show that reducing the model size by over 90% does not significantly impact performance. Additionally, we use ablation studies to demonstrate the effectiveness of the saliency weighting module. Further, we show that our frame saliency weighting strategy is applicable to existing methods on more general action datasets, such as SoccerNet-v1, ActivityNet v1.3, and UCF101.
Research on Lightweight Acoustic Scene Perception Method Based on Drunkard Methodology
Wenkai LIU Lin ZHANG Menglong WU Xichang CAI Hongxia DONG

PAPER-Artificial Intelligence, Data Mining

Publicized:
2023/10/23
Vol:
E107-D No:1
Page(s):
83-92
The goal of Acoustic Scene Classification (ASC) is to simulate human analysis of the surrounding environment and make accurate decisions promptly. Extracting useful information from audio signals in real-world scenarios is challenging and can lead to suboptimal performance in acoustic scene classification, especially in environments with relatively homogeneous backgrounds. To address this problem, we model the sobering-up process of “drunkards” in real life and the guiding behavior of normal people, and construct a methodology for implementing high-precision lightweight models called the “drunkard methodology”. The core idea includes three parts: (1) designing a special feature transformation module based on the different mechanisms of information perception between drunkards and ordinary people, to simulate the process of gradually sobering up and the changes in feature perception ability; (2) studying a lightweight “drunken” model that matches the perceptual processing of the normal model. The model uses a multi-scale class residual block structure and can obtain finer feature representations by fusing information extracted at different scales; (3) introducing a module through which the conventional model guides and is fused with the “drunken” model, to speed up the sobering-up process and achieve iterative optimization and accuracy improvement. Evaluation results on the official dataset of DCASE2022 Task1 demonstrate that our baseline system achieves 40.4% accuracy and 2.284 loss under the condition of 442.67K parameters and 19.40M MAC (multiply-accumulate operations). After adopting the “drunkard” mechanism, the accuracy is improved to 45.2%, and the loss is reduced by 0.634 under the condition of 551.89K parameters and 23.6M MAC.
A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation
Gang LIU Xin CHEN Zhixiang GAO

PAPER-Artificial Intelligence, Data Mining

Publicized:
2023/09/28
Vol:
E107-D No:1
Page(s):
72-82
Photo animation transforms photos of real-world scenes into anime style images, which is a challenging task in AIGC (AI Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail for outputting coarse-grained anime style images and a main tail for refining them. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. To improve the visual quality of the generated anime style images, two novel loss functions suitable for photo animation are proposed: 1) the region smoothing loss function, which is used to weaken the texture details of the generated images to achieve anime effects with abstract details; 2) the fine-grained revision loss function, which is used to eliminate artifacts and noise in the generated anime style image while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight generator framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can be easily trained end-to-end with unpaired training data. Extensive experiments have been conducted to qualitatively and quantitatively demonstrate that our method can produce high-quality anime style images from real-world photos and perform better than the state-of-the-art models.
Node-to-Set Disjoint Paths Problem in Cross-Cubes
Rikuya SASAKI Hiroyuki ICHIDA Htoo Htoo Sandi KYAW Keiichi KANEKO

PAPER-Fundamentals of Information Systems

Publicized:
2023/10/06
Vol:
E107-D No:1
Page(s):
53-59
The increasing demand for high-performance computing in recent years has led to active research on massively parallel systems. The interconnection network in a massively parallel system interconnects hundreds of thousands of processing elements so that they can process large tasks while communicating with one another. By regarding the processing elements as nodes and the links between processing elements as edges, we can discuss various problems of interconnection networks in the framework of graph theory. Many topologies have been proposed for interconnection networks of massively parallel systems. The hypercube is a very popular topology and it has many variants. The cross-cube is one such topology, which can be obtained by adding one extra edge to each node of the hypercube. The cross-cube reduces the diameter of the hypercube, and allows cycles of odd lengths. Therefore, we focus on the cross-cube and propose an algorithm that constructs disjoint paths from a node to a set of nodes. We give a proof of correctness of the algorithm. Also, we show that the time complexity and the maximum path length of the algorithm are O(n^3 log n) and 2n - 3, respectively. Moreover, we estimate that the average execution time of the algorithm is O(n^2) based on a computer experiment.
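As background for the graph-theoretic framing above, the following minimal sketch models the plain n-dimensional hypercube, with nodes as n-bit integers and edges between labels that differ in exactly one bit, and measures its diameter by breadth-first search. The function names are illustrative. The extra edge that turns the hypercube into a cross-cube is not defined in this abstract, so it is deliberately left out, and the disjoint-paths algorithm itself is not reproduced.

```python
from collections import deque


def hypercube_neighbors(node, n):
    """Neighbors of a node in the n-dimensional hypercube: flip one bit."""
    return [node ^ (1 << i) for i in range(n)]


def eccentricity(source, n):
    """Largest BFS distance from source to any node of the n-cube."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in hypercube_neighbors(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


n = 4
# The diameter is the largest eccentricity over all 2**n nodes.
diameter = max(eccentricity(s, n) for s in range(2 ** n))
```

For the plain hypercube the measured diameter equals n, since two nodes are farthest apart when their labels differ in every bit. This is the baseline that the cross-cube is said to reduce.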
