IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Ti(30808hit)

1-20hit(30808hit)

Vision Transformer with Key-Select Routing Attention for Single Image Dehazing Open Access
Lihan TONG Weijia LI Qingxia YANG Liyuan CHEN Peng CHEN

LETTER-Image Recognition, Computer Vision

Publicized: 2024/07/01
Vol: E107-D No:11
Page(s): 1472-1475
We present Ksformer, utilizing Multi-scale Key-select Routing Attention (MKRA) for intelligent selection of key areas through multi-channel, multi-scale windows with a top-k operator, and Lightweight Frequency Processing Module (LFPM) to enhance high-frequency features, outperforming other dehazing methods in tests.
SH-YOLO: Small Target High Performance YOLO for Abnormal Behavior Detection in Escalator Scene Open Access
Shuoyan LIU Chao LI Yuxin LIU Yanqiu WANG

LETTER-Image Recognition, Computer Vision

Publicized: 2024/06/26
Vol: E107-D No:11
Page(s): 1468-1471
Escalators are indispensable facilities in public places. While they provide convenience to people, accidents on them can lead to serious consequences. YOLO can detect human behavior in real time. However, the model exhibits low accuracy and a high miss rate for small targets. To this end, this paper proposes the Small Target High Performance YOLO (SH-YOLO) model to detect abnormal behavior on escalators. The SH-YOLO model first enhances the backbone network through attention mechanisms. Subsequently, a small target detection layer is incorporated to enhance detection of key points for small objects. Finally, the Conv and SPPF modules are replaced with a Region Dynamic Perception Depth Separable Conv (DR-DP-Conv) and Atrous Spatial Pyramid Pooling (ASPP), respectively. The experimental results demonstrate that the proposed model is capable of accurately and robustly detecting anomalies in real-world escalator scenes.
Multimodal Speech Emotion Recognition Based on Large Language Model Open Access
Congcong FANG Yun JIN Guanlin CHEN Yunfan ZHANG Shidang LI Yong MA Yue XIE

LETTER-Speech and Hearing

Publicized: 2024/07/22
Vol: E107-D No:11
Page(s): 1463-1467
Currently, an increasing number of tasks in speech emotion recognition rely on the analysis of both speech and text features. However, there remains a paucity of research exploring the potential of leveraging large language models like GPT-3 to enhance emotion recognition. In this investigation, we harness the power of the GPT-3 model to extract semantic information from transcribed texts, generating text modal features with a dimensionality of 1536. Subsequently, we perform feature fusion, combining the 1536-dimensional text features with 1188-dimensional acoustic features to yield comprehensive multi-modal recognition outcomes. Our findings reveal that the proposed method achieves a weighted accuracy of 79.62% across the four emotion categories in IEMOCAP, underscoring the considerable enhancement in emotion recognition accuracy facilitated by integrating large language models.
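The fusion step described above can be sketched as follows. This is a minimal illustration that assumes the two feature vectors are fused by plain concatenation, which the abstract does not confirm; the function name `fuse_features` and the random placeholder features are hypothetical, and only the dimensionalities 1536 and 1188 come from the abstract.

```python
import random

TEXT_DIM = 1536      # text feature size stated in the abstract
ACOUSTIC_DIM = 1188  # acoustic feature size stated in the abstract


def fuse_features(text_feat, acoustic_feat):
    """Concatenate one utterance's text and acoustic vectors (assumed fusion rule)."""
    if len(text_feat) != TEXT_DIM or len(acoustic_feat) != ACOUSTIC_DIM:
        raise ValueError("unexpected feature dimensionality")
    return list(text_feat) + list(acoustic_feat)


# Placeholder features; real ones would come from a language model and an acoustic front end.
rng = random.Random(0)
text_feat = [rng.gauss(0.0, 1.0) for _ in range(TEXT_DIM)]
acoustic_feat = [rng.gauss(0.0, 1.0) for _ in range(ACOUSTIC_DIM)]

fused = fuse_features(text_feat, acoustic_feat)
print(len(fused))  # 2724 = 1536 + 1188
```

Under this assumption, the fused 2724-dimensional vector would be the input to whatever classifier predicts the four emotion categories.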
Loss Function for Deep Learning to Model Dynamical Systems Open Access
Takahito YOSHIDA Takaharu YAGUCHI Takashi MATSUBARA

LETTER-Artificial Intelligence, Data Mining

Publicized: 2024/07/22
Vol: E107-D No:11
Page(s): 1458-1462
Accurately simulating physical systems is essential in various fields. In recent years, deep learning has been used to automatically build models of such systems by learning from data. One such method is the neural ordinary differential equation (neural ODE), which treats the output of a neural network as the time derivative of the system states. However, while this and related methods have shown promise, their training strategies still require further development. Inspired by error analysis techniques in numerical analysis, with numerical errors replaced by modeling errors, we propose an error-analytic strategy to address this issue. This strategy can capture long-term errors and thus improve the accuracy of long-term predictions.
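To make the neural ODE idea concrete, the sketch below treats a function's output as the time derivative of the state and integrates it forward. It is only an illustration: `learned_derivative` is a hypothetical stand-in for a trained network (here the known system dx/dt = -x), forward Euler is an assumed integrator, and the error-analytic loss proposed in the paper is not reproduced.

```python
import math


def learned_derivative(state):
    # Hypothetical stand-in for a trained network f_theta(x); here dx/dt = -x.
    return [-x for x in state]


def rollout(state, dt, steps):
    """Forward-Euler integration of dx/dt = learned_derivative(x)."""
    trajectory = [list(state)]
    for _ in range(steps):
        deriv = learned_derivative(state)
        state = [x + dt * d for x, d in zip(state, deriv)]
        trajectory.append(list(state))
    return trajectory


# Integrate from x(0) = 1 up to t = 1 and compare with the exact solution exp(-t).
traj = rollout([1.0], dt=0.01, steps=100)
long_term_error = abs(traj[-1][0] - math.exp(-1.0))
print(long_term_error)
```

The gap between the rolled-out state and the reference at the final time is the kind of long-term error that a training strategy for such models would need to keep small.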
CLEAR & RETURN: Stopping Run-Time Countermeasures in Cryptographic Primitives Open Access
Myung-Hyun KIM Seungkwang LEE

LETTER-Information Network

Publicized: 2024/06/26
Vol: E107-D No:11
Page(s): 1449-1452
White-box cryptographic implementations often use masking and shuffling as countermeasures against key extraction attacks. To counter these defenses, higher-order Differential Computation Analysis (HO-DCA) and its variants have been developed. These methods aim to breach these countermeasures without needing reverse engineering. However, these non-invasive attacks are expensive and can be thwarted by updating the masking and shuffling techniques. This paper introduces a simple binary injection attack, aptly named clear & return, designed to bypass advanced masking and shuffling defenses employed in white-box cryptography. The attack involves injecting a small amount of assembly code, which effectively disables run-time random sources. This loss of randomness exposes the unprotected lookup value within white-box implementations, making them vulnerable to simple statistical analysis. In experiments targeting open-source white-box cryptographic implementations, the attack strategy of hijacking entries in the Global Offset Table (GOT) or function calls shows effectiveness in circumventing run-time countermeasures.
Measuring Mental Workload of Software Developers Based on Nasal Skin Temperature Open Access
Keitaro NAKASAI Shin KOMEDA Masateru TSUNODA Masayuki KASHIMA

LETTER-Software Engineering

Publicized: 2024/07/11
Vol: E107-D No:11
Page(s): 1444-1448
To automatically measure the mental workload of developers, existing studies have used biometric measures such as brain waves and heart rate. However, developers are often required to wear certain devices during measurement, which can be a physical burden. In this study, we evaluated the feasibility of non-contact biometric measures based on the nasal skin temperature (NST). In the experiment, the proposed biometric measures were more accurate than non-biometric measures.
Ontology Matching and Repair Based on Semantic Association and Probabilistic Logic Open Access
Nan WU Xiaocong LAI Mei CHEN Ying PAN

PAPER-Natural Language Processing

Publicized: 2024/07/11
Vol: E107-D No:11
Page(s): 1433-1443
With the development of the Semantic Web, an increasing number of researchers are using ontology technology to construct domain ontologies. Since there is no unified construction standard, ontology heterogeneity occurs. Ontology matching can fuse heterogeneous ontologies, which realizes interoperability between knowledge and links concepts to more relevant semantic information. When ontologies differ, how to reduce false matches and unsuccessful matches is a critical problem. Moreover, as the number of ontologies increases, the semantic relationships between ontologies become increasingly complex, and current methods that rely solely on name similarity between concepts are no longer sufficient. Consequently, this paper proposes an ontology matching method based on semantic association. Accurate matching pairs are discovered from existing semantic knowledge, and then the potential semantic associations between concepts are mined according to the characteristics of the contextual structure. The matching method can thus carry out matching based on reliable knowledge. In addition, this paper introduces a probabilistic logic repair method, which can detect and repair conflicts in matching results, to enhance the availability and reliability of matching results. The experimental results show that the proposed method effectively improves the quality of matching between ontologies and saves time on repairing incorrect matching pairs. Besides, compared with existing ontology matching systems, the proposed method has better stability.
Multi-Focus Image Fusion Algorithm Based on Multi-Task Learning and PS-ViT Open Access
Qinghua WU Weitong LI

PAPER-Image Recognition, Computer Vision

Publicized: 2024/07/11
Vol: E107-D No:11
Page(s): 1422-1432
Multi-focus image fusion involves combining partially focused images of the same scene to create an all-in-focus image. Existing multi-focus image fusion algorithms face two problems: the benchmark image is difficult to obtain, and the convolutional neural network focuses too much on the local region. To address them, a fusion algorithm that combines local and global feature encoding is proposed. Initially, we devise two self-supervised image reconstruction tasks and train an encoder-decoder network through multi-task learning. Subsequently, within the encoder, we merge the dense connection module with the PS-ViT module, enabling the network to utilize local and global information during feature extraction. Finally, to enhance the overall efficiency of the model, distinct loss functions are applied to each task. To preserve the more robust features from the original images, spatial frequency is employed during the fusion stage to obtain the feature map of the fused image. Experimental results demonstrate that, in comparison to twelve other prominent algorithms, our method exhibits good fusion performance in objective evaluation. Ten of the selected twelve evaluation metrics show an improvement of more than 0.28%. Additionally, it presents superior visual effects subjectively.
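As an illustration of the spatial-frequency criterion mentioned above, the sketch below uses the common definition SF = sqrt(RF^2 + CF^2), where RF and CF are the root-mean-square horizontal and vertical first differences. How the paper applies it to encoder feature maps is not stated in the abstract, so the patch-level "keep the sharper one" rule and the toy patches are assumptions.

```python
import math


def spatial_frequency(patch):
    """Spatial frequency of a 2-D patch: sqrt(RF^2 + CF^2)."""
    rows, cols = len(patch), len(patch[0])
    # Sum of squared horizontal differences (row frequency term).
    row_sq = sum((patch[r][c] - patch[r][c - 1]) ** 2
                 for r in range(rows) for c in range(1, cols))
    # Sum of squared vertical differences (column frequency term).
    col_sq = sum((patch[r][c] - patch[r - 1][c]) ** 2
                 for r in range(1, rows) for c in range(cols))
    return math.sqrt((row_sq + col_sq) / (rows * cols))


# Toy patches: a high-contrast checkerboard (in focus) and a flat one (defocused).
sharp = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
blurred = [[0.5] * 4 for _ in range(4)]

# Assumed fusion rule: keep the patch with the larger spatial frequency.
chosen = sharp if spatial_frequency(sharp) >= spatial_frequency(blurred) else blurred
print(spatial_frequency(sharp), spatial_frequency(blurred))
```

A larger spatial frequency indicates more local detail, which is why it is a natural focus measure for choosing between source regions.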
Runtime Tests for Memory Error Handlers of In-Memory Key Value Stores Using MemFI Open Access
Naoya NEZU Hiroshi YAMADA

PAPER-Software System

Publicized: 2024/07/11
Vol: E107-D No:11
Page(s): 1408-1421
Modern memory devices such as DRAM are prone to errors that occur because of unintended bit flips during their operation. Since memory errors severely impact in-memory key-value stores (KVSes), software mechanisms for hardening them against memory errors are being explored. However, it is hard to efficiently test the memory error handling code due to its characteristics: the code is event-driven, the handlers depend on the memory object, and in-memory KVSes manage various objects in a huge memory space. This paper presents MemFI, which supports runtime tests for the memory error handlers of in-memory KVSes. Our approach performs software fault injection of memory errors at the memory object level to trigger the target handler while smoothly carrying out tests on the same running state. To show the effectiveness of MemFI, we integrate error handling mechanisms into two real-world in-memory KVSes, memcached 1.6.9 and Redis 6.2.7, and check their behavior using the MemFI prototypes. The results show that the MemFI-based runtime test allows us to check the behavior of the error handling mechanisms. We also show its efficiency by comparing it to other fault injection approaches based on a trial model.
Aggregated to Pipelined Structure Based Streaming SSN for 1-ms Superpixel Segmentation System in Factory Automation Open Access
Yuan LI Tingting HU Ryuji FUCHIKAMI Takeshi IKENAGA

PAPER-Computer System

Publicized: 2024/07/23
Vol: E107-D No:11
Page(s): 1396-1407
1 millisecond (1-ms) vision systems are gaining increasing attention in diverse fields like factory automation and robotics, as the ultra-low delay ensures seamless and timely responses. Superpixel segmentation is a pivotal preprocessing step that reduces the number of image primitives for subsequent processing. Recently, there has been a growing emphasis on leveraging deep network-based algorithms to pursue superior performance and better integration into other deep network tasks. Superpixel Sampling Network (SSN) employs a deep network for feature generation and differentiable SLIC for superpixel generation. SSN achieves high performance with a small number of parameters. However, implementing SSN on FPGAs for ultra-low delay faces challenges due to the final layer’s aggregation of intermediate results. To address this limitation, this paper proposes an aggregated to pipelined structure for FPGA implementation. The final layer is decomposed into individual final layers for each intermediate result. This architectural adjustment eliminates the need for memory to store intermediate results. Concurrently, the proposed structure leverages the decomposed layers to facilitate a pipelined structure with pixel streaming input to achieve ultra-low latency. To work with the pipelined structure, a layer-partitioned memory architecture is proposed. Each final layer has dedicated memory for storing superpixel center information, allowing values to be read and calculated from memory without conflicts. Calculation results of each final layer are accumulated, and the result of each pixel is obtained as the stream reaches the last layer. Evaluation results demonstrate that boundary recall and under-segmentation error remain comparable to SSN, with an average label consistency improvement of 0.035 over SSN. From a hardware performance perspective, the proposed system processes 1000 FPS images with a delay of 0.947 ms/frame.
BiConvNet: Integrating Spatial Details and Deep Semantic Features in a Bilateral-Branch Image Segmentation Network Open Access
Zhigang WU Yaohui ZHU

PAPER-Fundamentals of Information Systems

Publicized: 2024/07/16
Vol: E107-D No:11
Page(s): 1385-1395
This article focuses on improving the BiSeNet v2 bilateral-branch image segmentation network structure, enhancing its learning ability for spatial details and its overall image segmentation accuracy. A modified network called “BiConvNet” is proposed. Firstly, to extract shallow spatial details more effectively, a parallel concatenated strip and dilated (PCSD) convolution module is proposed and used to extract local features and surrounding contextual features in the detail branch. Next, the semantic branch is reconstructed using the lightweight capability of depth separable convolution and the high performance of ConvNet, in order to enable more efficient learning of deep advanced semantic features. Finally, fine-tuning is performed on the bilateral guidance aggregation layer of BiSeNet v2, enabling better fusion of the feature maps output by the detail branch and semantic branch. The experimental part discusses the contribution of strip convolution and different sizes of dilated convolution to image segmentation accuracy, and compares them with common convolutions such as Conv2d convolution, CG convolution and CCA convolution. The experiments show that the PCSD convolution module proposed in this paper has the highest segmentation accuracy in all categories of the Cityscapes dataset compared with common convolutions. BiConvNet achieved a 9.39% accuracy improvement over the BiSeNet v2 network, with only a slight increase of 1.18M in model parameters. An mIoU of 68.75% was achieved on the validation set. Furthermore, through comparative experiments with commonly used autonomous driving image segmentation algorithms in recent years, BiConvNet demonstrates strong competitive advantages in segmentation accuracy on the Cityscapes and BDD100K datasets.
Development of Microwave-Based Renal Denervation Catheter for Clinical Application Open Access
Shohei MATSUHARA Kazuyuki SAITO Tomoyuki TAJIMA Aditya RAKHMADI Yoshiki WATANABE Nobuyoshi TAKESHITA

PAPER-Microwaves, Millimeter-Waves

Publicized: 2024/05/20
Vol: E107-C No:11
Page(s): 506-516
Renal Denervation (RDN) has been developed as a potential treatment for hypertension that is resistant to traditional antihypertensive medication. This technique involves the ablation of nerve fibers around the renal artery from inside the blood vessel, which is intended to suppress sympathetic nerve activity and result in an antihypertensive effect. Currently, clinical investigation is underway to evaluate the effectiveness of RDN in treating treatment-resistant hypertension. Although radio frequency (RF) ablation catheters are commonly used, their heating capacity is limited. Microwave catheters are being considered as another option for RDN. We aim to solve the technical challenges of applying microwave catheters to RDN. In this paper, we designed a catheter with a helix structure and a microwave (2.45 GHz) antenna. The antenna is a coaxial slot antenna, the dimensions of which were determined by optimizing the reflection coefficient through simulation. The measured catheter reflection coefficient is -23.6 dB using egg white and -32 dB in the renal artery. The prototype catheter was evaluated by in vitro experiments to validate the simulation. The procedure was performed successfully in in vivo experiments involving the ablation of porcine renal arteries. The pathological evaluation confirmed that a large area of the perivascular tissue was ablated (> 5 mm) in a single quadrant without significant damage to the renal artery. Our proposed device allows for control of the ablation position and produces deep nerve ablation without overheating the intima or surrounding blood, suggesting a highly capable new denervation catheter.
Heart Rate Control System for Walking with Real-Time Heart Rate Prediction Open Access
Kaiji OWAKI Yusuke KANDA Hideaki KIMURA

BRIEF PAPER

Publicized: 2024/04/23
Vol: E107-C No:11
Page(s): 501-505
In recent years, the declining birthrate and aging population have become serious problems in Japan. To solve these problems, we have developed a system based on edge AI. This system predicts the future heart rate during walking in real time and provides feedback to improve the quality of exercise and extend healthy life expectancy. In this paper, we predicted the heart rate in real time based on the proposed system and provided feedback. Experiments were conducted without and with the predicted heart rate, and a comparison was made to demonstrate the effectiveness of the predicted heart rate.
A Simple Augmentation Method Using Cutout for Ground Penetrating Radar Image in Deep Learning Open Access
Jun SONODA Kazusa NAKAMICHI

BRIEF PAPER

Publicized: 2024/04/26
Vol: E107-C No:11
Page(s): 497-500
Ground penetrating radar (GPR) has the advantage of non-destructively and quickly inspecting internal structures such as voids and buried pipes under roads. However, the internal structures must be estimated from the GPR images. Recently, recognition and detection methods for GPR images using deep learning have been studied. This paper examines a cutout-based data augmentation method aimed at accurate deep-learning-based estimation from GPR images. We find that cutout augmentation yields higher detection rates for all objects used in this study than the commonly used horizontal shift augmentation.
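For readers unfamiliar with cutout, the sketch below masks one randomly placed square of an image with a constant value. It is a simplified illustration rather than the authors' pipeline: the function name `cutout`, the 8x8 toy image, the patch size, and the choice to keep the patch fully inside the image are all assumptions.

```python
import random


def cutout(image, size, rng, fill=0.0):
    """Return a copy of a 2-D image with one size x size square set to `fill`."""
    height, width = len(image), len(image[0])
    # Pick the top-left corner so the square stays fully inside the image.
    top = rng.randrange(0, height - size + 1)
    left = rng.randrange(0, width - size + 1)
    out = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = fill
    return out


rng = random.Random(0)
image = [[1.0] * 8 for _ in range(8)]  # toy stand-in for a GPR image
augmented = cutout(image, size=3, rng=rng)
masked = sum(row.count(0.0) for row in augmented)
print(masked)  # 9 pixels are masked (3 x 3)
```

Each call produces a differently occluded copy of the same image, which is what lets the method enlarge a small training set.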
Numerical Dispersion Analysis of the One-Dimensional Iterated Crank-Nicolson-Based FDTD Method Open Access
Akira KAWAHARA Jun SHIBAYAMA Kazuhiro FUJITA Junji YAMAUCHI Hisamatsu NAKANO

BRIEF PAPER

Publicized: 2024/03/01
Vol: E107-C No:11
Page(s): 494-496
The numerical dispersion property of the finite-difference time-domain (FDTD) method based on the iterated Crank-Nicolson (ICN) scheme is investigated. The numerical dispersion relation is newly derived from the amplification matrix, and its property is discussed with attention to the eigenvalues of the matrix. It is shown that the ICN-FDTD method is conditionally stable but slightly dissipative.
Fundamental Investigation of the Transient Analysis Technique for Multilayered Dispersive Media by FILT Combined with Continued Fraction Expanded Method Open Access
Kensei ITAYA Ryosuke OZAKI Tsuneki YAMASAKI

BRIEF PAPER

Publicized: 2024/03/08
Vol: E107-C No:11
Page(s): 490-493
In this paper, we propose a transient analysis technique for multilayered dispersive media that combines the fast inversion Laplace transform (FILT) with the continued fraction expanded method. Numerical results are given for the reflection response, inside-time response waveforms, and electric field distributions of the reflection component. Further, we verify the calculation accuracy of the FILT method for the two types using a convergence test.
Transient Analysis of Electromagnetic Scattering from Large-Scale Objects Using Physical Optics with Fast Inverse Laplace Transform Open Access
Seiya KISHIMOTO Ryoya OGINO Kenta ARASE Shinichiro OHNUKI

BRIEF PAPER

Publicized: 2024/02/29
Vol: E107-C No:11
Page(s): 486-489
This paper introduces a computational approach for transient analysis of extensive scattering problems. This novel method is based on the combination of physical optics (PO) and the fast inverse Laplace transform (FILT). PO is a technique for analyzing electromagnetic scattering from large-scale objects. We modify PO for application in the complex frequency domain, where the scattered fields are evaluated. The complex frequency function is efficiently transformed into the time domain using FILT. The effectiveness of this combination is demonstrated through large-scale analysis and transient response for a short pulse incidence. The accuracy is investigated and validated by comparison with reference solutions.
Single-Layer Circular Polarizer for Linear Polarized Horn Antenna Open Access
Ryo KUMAGAI Ryosuke SUGA Tomoki UWANO

PAPER

Publicized: 2024/04/26
Vol: E107-C No:11
Page(s): 479-485
In this paper, a single-layer circular polarizer for a linearly polarized horn antenna is proposed. The multiple reflected waves between the aperture and the array provide the desired phase differences between the vertical and horizontal polarizations. The measured gain of the fabricated antenna is 14.4 dBic. The half-power beamwidths of the vertical polarization are 28 and 24 degrees, and those of the horizontal polarization are 31 and 23 degrees, in the vertical and horizontal planes, respectively. The polarizer has a low impact on the gain and beamwidth of the primary horn antenna: their changes are within 1.7 dB and 10 degrees. The 3 dB fractional bandwidth of the axial ratio is measured to be 1.4%.
Convergence Characteristics of Domain Decomposition Method for Full-Wave Electromagnetic Analysis Open Access
Toshio MURAYAMA Amane TAKEI

PAPER

Publicized: 2024/07/23
Vol: E107-C No:11
Page(s): 465-471
The domain decomposition method (DDM) is widely used for analyzing large-scale electromagnetic problems. The method decomposes the target model into small independent subdomains. Electromagnetic analysis inherently suffers from slow convergence when solved with iterative algorithms such as Krylov subspace algorithms. The DDM remedies this issue by decomposing the total system into subdomain problems and gathering the local results into an interface problem, which is solved to obtain the total solution. In this paper, we report the convergence properties of the domain decomposition method while varying the size and shape of the local domains on several mesh sizes. As the experimental results show, the convergence speed depends on the number of interface problem variables and the selection of the local region shapes. In addition, the convergence property differs according to the target frequency. In general, it is demonstrated that convergence can be accelerated with large cubic subdomains. We propose subdomain selection strategies based on an analysis of the condition numbers of the governing equation.
Finite Element Beam Propagation Method Based on Coordinate Transformation for Optimal Design of Optical Waveguide Devices Open Access
Haonan CHEN Akito IGUCHI Yasuhide TSUJI

PAPER

Publicized: 2024/02/05
Vol: E107-C No:11
Page(s): 457-464
To analyze photonic devices whose waveguide structure varies slowly along the propagation direction, we develop a finite element beam propagation method (FE-BPM) with coordinate transformation. In this approach, a longitudinally varying waveguide is converted into an equivalent straight waveguide, so that cumbersome processes in FE-BPM, such as mesh updating and field interpolation at each propagation step, can be avoided. We employ this simulation technique in the shape optimization of photonic devices and show design examples of a mode converter. To show the validity of this approach, the calculated results for the designed devices are compared with those of the finite element method (FEM) or the standard FE-BPM.
