The search functionality is under construction.
The search functionality is under construction.

IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.4

Advance publication (published online immediately after acceptance)

Volume E107-D No.10  (Publication Date:2024/10/01)

    Regular Section
  • A Two-Phase Algorithm for Reliable and Energy-Efficient Heterogeneous Embedded Systems Open Access

    Hongzhi XU  Binlian ZHANG  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2024/05/27
      Page(s):
    1285-1296

    Reliability is an important figure of merit of the system and it must be satisfied in safety-critical applications. This paper considers parallel applications on heterogeneous embedded systems and proposes a two-phase algorithm framework to minimize energy consumption for satisfying applications’ reliability requirement. The first phase is for initial assignment and the second phase is for either satisfying the reliability requirement or improving energy efficiency. Specifically, when the application’s reliability requirement cannot be achieved via the initial assignment, an algorithm for enhancing the reliability of tasks is designed to satisfy the application’s reliability requirement. Considering that the reliability of initial assignment may exceed the application’s reliability requirement, an algorithm for reducing the execution frequency of tasks is designed to improve energy efficiency. The proposed algorithms are compared with existing algorithms by using real parallel applications. Experimental results demonstrate that the proposed algorithms consume less energy while satisfying the application’s reliability requirements.

  • Evaluating Introduction of Systems by Goal Dependency Modeling Open Access

    Haruhiko KAIYA  Shinpei OGATA  Shinpei HAYASHI  

     
    PAPER-Software Engineering

      Pubricized:
    2024/06/11
      Page(s):
    1297-1311

    Before introducing systems to an activity in a business or in daily life, the effects of these systems should first be carefully examined by analysts. Thus, methods for examining such effects are required at the early stage of requirements analysis. In this study, we propose and evaluate an analysis method using a modeling notation for this purpose, called goal dependency modeling and analysis (GDMA). In an activity, an actor, such as a person or a system, expects a goal to be achieved. The actor or another actor will achieve this goal. We focus herein on such a goal and the two different roles played by the actors. In GDMA, the dependencies in the roles of the two actors about a goal are mainly represented. GDMA enables analysts to observe the change of actors, their expectations, and abilities by using metrics. Each metric is defined on the basis of the GDMA meta-model. Therefore, GDMA enables them to decide whether the change is good or bad both quantitatively and qualitatively for the people. We evaluate GDMA by describing models of the actual system introduction written in the literatures and explain the effects caused by this introduction. In addition, CASE tools are crucial in efficiently and accurately performing GDMA. Hence, we develop its tools by extending an existing UML modeling tool.

  • Reliable Image Matching Using Optimal Combination of Color and Intensity Information Based on Relationship with Surrounding Objects Open Access

    Rina TAGAMI  Hiroki KOBAYASHI  Shuichi AKIZUKI  Manabu HASHIMOTO  

     
    PAPER-Pattern Recognition

      Pubricized:
    2024/05/30
      Page(s):
    1312-1321

    Due to the revitalization of the semiconductor industry and efforts to reduce labor and unmanned operations in the retail and food manufacturing industries, objects to be recognized at production sites are increasingly diversified in color and design. Depending on the target objects, it may be more reliable to process only color information, while intensity information may be better, or a combination of color and intensity information may be better. However, there are not many conventional method for optimizing the color and intensity information to be used, and deep learning is too costly for production sites. In this paper, we optimize the combination of the color and intensity information of a small number of pixels used for matching in the framework of template matching, on the basis of the mutual relationship between the target object and surrounding objects. We propose a fast and reliable matching method using these few pixels. Pixels with a low pixel pattern frequency are selected from color and grayscale images of the target object, and pixels that are highly discriminative from surrounding objects are carefully selected from these pixels. The use of color and intensity information makes the method highly versatile for object design. The use of a small number of pixels that are not shared by the target and surrounding objects provides high robustness to the surrounding objects and enables fast matching. Experiments using real images have confirmed that when 14 pixels are used for matching, the processing time is 6.3 msec and the recognition success rate is 99.7%. The proposed method also showed better positional accuracy than the comparison method, and the optimized pixels had a higher recognition success rate than the non-optimized pixels.

  • Neural End-To-End Speech Translation Leveraged by ASR Posterior Distribution Open Access

    Yuka KO  Katsuhito SUDOH  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Pubricized:
    2024/05/24
      Page(s):
    1322-1331

    End-to-end speech translation (ST) directly renders source language speech to the target language without intermediate automatic speech recognition (ASR) output as in a cascade approach. End-to-end ST avoids error propagation from intermediate ASR results. Although recent attempts have applied multi-task learning using an auxiliary task of ASR to improve ST performance, they use cross-entropy loss to one-hot references in the ASR task, and the trained ST models do not consider possible ASR confusion. In this study, we propose a novel multi-task learning framework for end-to-end STs leveraged by ASR-based loss against posterior distributions obtained using a pre-trained ASR model called ASR posterior-based loss (ASR-PBL). The ASR-PBL method, which enables a ST model to reflect possible ASR confusion among competing hypotheses with similar pronunciations, can be applied to one of the strong multi-task ST baseline models with Hybrid CTC/Attention ASR task loss. In our experiments on the Fisher Spanish-to-English corpus, the proposed method demonstrated better BLEU results than the baseline that used standard CE loss.

  • Multi-Scale Contrastive Learning for Human Pose Estimation Open Access

    Wenxia BAO  An LIN  Hua HUANG  Xianjun YANG  Hemu CHEN  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2024/06/17
      Page(s):
    1332-1341

    Recent years have seen remarkable progress in human pose estimation. However, manual annotation of keypoints remains tedious and imprecise. To alleviate this problem, this paper proposes a novel method called Multi-Scale Contrastive Learning (MSCL). This method uses a siamese network structure with upper and lower branches that capture diffirent views of the same image. Each branch uses a backbone network to extract image representations, employing multi-scale feature vectors to capture information. These feature vectors are then passed through an enhanced feature pyramid for fusion, producing more robust feature representations. The feature vectors are then further encoded by mapping and prediction heads to predict the feature vector of another view. Using negative cosine similarity between vectors as a loss function, the backbone network is pre-trained on a large-scale unlabeled dataset, enhancing its capacity to extract visual representations. Finally, transfer learning is performed on a small amount of labelled data for the pose estimation task. Experiments on COCO datasets show significant improvements in Average Precision (AP) of 1.8%, 0.9%, and 1.2% with 1%, 5%, and 10% labelled data on COCO. In addition, the Percentage of Correct Keypoints (PCK) improves by 0.5% on MPII&AIC, outperforming mainstream contrastive learning methods.

  • MISpeller: Multimodal Information Enhancement for Chinese Spelling Correction Open Access

    Jiakai LI  Jianyong DUAN  Hao WANG  Li HE  Qing ZHANG  

     
    PAPER-Natural Language Processing

      Pubricized:
    2024/06/07
      Page(s):
    1342-1352

    Chinese spelling correction is a foundational task in natural language processing that aims to detect and correct spelling errors in text. Most spelling corrections in Chinese used multimodal information to model the relationship between incorrect and correct characters. However, feature information mismatch occured during fusion result from the different sources of features, causing the importance relationships between different modalities to be ignored, which in turn restricted the model from learning in an efficient manner. To this end, this paper proposes a multimodal language model-based Chinese spelling corrector, named as MISpeller. The method, based on ChineseBERT as the basic model, allows the comprehensive capture and fusion of character semantic information, phonetic information and graphic information in a single model without the need to construct additional neural networks, and realises the phenomenon of unequal fusion of multi-feature information. In addition, in order to solve the overcorrection issues, the replication mechanism is further introduced, and the replication factor is used as the dynamic weight to efficiently fuse the multimodal information. The model is able to control the proportion of original characters and predicted characters according to different input texts, and it can learn more specifically where errors occur. Experiments conducted on the SIGHAN benchmark show that the proposed model achieves the state-of-the-art performance of the F1 score at the correction level by an average of 4.36%, which validates the effectiveness of the model.

  • Integrating Event Elements for Chinese-Vietnamese Cross-Lingual Event Retrieval Open Access

    Yuxin HUANG  Yuanlin YANG  Enchang ZHU  Yin LIANG  Yantuan XIAN  

     
    PAPER-Natural Language Processing

      Pubricized:
    2024/06/04
      Page(s):
    1353-1361

    Chinese-Vietnamese cross-lingual event retrieval aims to retrieve the Vietnamese sentence describing the same event as a given Chinese query sentence from a set of Vietnamese sentences. Existing mainstream cross-lingual event retrieval methods rely on extracting textual representations from query texts and calculating their similarity with textual representations in other language candidate sets. However, these methods ignore the difference in event elements present during Chinese-Vietnamese cross-language retrieval. Consequently, sentences with similar meanings but different event elements may be incorrectly considered to describe the same event. To address this problem, we propose a cross-lingual retrieval method that integrates event elements. We introduce event elements as an additional supervisory signal, where we calculate the semantic similarity of event elements in two sentences using an attention mechanism to determine the attention score of the event elements. This allows us to establish a one-to-one correspondence between event elements in the text. Additionally, we leverage the multilingual pre-trained language model fine-tuned based on contrastive learning to obtain cross-language sentence representation to calculate the semantic similarity of the sentence texts. By combining these two approaches, we obtain the final text similarity score. Experimental results demonstrate that our proposed method achieves higher retrieval accuracy than the baseline model.

  • Smart Contract Timestamp Vulnerability Detection Based on Code Homogeneity Open Access

    Weizhi WANG  Lei XIA  Zhuo ZHANG  Xiankai MENG  

     
    LETTER-Software Engineering

      Pubricized:
    2024/05/27
      Page(s):
    1362-1366

    Smart contracts, as a form of digital protocol, are computer programs designed for the automatic execution, control, and recording of contractual terms. They permit transactions to be conducted without the need for an intermediary. However, the economic property of smart contracts makes their vulnerabilities susceptible to hacking attacks, leading to significant losses. In this paper, we introduce a smart contract timestamp vulnerability detection technique HomoDec based on code homogeneity. The core idea of this technique involves comparing the homogeneity between the code of the test smart contract and the existing smart contract vulnerability codes in the database to determine whether the tested code has a timestamp vulnerability. Specifically, HomoDec first explores how to vectorize smart contracts reasonably and efficiently, representing smart contract code as a high-dimensional vector containing features of code vulnerabilities. Subsequently, it investigates methods to determine the homogeneity between the test codes and the ones in vulnerability code base, enabling the detection of potential timestamp vulnerabilities in smart contract code.

  • Learning Fast Deployment for UAV-Assisted Disaster System Open Access

    Na XING  Lu LI  Ye ZHANG  Shiyi YANG  

     
    LETTER-Information Network

      Pubricized:
    2024/05/30
      Page(s):
    1367-1371

    Unmanned aerial vehicle (UAV)-assisted systems have attracted a lot of attention due to its high probability of line-of-sight (LoS) connections and flexible deployment. In this paper, we aim to minimize the upload time required for the UAV to collect information from the sensor nodes in disaster scenario, while optimizing the deployment position of UAV. In order to get the deployment solution quickly, a data-driven approach is proposed in which an optimization strategy acts as the expert. Considering that images could capture the spatial configurations well, we use a convolutional neural network (CNN) to learn how to place the UAV. In the end, the simulation results demonstrate the effectiveness and generalization of the proposed method. After training, our CNN can generate UAV configuration faster than the general optimization-based algorithm.

  • Differential-Neural Cryptanalysis on AES Open Access

    Liu ZHANG  Zilong WANG  Jinyu LU  

     
    LETTER-Information Network

      Pubricized:
    2024/06/20
      Page(s):
    1372-1375

    Based on the framework of a multi-stage key recovery attack for a large block cipher, 2 and 3-round differential-neural distinguishers were trained for AES using partial ciphertext bits. The study introduces the differential characteristics employed for the 2-round ciphertext pairs and explores the reasons behind the near 100% accuracy of the 2-round differential neural distinguisher. Utilizing the trained 2-round distinguisher, the 3-round subkey of AES is successfully recovered through a multi-stage key guessing. Additionally, a complexity analysis of the attack is provided, validating the effectiveness of the proposed method.

  • TDEM: Table Data Extraction Model Based on Cell Segmentation Open Access

    Zhe WANG  Zhe-Ming LU  Hao LUO  Yang-Ming ZHENG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2024/05/30
      Page(s):
    1376-1379

    To accurately extract tabular data, we propose a novel cell-based tabular data extraction model (TDEM). The key of TDEM is to utilize grayscale projection of row separation lines, coupled with table masks and column masks generated by the VGG-19 neural network, to segment each individual cell from the input image of the table. In this way, the text content of the table is extracted from a specific single cell, which greatly improves the accuracy of table recognition.

  • IAD-Net: Single-Image Dehazing Network Based on Image Attention Open Access

    Zheqing ZHANG  Hao ZHOU  Chuan LI  Weiwei JIANG  

     
    LETTER-Image Processing and Video Processing

      Pubricized:
    2024/06/20
      Page(s):
    1380-1384

    Single-image dehazing is a challenging task in computer vision research. Aiming at the limitations of traditional convolutional neural network representation capabilities and the high computational overhead of the self-attention mechanism in recent years, we proposed image attention and designed a single image dehazing network based on the image attention: IAD-Net. The proposed image attention is a plug-and-play module with the ability of global modeling. IAD-Net is a parallel network structure that combines the global modeling ability of image attention and the local modeling ability of convolution, so that the network can learn global and local features. The proposed network model has excellent feature learning ability and feature expression ability, has low computational overhead, and also improves the detail information of hazy images. Experiments verify the effectiveness of the image attention module and the competitiveness of IAD-Net with state-of-the-art methods.