Keyword Search Result

[Keyword] EE(4053hit)

21-40hit(4053hit)

  • Variable-Length Orthogonal Codes over Finite Fields Realizing Data Multiplexing and Error Correction Coding Simultaneously

    Shoichiro YAMASAKI  Tomoko K. MATSUSHIMA  Kyohei ONO  Hirokazu TANAKA  

     
    PAPER-Coding Theory and Techniques

    Publicized: 2023/09/26  Vol: E107-A No:3  Page(s): 373-383

    The present study proposes a scheme in which variable-length orthogonal codes, generated by combining inverse discrete Fourier transform matrices over a finite field, multiplex user data into a single sequence that forms one or more codewords for Reed-Solomon coding. The proposed scheme realizes data multiplexing, error correction coding, and multi-rate transmission at the same time. This study also presents a design example of the proposed scheme and its performance analysis.

  • Power Analysis of Floating-Point Operations for Leakage Resistance Evaluation of Neural Network Model Parameters

    Hanae NOZAKI  Kazukuni KOBARA  

     
    PAPER

    Publicized: 2023/09/25  Vol: E107-A No:3  Page(s): 331-343

    In the field of machine learning security, as one of the attack surfaces especially for edge devices, the application of side-channel analysis such as correlation power/electromagnetic analysis (CPA/CEMA) is expanding. Aiming to evaluate the leakage resistance of neural network (NN) model parameters, i.e. weights and biases, we conducted a feasibility study of CPA/CEMA on floating-point (FP) operations, which are the basic operations of NNs. This paper proposes approaches to recover weights and biases using CPA/CEMA on multiplication and addition operations, respectively. It is essential to take into account the characteristics of the IEEE 754 representation in order to realize the recovery with high precision and efficiency. We show that CPA/CEMA on FP operations requires different approaches than traditional CPA/CEMA on cryptographic implementations such as the AES.
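    The abstract above stresses the IEEE 754 representation. As a minimal sketch of what that representation looks like, the Python below splits a single-precision value into its sign, biased exponent, and fraction fields and counts the set bits of the encoding. The function names are hypothetical, and the Hamming-weight count is only an assumed stand-in for a leakage model; the abstract does not say which model the authors use.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 binary32 encoding of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def float32_fields(x: float) -> tuple:
    """Split x (rounded to binary32) into (sign, biased exponent, fraction)."""
    bits = float32_bits(x)
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits, implicit leading 1 not stored
    return sign, exponent, fraction


def hamming_weight(x: float) -> int:
    """Count set bits in the binary32 encoding (assumed leakage model, not the paper's)."""
    return bin(float32_bits(x)).count("1")


if __name__ == "__main__":
    print(float32_fields(-0.15625))   # -1.25 * 2**-3
    print(hamming_weight(1.0))
```

    Because the sign, exponent, and fraction occupy separate bit ranges, an analysis that targets one field at a time is a natural way to organize such a sketch; whether the paper proceeds this way is not stated in the abstract.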

  • Simultaneous Adaptation of Acoustic and Language Models for Emotional Speech Recognition Using Tweet Data

    Tetsuo KOSAKA  Kazuya SAEKI  Yoshitaka AIZAWA  Masaharu KATO  Takashi NOSE  

     
    PAPER

    Publicized: 2023/12/05  Vol: E107-D No:3  Page(s): 363-373

    Emotional speech recognition is generally considered more difficult than non-emotional speech recognition. The acoustic characteristics of emotional speech differ from those of non-emotional speech. Additionally, acoustic characteristics vary significantly depending on the type and intensity of emotions. Regarding linguistic features, emotional and colloquial expressions are also observed in their utterances. To solve these problems, we aim to improve recognition performance by adapting acoustic and language models to emotional speech. We used Japanese Twitter-based Emotional Speech (JTES) as an emotional speech corpus. This corpus consisted of tweets and had an emotional label assigned to each utterance. Corpus adaptation is possible using the utterances contained in this corpus. However, regarding the language model, the amount of adaptation data is insufficient. To solve this problem, we propose an adaptation of the language model by using online tweet data downloaded from the internet. The sentences used for adaptation were extracted from the tweet data based on certain rules. We extracted the data of 25.86 M words and used them for adaptation. In the recognition experiments, the baseline word error rate was 36.11%, whereas that with the acoustic and language model adaptation was 17.77%. The results demonstrated the effectiveness of the proposed method.
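    The results above are reported as word error rates. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition (word-level edit distance divided by the reference length). It splits on whitespace for simplicity, which is an assumption; Japanese text would first need word segmentation, and the authors' scoring tool is not named in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("a b c d", "a x c"))
```

    Under this definition, the reported drop from 36.11% to 17.77% corresponds to a relative reduction of roughly 50.8%.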

  • An Intra- and Inter-Emotion Transformer-Based Fusion Model with Homogeneous and Diverse Constraints Using Multi-Emotional Audiovisual Features for Depression Detection

    Shiyu TENG  Jiaqing LIU  Yue HUANG  Shurong CHAI  Tomoko TATEYAMA  Xinyin HUANG  Lanfen LIN  Yen-Wei CHEN  

     
    PAPER

    Publicized: 2023/12/15  Vol: E107-D No:3  Page(s): 342-353

    Depression is a prevalent mental disorder affecting a significant portion of the global population, leading to considerable disability and contributing to the overall burden of disease. Consequently, designing efficient and robust automated methods for depression detection has become imperative. Recently, deep learning methods, especially multimodal fusion methods, have been increasingly used in computer-aided depression detection. Importantly, individuals with depression and those without respond differently to various emotional stimuli, providing valuable information for detecting depression. Building on these observations, we propose an intra- and inter-emotional stimulus transformer-based fusion model to effectively extract depression-related features. The intra-emotional stimulus fusion framework aims to prioritize different modalities, capitalizing on their diversity and complementarity for depression detection. The inter-emotional stimulus model maps each emotional stimulus onto both invariant and specific subspaces using individual invariant and specific encoders. The emotional stimulus-invariant subspace facilitates efficient information sharing and integration across different emotional stimulus categories, while the emotional stimulus specific subspace seeks to enhance diversity and capture the distinct characteristics of individual emotional stimulus categories. Our proposed intra- and inter-emotional stimulus fusion model effectively integrates multimodal data under various emotional stimulus categories, providing a comprehensive representation that allows accurate task predictions in the context of depression detection. We evaluate the proposed model on the Chinese Soochow University students dataset, and the results outperform state-of-the-art models in terms of concordance correlation coefficient (CCC), root mean squared error (RMSE) and accuracy.

  • The Influence of Future Perspective on Job Satisfaction and Turnover Intention of Software Engineers

    Ikuto YAMAGATA  Masateru TSUNODA  Keitaro NAKASAI  

     
    LETTER

    Publicized: 2023/12/08  Vol: E107-D No:3  Page(s): 268-272

    Software development companies must consider employees' job satisfaction and turnover intentions. To explain the related factors, this study focused on the future perspective index (FPI). The FPI was assumed to relate positively to satisfaction and negatively to turnover. In the analysis, we compared the FPI with existing factors that are considered to be related to job satisfaction. We discovered that the FPI was promising for enhancing explanatory power, particularly when analyzing satisfaction.

  • A Combined Alignment Model for Code Search

    Juntong HONG  Eunjong CHOI  Osamu MIZUNO  

     
    PAPER

    Publicized: 2023/12/15  Vol: E107-D No:3  Page(s): 257-267

    Code search is the task of retrieving the most relevant code given a natural language query. Several recent studies have proposed deep learning-based methods that use a multi-encoder model to parse code into multiple fields to represent it. These methods enhance the performance of the model by distinguishing between similar pieces of code and utilizing a relation matrix to bridge the code and query. However, these models require more computational resources and parameters than single-encoder models. Furthermore, utilizing a relation matrix that relies solely on max-pooling disregards the delivery of word alignment information. To alleviate these problems, we propose a combined alignment model for code search. We concatenate the multiple code fields into one sequence to represent code and use one encoding model to encode code features. Moreover, we transform the relation matrix using trainable vectors to avoid information losses. Then, we combine intra-modal and cross-modal attention to assign the salient words while matching the corresponding code and query. Finally, we apply the attention weight to code/query embedding and compute the cosine similarity. To evaluate the performance of our model, we compare our model with six previous models on two popular datasets. The results show that our model achieves 0.614 and 0.687 Top@1 performance, outperforming the best comparison models by 12.2% and 9.3%, respectively.

  • On a Spectral Lower Bound of Treewidth

    Tatsuya GIMA  Tesshu HANAKA  Kohei NORO  Hirotaka ONO  Yota OTACHI  

     
    LETTER

    Publicized: 2023/06/16  Vol: E107-D No:3  Page(s): 328-330

    In this letter, we present a new lower bound for the treewidth of a graph in terms of the second smallest eigenvalue of its Laplacian matrix. Our bound slightly improves the lower bound given by Chandran and Subramanian [Inf. Process. Lett., 87 (2003)].

  • Content-Adaptive Optimization Framework for Universal Deep Image Compression

    Koki TSUBOTA  Kiyoharu AIZAWA  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2023/10/24  Vol: E107-D No:2  Page(s): 201-211

    While deep image compression performs better than traditional codecs like JPEG on natural images, it faces a challenge as a learning-based approach: compression performance drastically decreases for out-of-domain images. To investigate this problem, we introduce a novel task that we call universal deep image compression, which involves compressing images in arbitrary domains, such as natural images, line drawings, and comics. Furthermore, we propose a content-adaptive optimization framework to tackle this task. This framework adapts a pre-trained compression model to each target image during testing to address the domain gap between pre-training and testing. For each input image, we insert adapters into the decoder of the model and optimize the latent representation extracted by the encoder and the adapter parameters in terms of rate-distortion, with the adapter parameters transmitted per image. To evaluate the proposed universal deep compression, we constructed a benchmark dataset containing uncompressed images of four domains: natural images, line drawings, comics, and vector arts. We compare our proposed method with non-adaptive and existing adaptive compression methods, and the results show that our method outperforms them. Our code and dataset are publicly available at https://github.com/kktsubota/universal-dic.

  • Universal Angle Visibility Realized by a Volumetric 3D Display Using a Rotating Mirror-Image Helix Screen Open Access

    Karin WAKATSUKI  Chiemi FUJIKAWA  Makoto OMODANI  

     
    INVITED PAPER

    Publicized: 2023/08/03  Vol: E107-C No:2  Page(s): 23-28

    Herein, we propose a volumetric 3D display in which cross-sectional images are projected onto a rotating helix screen. The method employed by this display can enable image observation from universal directions. A major challenge associated with this method is the presence of invisible regions that occur depending on the observation angle. This study aimed to fabricate a mirror-image helix screen with two helical surfaces coaxially arranged in a plane-symmetrical configuration. The visible region was actually measured to be larger than the visible region of the conventional helix screen. We confirmed that the improved visible region was almost independent of the observation angle and that the visible region was almost equally wide on both the left and right sides of the rotation axis.

  • A CNN-Based Multi-Scale Pooling Strategy for Acoustic Scene Classification

    Rong HUANG  Yue XIE  

     
    LETTER-Speech and Hearing

    Publicized: 2023/10/17  Vol: E107-D No:1  Page(s): 153-156

    Acoustic scene classification (ASC) is a fundamental domain within the realm of artificial intelligence classification tasks. ASC-based tasks commonly employ models based on convolutional neural networks (CNNs) that utilize log-Mel spectrograms as input for gathering acoustic features. In this paper, we designed a CNN-based multi-scale pooling (MSP) strategy for ASC. The log-Mel spectrogram used as the input to the CNN is partitioned into four segments along the frequency axis. Furthermore, we devised four CNN channels to acquire inputs from distinct frequency ranges. The high-level features extracted from outputs in various frequency bands are integrated through frequency pyramid average pooling layers at multiple levels. Subsequently, a softmax classifier is employed to classify different scenes. Our study demonstrates that the designed model leads to a significant enhancement in performance, as evidenced by tests on two acoustic datasets.

  • Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection

    Shinji UCHINOURA  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2023/10/23  Vol: E107-D No:1  Page(s): 115-124

    We investigated the influence of horizontal shifts of the input images on one-stage object detection methods. We found that the object detector class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by the grid-level shifts and are used in training. The effectiveness of the proposed approaches is demonstrated using the COCO validation set after applying the proposed method to the FCOS architecture.

  • Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis

    Kenichi FUJITA  Atsushi ANDO  Yusuke IJIMA  

     
    PAPER-Speech and Hearing

    Publicized: 2023/10/06  Vol: E107-D No:1  Page(s): 93-104

    This paper proposes a speech rhythm-based method for speaker embeddings to model phoneme duration using a few utterances by the target speaker. Speech rhythm is one of the essential factors among speaker characteristics, along with acoustic features such as F0, for reproducing individual utterances in speech synthesis. A novel feature of the proposed method is the rhythm-based embeddings extracted from phonemes and their durations, which are known to be related to speaking rhythm. They are extracted with a speaker identification model similar to the conventional spectral feature-based one. We conducted three experiments, speaker embeddings generation, speech synthesis with generated embeddings, and embedding space analysis, to evaluate the performance. The proposed method demonstrated a moderate speaker identification performance (15.2% EER), even with only phonemes and their duration information. The objective and subjective evaluation results demonstrated that the proposed method can synthesize speech with speech rhythm closer to the target speaker than the conventional method. We also visualized the embeddings to evaluate the relationship between the distance of the embeddings and the perceptual similarity. The visualization of the embedding space and the relation analysis between the closeness indicated that the distribution of embeddings reflects the subjective and objective similarity.

  • Unbiased Pseudo-Labeling for Learning with Noisy Labels

    Ryota HIGASHIMOTO  Soh YOSHIDA  Takashi HORIHATA  Mitsuji MUNEYASU  

     
    LETTER

    Publicized: 2023/09/19  Vol: E107-D No:1  Page(s): 44-48

    Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.

  • Frameworks for Privacy-Preserving Federated Learning

    Le Trieu PHONG  Tran Thi PHUONG  Lihua WANG  Seiichi OZAWA  

     
    INVITED PAPER

    Publicized: 2023/09/25  Vol: E107-D No:1  Page(s): 2-12

    In this paper, we explore privacy-preserving techniques in federated learning, including those that can be used with both neural networks and decision trees. We begin by identifying how information can be leaked in federated learning, after which we present methods to address this issue by introducing two privacy-preserving frameworks that encompass many existing privacy-preserving federated learning (PPFL) systems. Through experiments with publicly available financial, medical, and Internet of Things datasets, we demonstrate the effectiveness of privacy-preserving federated learning and its potential to develop highly accurate, secure, and privacy-preserving machine learning systems in real-world scenarios. The findings highlight the importance of considering privacy in the design and implementation of federated learning systems and suggest that privacy-preserving techniques are essential in enabling the development of effective and practical machine learning systems.

  • Location and History Information Aided Efficient Initial Access Scheme for High-Speed Railway Communications

    Chang SUN  Xiaoyu SUN  Jiamin LI  Pengcheng ZHU  Dongming WANG  Xiaohu YOU  

     
    PAPER-Wireless Communication Technologies

    Publicized: 2023/09/14  Vol: E107-B No:1  Page(s): 214-222

    The application of millimeter wave (mmWave) directional transmission technology in high-speed railway (HSR) scenarios helps to achieve the goal of multiple gigabit data rates with low latency. However, due to the high mobility of trains, the traditional initial access (IA) scheme, with its high time consumption, can hardly guarantee the effectiveness of the beam alignment. In addition, the high path loss at the coverage edge of the millimeter wave remote radio unit (mmW-RRU) also poses great challenges to the stability of IA performance. Fortunately, the train trajectory in HSR scenarios is periodic and regular. Moreover, the cell-free network helps to improve the system coverage performance. Based on these observations, this paper proposes an efficient IA scheme based on location and history information in cell-free networks, where the train can flexibly select a set of mmW-RRUs according to the received signal quality. We specifically analyze the collaborative IA process based on the exhaustive search and based on location and history information, derive expressions for IA success probability and delay, and perform the numerical analysis. The results show that the proposed scheme can significantly reduce the IA delay and effectively improve the stability of IA success probability.

  • MSLT: A Scalable Solution for Blockchain Network Transport Layer Based on Multi-Scale Node Management Open Access

    Longle CHENG  Xiaofeng LI  Haibo TAN  He ZHAO  Bin YU  

     
    PAPER-Network

    Publicized: 2023/09/12  Vol: E107-B No:1  Page(s): 185-196

    Blockchain systems rely on peer-to-peer (P2P) overlay networks to propagate transactions and blocks. The node management of P2P networks affects the overall performance and reliability of the system. The traditional structure is based on random connectivity, which is known to be an inefficient operation. Therefore, we propose MSLT, a multiscale blockchain P2P network node management method to improve transaction performance. This approach involves configuring the network to operate at multiple scales, where blockchain nodes are grouped into different ranges at each scale. To minimize redundancy and manage traffic efficiently, neighboring nodes are selected from each range based on a predetermined set of rules. Additionally, a node updating method is implemented to improve the reliability of the network. Compared with existing transmission models in efficiency, utilization, and maximum transaction throughput, the MSLT node management model improves the data transmission performance.

  • Resource Allocation for Mobile Edge Computing System Considering User Mobility with Deep Reinforcement Learning

    Kairi TOKUDA  Takehiro SATO  Eiji OKI  

     
    PAPER-Network

    Publicized: 2023/10/06  Vol: E107-B No:1  Page(s): 173-184

    Mobile edge computing (MEC) is a key technology for providing services that require low latency by migrating cloud functions to the network edge. The potential low quality of the wireless channel should be noted when mobile users with limited computing resources offload tasks to an MEC server. To improve the transmission reliability, it is necessary to perform resource allocation in an MEC server, taking into account the current channel quality and the resource contention. There are several works that take a deep reinforcement learning (DRL) approach to address such resource allocation. However, these approaches consider a fixed number of users offloading their tasks, and do not assume a situation where the number of users varies due to user mobility. This paper proposes Deep reinforcement learning model for MEC Resource Allocation with Dummy (DMRA-D), an online learning model that addresses the resource allocation in an MEC server under the situation where the number of users varies. By adopting dummy state/action, DMRA-D keeps the state/action representation. Therefore, DMRA-D can continue to learn one model regardless of variation in the number of users during the operation. Numerical results show that DMRA-D improves the success rate of task submission while continuing learning under the situation where the number of users varies.

  • Introduction to Compressed Sensing with Python Open Access

    Masaaki NAGAHARA  

     
    INVITED PAPER-Fundamental Theories for Communications

    Publicized: 2023/08/15  Vol: E107-B No:1  Page(s): 126-138

    Compressed sensing is a rapidly growing research field in signal and image processing, machine learning, statistics, and systems control. In this survey paper, we provide a review of the theoretical foundations of compressed sensing and present state-of-the-art algorithms for solving the corresponding optimization problems. Additionally, we discuss several practical applications of compressed sensing, such as group testing, sparse system identification, and sparse feedback gain design, and demonstrate their effectiveness through Python programs. This survey paper aims to contribute to the advancement of compressed sensing research and its practical applications in various scientific disciplines.
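    To give a flavor of the kind of Python program such a survey might contain, here is a minimal sketch of the iterative shrinkage-thresholding algorithm (ISTA) for the l1-regularized least-squares problem min_x 0.5*||Ax - y||^2 + lam*||x||_1. This is a generic textbook method written in pure Python, not code from the paper; the function names, the tiny test matrix, and the conservative step size are all assumptions made for illustration.

```python
def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(A, y, x, lam):
    """Evaluate 0.5*||Ax - y||^2 + lam*||x||_1 for list-based A, y, x."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(xj) for xj in x)


def ista(A, y, lam, iterations=500):
    """Run ISTA from x = 0 and return the final iterate."""
    n = len(A[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe (if conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iterations):
        residual = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
        grad = [sum(A[i][j] * residual[i] for i in range(len(A))) for j in range(n)]
        # Gradient step on the smooth term, then shrinkage for the l1 term.
        x = [soft_threshold(xj - step * g, step * lam) for xj, g in zip(x, grad)]
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # 2 measurements, 3 unknowns
    y = [1.0, 1.0]                          # consistent with the sparse x = [0, 0, 1]
    x = ista(A, y, lam=0.1)
    print([round(v, 3) for v in x])
```

    The Frobenius-norm step size avoids an eigenvalue computation at the cost of slower convergence; a practical implementation would more likely use NumPy and a tighter step estimate.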

  • Device Type Classification Based on Two-Stage Traffic Behavior Analysis Open Access

    Chikako TAKASAKI  Tomohiro KORIKAWA  Kyota HATTORI  Hidenari OHWADA  

     
    PAPER

    Publicized: 2023/10/17  Vol: E107-B No:1  Page(s): 117-125

    In beyond-5G and 6G networks, the number of connected devices and their types will greatly increase, including not only user devices such as smartphones but also the Internet of Things (IoT). Moreover, non-terrestrial networks (NTN) introduce dynamic changes in the types of connected devices, as base stations or access points are moving objects. Therefore, continuous network capacity design is required to fulfill the network requirements of each device. However, continuous optimization of network capacity design for each device within a short time span becomes difficult because of the heavy computational load. We introduce device types as groups of devices whose traffic characteristics resemble one another and optimize network capacity per device type for efficient network capacity design. This paper proposes a method to classify device types by analyzing only encrypted traffic behavior without using payload and packets of specific protocols. In the first stage, general device types, such as IoT and non-IoT, are classified by analyzing packet header statistics using machine learning. Then, in the second stage, connected devices classified as IoT in the first stage are classified into IoT device types by analyzing a time series of traffic behavior using deep learning. We demonstrate that the proposed method classifies device types by analyzing traffic datasets and outperforms the existing IoT-only device classification methods in terms of the number of types and the accuracy. In addition, the proposed model performs comparably to a state-of-the-art traffic classification model, the ResNet 1D model. The proposed method is suitable for grasping device types in terms of traffic characteristics, toward efficient network capacity design in networks where massive numbers of devices for various services are connected and the connected devices continuously change.

  • Hardware-Trojan Detection at Gate-Level Netlists Using a Gradient Boosting Decision Tree Model and Its Extension Using Trojan Probability Propagation

    Ryotaro NEGISHI  Tatsuki KURIHARA  Nozomu TOGAWA  

     
    PAPER

    Publicized: 2023/08/16  Vol: E107-A No:1  Page(s): 63-74

    Technological devices have become deeply embedded in people's lives, and their demand is growing every year. It has been indicated that outsourcing the design and manufacturing of integrated circuits, which are essential for technological devices, may lead to the insertion of malicious circuitry, called hardware Trojans (HTs). This paper proposes an HT detection method at gate-level netlists based on XGBoost, one of the best gradient boosting decision tree models. We first propose the optimal set of HT features among many feature candidates at a netlist level through thorough evaluations. Then, we construct an XGBoost-based HT detection method with its optimized hyperparameters. Evaluation experiments were conducted on the netlists from Trust-HUB benchmarks and showed the average F-measure of 0.842 using the proposed method. Also, we newly propose a Trojan probability propagation method that effectively corrects the HT detection results and apply it to the results obtained by XGBoost-based HT detection. Evaluation experiments showed that the average F-measure is improved to 0.861. This value is 0.194 points higher than that of the existing best method proposed so far.