The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] k(12654hit)

141-160hit(12654hit)

  • Multi-Task Learning of Japanese How-to Tip Machine Reading Comprehension by a Generative Model

    Xiaotian WANG  Tingxuan LI  Takuya TAMURA  Shunsuke NISHIDA  Takehito UTSURO  

     
    PAPER-Natural Language Processing

      Pubricized:
    2023/10/23
      Vol:
    E107-D No:1
      Page(s):
    125-134

    In the research of machine reading comprehension of Japanese how-to tip QA tasks, conventional extractive machine reading comprehension methods have difficulty in dealing with cases in which the answer string spans multiple locations in the context. The method of fine-tuning of the BERT model for machine reading comprehension tasks is not suitable for such cases. In this paper, we trained a generative machine reading comprehension model of Japanese how-to tip by constructing a generative dataset based on the website “wikihow” as a source of information. We then proposed two methods for multi-task learning to fine-tune the generative model. The first method is the multi-task learning with a generative and extractive hybrid training dataset, where both generative and extractive datasets are simultaneously trained on a single model. The second method is the multi-task learning with the inter-sentence semantic similarity and answer generation, where, drawing upon the answer generation task, the model additionally learns the distance between the sentences of the question/context and the answer in the training examples. The evaluation results showed that both of the multi-task learning methods significantly outperformed the single-task learning method in generative question-and-answer examples. Between the two methods for multi-task learning, that with the inter-sentence semantic similarity and answer generation performed the best in terms of the manual evaluation result. The data and the code are available at https://github.com/EternalEdenn/multitask_ext-gen_sts-gen.

  • Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection

    Shinji UCHINOURA  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2023/10/23
      Vol:
    E107-D No:1
      Page(s):
    115-124

    We investigated the influence of horizontal shifts of the input images for one stage object detection method. We found that the object detector class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by the grid-level shifts and are used in training. The effectiveness of the proposed approaches is demonstrated using the COCO validation set after applying the proposed method to the FCOS architecture.

  • Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis

    Kenichi FUJITA  Atsushi ANDO  Yusuke IJIMA  

     
    PAPER-Speech and Hearing

      Pubricized:
    2023/10/06
      Vol:
    E107-D No:1
      Page(s):
    93-104

    This paper proposes a speech rhythm-based method for speaker embeddings to model phoneme duration using a few utterances by the target speaker. Speech rhythm is one of the essential factors among speaker characteristics, along with acoustic features such as F0, for reproducing individual utterances in speech synthesis. A novel feature of the proposed method is the rhythm-based embeddings extracted from phonemes and their durations, which are known to be related to speaking rhythm. They are extracted with a speaker identification model similar to the conventional spectral feature-based one. We conducted three experiments, speaker embeddings generation, speech synthesis with generated embeddings, and embedding space analysis, to evaluate the performance. The proposed method demonstrated a moderate speaker identification performance (15.2% EER), even with only phonemes and their duration information. The objective and subjective evaluation results demonstrated that the proposed method can synthesize speech with speech rhythm closer to the target speaker than the conventional method. We also visualized the embeddings to evaluate the relationship between the distance of the embeddings and the perceptual similarity. The visualization of the embedding space and the relation analysis between the closeness indicated that the distribution of embeddings reflects the subjective and objective similarity.

  • Research on Lightweight Acoustic Scene Perception Method Based on Drunkard Methodology

    Wenkai LIU  Lin ZHANG  Menglong WU  Xichang CAI  Hongxia DONG  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/10/23
      Vol:
    E107-D No:1
      Page(s):
    83-92

    The goal of Acoustic Scene Classification (ASC) is to simulate human analysis of the surrounding environment and make accurate decisions promptly. Extracting useful information from audio signals in real-world scenarios is challenging and can lead to suboptimal performance in acoustic scene classification, especially in environments with relatively homogeneous backgrounds. To address this problem, we model the sobering-up process of “drunkards” in real-life and the guiding behavior of normal people, and construct a high-precision lightweight model implementation methodology called the “drunkard methodology”. The core idea includes three parts: (1) designing a special feature transformation module based on the different mechanisms of information perception between drunkards and ordinary people, to simulate the process of gradually sobering up and the changes in feature perception ability; (2) studying a lightweight “drunken” model that matches the normal model's perception processing process. The model uses a multi-scale class residual block structure and can obtain finer feature representations by fusing information extracted at different scales; (3) introducing a guiding and fusion module of the conventional model to the “drunken” model to speed up the sobering-up process and achieve iterative optimization and accuracy improvement. Evaluation results on the official dataset of DCASE2022 Task1 demonstrate that our baseline system achieves 40.4% accuracy and 2.284 loss under the condition of 442.67K parameters and 19.40M MAC (multiply-accumulate operations). After adopting the “drunkard” mechanism, the accuracy is improved to 45.2%, and the loss is reduced by 0.634 under the condition of 551.89K parameters and 23.6M MAC.

  • A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation

    Gang LIU  Xin CHEN  Zhixiang GAO  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/09/28
      Vol:
    E107-D No:1
      Page(s):
    72-82

    Photo animation is to transform photos of real-world scenes into anime style images, which is a challenging task in AIGC (AI Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series. Therefore, DTGAN is also called AnimeGANv3. The generator of DTGAN has two output tails, a support tail for outputting coarse-grained anime style images and a main tail for refining coarse-grained anime style images. In DTGAN, we propose a novel learnable normalization technique, termed as linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. In order to improve the visual quality of the generated anime style images, two novel loss functions suitable for photo animation are proposed: 1) the region smoothing loss function, which is used to weaken the texture details of the generated images to achieve anime effects with abstract details; 2) the fine-grained revision loss function, which is used to eliminate artifacts and noise in the generated anime style image while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight generator framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can be easily end-to-end trained with unpaired training data. Extensive experiments have been conducted to qualitatively and quantitatively demonstrate that our method can produce high-quality anime style images from real-world photos and perform better than the state-of-the-art models.

  • Node-to-Set Disjoint Paths Problem in Cross-Cubes

    Rikuya SASAKI  Hiroyuki ICHIDA  Htoo Htoo Sandi KYAW  Keiichi KANEKO  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2023/10/06
      Vol:
    E107-D No:1
      Page(s):
    53-59

    The increasing demand for high-performance computing in recent years has led to active research on massively parallel systems. The interconnection network in a massively parallel system interconnects hundreds of thousands of processing elements so that they can process large tasks while communicating among others. By regarding the processing elements as nodes and the links between processing elements as edges, respectively, we can discuss various problems of interconnection networks in the framework of the graph theory. Many topologies have been proposed for interconnection networks of massively parallel systems. The hypercube is a very popular topology and it has many variants. The cross-cube is such a topology, which can be obtained by adding one extra edge to each node of the hypercube. The cross-cube reduces the diameter of the hypercube, and allows cycles of odd lengths. Therefore, we focus on the cross-cube and propose an algorithm that constructs disjoint paths from a node to a set of nodes. We give a proof of correctness of the algorithm. Also, we show that the time complexity and the maximum path length of the algorithm are O(n3 log n) and 2n - 3, respectively. Moreover, we estimate that the average execution time of the algorithm is O(n2) based on a computer experiment.

  • CQTXNet: A Modified Xception Network with Attention Modules for Cover Song Identification

    Jinsoo SEO  Junghyun KIM  Hyemi KIM  

     
    LETTER

      Pubricized:
    2023/10/02
      Vol:
    E107-D No:1
      Page(s):
    49-52

    Song-level feature summarization is fundamental for the browsing, retrieval, and indexing of digital music archives. This study proposes a deep neural network model, CQTXNet, for extracting song-level feature summary for cover song identification. CQTXNet incorporates depth-wise separable convolution, residual network connections, and attention models to extend previous approaches. An experimental evaluation of the proposed CQTXNet was performed on two publicly available cover song datasets by varying the number of network layers and the type of attention modules.

  • Frameworks for Privacy-Preserving Federated Learning

    Le Trieu PHONG  Tran Thi PHUONG  Lihua WANG  Seiichi OZAWA  

     
    INVITED PAPER

      Pubricized:
    2023/09/25
      Vol:
    E107-D No:1
      Page(s):
    2-12

    In this paper, we explore privacy-preserving techniques in federated learning, including those can be used with both neural networks and decision trees. We begin by identifying how information can be leaked in federated learning, after which we present methods to address this issue by introducing two privacy-preserving frameworks that encompass many existing privacy-preserving federated learning (PPFL) systems. Through experiments with publicly available financial, medical, and Internet of Things datasets, we demonstrate the effectiveness of privacy-preserving federated learning and its potential to develop highly accurate, secure, and privacy-preserving machine learning systems in real-world scenarios. The findings highlight the importance of considering privacy in the design and implementation of federated learning systems and suggest that privacy-preserving techniques are essential in enabling the development of effective and practical machine learning systems.

  • A Novel Trench MOS Barrier Schottky Contact Super Barrier Rectifier

    Peijian ZHANG  Kunfeng ZHU  Wensuo CHEN  

     
    PAPER-Semiconductor Materials and Devices

      Pubricized:
    2023/07/04
      Vol:
    E107-C No:1
      Page(s):
    12-17

    In this paper, a novel trench MOS barrier Schottky contact super barrier rectifier (TMB-SSBR) is proposed by combining the advantages of vertical SSBR and conventional TMBS. The operation mechanism and simulation verification are presented. TMB-SSBR consists of MOS trenches with a vertical SSBR grid which replaces the Schottky diode in the mesa of a TMBS. Due to the presence of top p-n junction in the proposed TMB-SSBR, the image force barrier lowering effect is eliminated, the pinching off electric field effect by MOS trenches is weakened, so that the mesa surface electric field is much larger than that in conventional TMBS. Therefore, the mesa width is enlarged and the n-drift concentration is slightly increased, which results in a low specific on-resistance and a good tradeoff between reverse leakage currents and forward voltages. Compared to a conventional TMBS, simulation results show that, with the same breakdown voltage of 124V and the same reverse leakage current at room temperature, TMB-SSBR increases the figure of merit (FOM, equates to VB2/Ron, sp) by 25.5%, and decreases the reverse leakage by 33.3% at the temperature of 423K. Just like the development from SBD to TMBS, from TMBS to TMB-SSBR also brings obvious improvement of performance.

  • Location and History Information Aided Efficient Initial Access Scheme for High-Speed Railway Communications

    Chang SUN  Xiaoyu SUN  Jiamin LI  Pengcheng ZHU  Dongming WANG  Xiaohu YOU  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2023/09/14
      Vol:
    E107-B No:1
      Page(s):
    214-222

    The application of millimeter wave (mmWave) directional transmission technology in high-speed railway (HSR) scenarios helps to achieve the goal of multiple gigabit data rates with low latency. However, due to the high mobility of trains, the traditional initial access (IA) scheme with high time consumption is difficult to guarantee the effectiveness of the beam alignment. In addition, the high path loss at the coverage edge of the millimeter wave remote radio unit (mmW-RRU) will also bring great challenges to the stability of IA performance. Fortunately, the train trajectory in HSR scenarios is periodic and regular. Moreover, the cell-free network helps to improve the system coverage performance. Based on these observations, this paper proposes an efficient IA scheme based on location and history information in cell-free networks, where the train can flexibly select a set of mmW-RRUs according to the received signal quality. We specifically analyze the collaborative IA process based on the exhaustive search and based on location and history information, derive expressions for IA success probability and delay, and perform the numerical analysis. The results show that the proposed scheme can significantly reduce the IA delay and effectively improve the stability of IA success probability.

  • MSLT: A Scalable Solution for Blockchain Network Transport Layer Based on Multi-Scale Node Management Open Access

    Longle CHENG  Xiaofeng LI  Haibo TAN  He ZHAO  Bin YU  

     
    PAPER-Network

      Pubricized:
    2023/09/12
      Vol:
    E107-B No:1
      Page(s):
    185-196

    Blockchain systems rely on peer-to-peer (P2P) overlay networks to propagate transactions and blocks. The node management of P2P networks affects the overall performance and reliability of the system. The traditional structure is based on random connectivity, which is known to be an inefficient operation. Therefore, we propose MSLT, a multiscale blockchain P2P network node management method to improve transaction performance. This approach involves configuring the network to operate at multiple scales, where blockchain nodes are grouped into different ranges at each scale. To minimize redundancy and manage traffic efficiently, neighboring nodes are selected from each range based on a predetermined set of rules. Additionally, a node updating method is implemented to improve the reliability of the network. Compared with existing transmission models in efficiency, utilization, and maximum transaction throughput, the MSLT node management model improves the data transmission performance.

  • Resource Allocation for Mobile Edge Computing System Considering User Mobility with Deep Reinforcement Learning

    Kairi TOKUDA  Takehiro SATO  Eiji OKI  

     
    PAPER-Network

      Pubricized:
    2023/10/06
      Vol:
    E107-B No:1
      Page(s):
    173-184

    Mobile edge computing (MEC) is a key technology for providing services that require low latency by migrating cloud functions to the network edge. The potential low quality of the wireless channel should be noted when mobile users with limited computing resources offload tasks to an MEC server. To improve the transmission reliability, it is necessary to perform resource allocation in an MEC server, taking into account the current channel quality and the resource contention. There are several works that take a deep reinforcement learning (DRL) approach to address such resource allocation. However, these approaches consider a fixed number of users offloading their tasks, and do not assume a situation where the number of users varies due to user mobility. This paper proposes Deep reinforcement learning model for MEC Resource Allocation with Dummy (DMRA-D), an online learning model that addresses the resource allocation in an MEC server under the situation where the number of users varies. By adopting dummy state/action, DMRA-D keeps the state/action representation. Therefore, DMRA-D can continue to learn one model regardless of variation in the number of users during the operation. Numerical results show that DMRA-D improves the success rate of task submission while continuing learning under the situation where the number of users varies.

  • A Survey of Information-Centric Networking: The Quest for Innovation Open Access

    Hitoshi ASAEDA  Kazuhisa MATSUZONO  Yusaku HAYAMIZU  Htet Htet HLAING  Atsushi OOKA  

     
    INVITED PAPER-Network

      Pubricized:
    2023/08/22
      Vol:
    E107-B No:1
      Page(s):
    139-153

    Information-Centric Networking (ICN) is an innovative technology that provides low-loss, low-latency, high-throughput, and high-reliability communications for diversified and advanced services and applications. In this article, we present a technical survey of ICN functionalities such as in-network caching, routing, transport, and security mechanisms, as well as recent research findings. We focus on CCNx, which is a prominent ICN protocol whose message types are defined by the Internet Research Task Force. To facilitate the development of functional code and encourage application deployment, we introduce an open-source software platform called Cefore that facilitates CCNx-based communications. Cefore consists of networking components such as packet forwarding and in-network caching daemons, and it provides APIs and a Python wrapper program that enables users to easily develop CCNx applications for on Cefore. We introduce a Mininet-based Cefore emulator and lightweight Docker containers for running CCNx experiments on Cefore. In addition to exploring ICN features and implementations, we also consider promising research directions for further innovation.

  • Introduction to Compressed Sensing with Python Open Access

    Masaaki NAGAHARA  

     
    INVITED PAPER-Fundamental Theories for Communications

      Pubricized:
    2023/08/15
      Vol:
    E107-B No:1
      Page(s):
    126-138

    Compressed sensing is a rapidly growing research field in signal and image processing, machine learning, statistics, and systems control. In this survey paper, we provide a review of the theoretical foundations of compressed sensing and present state-of-the-art algorithms for solving the corresponding optimization problems. Additionally, we discuss several practical applications of compressed sensing, such as group testing, sparse system identification, and sparse feedback gain design, and demonstrate their effectiveness through Python programs. This survey paper aims to contribute to the advancement of compressed sensing research and its practical applications in various scientific disciplines.

  • Resource-Efficient and Availability-Aware Service Chaining and VNF Placement with VNF Diversity and Redundancy

    Takanori HARA  Masahiro SASABE  Kento SUGIHARA  Shoji KASAHARA  

     
    PAPER

      Pubricized:
    2023/10/10
      Vol:
    E107-B No:1
      Page(s):
    105-116

    To establish a network service in network functions virtualization (NFV) networks, the orchestrator addresses the challenge of service chaining and virtual network function placement (SC-VNFP) by mapping virtual network functions (VNFs) and virtual links onto physical nodes and links. Unlike traditional networks, network operators in NFV networks must contend with both hardware and software failures in order to ensure resilient network services, as NFV networks consist of physical nodes and software-based VNFs. To guarantee network service quality in NFV networks, the existing work has proposed an approach for the SC-VNFP problem that considers VNF diversity and redundancy. VNF diversity splits a single VNF into multiple lightweight replica instances that possess the same functionality as the original VNF, which are then executed in a distributed manner. VNF redundancy, on the other hand, deploys backup instances with standby mode on physical nodes to prepare for potential VNF failures. However, the existing approach does not adequately consider the tradeoff between resource efficiency and service availability in the context of VNF diversity and redundancy. In this paper, we formulate the SC-VNFP problem with VNF diversity and redundancy as a two-step integer linear program (ILP) that adjusts the balance between service availability and resource efficiency. Through numerical experiments, we demonstrate the fundamental characteristics of the proposed ILP, including the tradeoff between resource efficiency and service availability.

  • Network Traffic Anomaly Detection: A Revisiting to Gaussian Process and Sparse Representation

    Yitu WANG  Takayuki NAKACHI  

     
    PAPER-Communication Theory and Signals

      Pubricized:
    2023/06/27
      Vol:
    E107-A No:1
      Page(s):
    125-133

    Seen from the Internet Service Provider (ISP) side, network traffic monitoring is an indispensable part during network service provisioning, which facilitates maintaining the security and reliability of the communication networks. Among the numerous traffic conditions, we should pay extra attention to traffic anomaly, which significantly affects the network performance. With the advancement of Machine Learning (ML), data-driven traffic anomaly detection algorithms have established high reputation due to the high accuracy and generality. However, they are faced with challenges on inefficient traffic feature extraction and high computational complexity, especially when taking the evolving property of traffic process into consideration. In this paper, we proposed an online learning framework for traffic anomaly detection by embracing Gaussian Process (GP) and Sparse Representation (SR) in two steps: 1). To extract traffic features from past records, and better understand these features, we adopt GP with a special kernel, i.e., mixture of Gaussian in the spectral domain, which makes it possible to more accurately model the network traffic for improving the performance of traffic anomaly detection. 2). To combat noise and modeling error, observing the inherent self-similarity and periodicity properties of network traffic, we manually design a feature vector, based on which SR is adopted to perform robust binary classification. Finally, we demonstrate the superiority of the proposed framework in terms of detection accuracy through simulation.

  • Low-Complexity Digital Channelizer Design for Software Defined Radio

    Jinguang HAO  Gang WANG  Honggang WANG  Lili WANG  Xuefeng LIU  

     
    PAPER-Communication Theory and Signals

      Pubricized:
    2023/07/19
      Vol:
    E107-A No:1
      Page(s):
    134-140

    In software defined radio systems, a channelizer plays an important role in extracting the desired signals from a wideband signal. Compared to the conventional methods, the proposed scheme provides a solution to design a digital channelizer extracting the multiple subband signals at different center frequencies with low complexity. To do this, this paper formulates the problem as an optimization problem, which minimizes the required multiplications number subject to the constraints of the ripple in the passbands and the stopbands for single channel and combined multiple channels. In addition, a solution to solve the optimization problem is also presented and the corresponding structure is demonstrated. Simulation results show that the proposed scheme requires smaller number of the multiplications than other conventional methods. Moreover, unlike other methods, this structure can process signals with different bandwidths at different center frequencies simultaneously only by changing the status of the corresponding multiplexers without hardware reimplementation.

  • CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification

    Aorui GOU  Jingjing LIU  Xiaoxiang CHEN  Xiaoyang ZENG  Yibo FAN  

     
    PAPER-Image

      Pubricized:
    2023/07/06
      Vol:
    E107-A No:1
      Page(s):
    141-156

    Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable performance in detection and classification tasks. Nevertheless, their feature extraction cannot consider both local and global information, so the detection and classification performance can be further improved. In addition, more and more deep learning networks are designed as more and more complex, and the amount of computation and storage space required is also significantly increased. This paper proposes a combination of CNN and transformer, and designs a local feature enhancement module and global context modeling module to enhance the cascade network. While the local feature enhancement module increases the range of feature extraction, the global context modeling is used to capture the feature maps' global information. To decrease the model complexity, a shared sublayer is designed to realize the sharing of weight parameters between the adjacent convolutional layers or cross convolutional layers, thereby reducing the number of convolutional weight parameters. Moreover, to effectively improve the detection performance of neural networks without increasing network parameters, the optimal transport assignment approach is proposed to resolve the problem of label assignment. The classification loss and regression loss are the summations of the cost between the demander and supplier. The experiment results demonstrate that the proposed Combination of CNN and Transformer with Shared Sublayer (CCTSS) performs better than the state-of-the-art methods in various datasets and applications.

  • Recent Progress in Optical Network Design and Control towards Human-Centered Smart Society Open Access

    Takashi MIYAMURA  Akira MISAWA  

     
    INVITED PAPER

      Pubricized:
    2023/09/19
      Vol:
    E107-B No:1
      Page(s):
    2-15

    In this paper, we investigate the evolution of an optical network architecture and discuss the future direction of research on optical network design and control. We review existing research on optical network design and control and present some open challenges. One of the important open challenges lies in multilayer resource optimization including IT and optical network resources. We propose an adaptive joint optimization method of IT resources and optical spectrum under time-varying traffic demand in optical networks while avoiding an increase in operation cost. We formulate the problem as mixed integer linear programming and then quantitatively evaluate the trade-off relationship between the optimality of reconfiguration and operation cost. We demonstrate that we can achieve sufficient network performance through the adaptive joint optimization while suppressing an increase in operation cost.

  • Crosstalk-Aware Resource Allocation Based on Optical Path Adjacency and Crosstalk Budget for Space Division Multiplexing Elastic Optical Networks

    Kosuke KUBOTA  Yosuke TANIGAWA  Yusuke HIROTA  Hideki TODE  

     
    PAPER

      Pubricized:
    2023/09/12
      Vol:
    E107-B No:1
      Page(s):
    27-38

    To cope with the drastic increase in traffic, space division multiplexing elastic optical networks (SDM-EONs) have been investigated. In multicore fiber environments that realize SDM-EONs, crosstalk (XT) occurs between optical paths transmitted in the same frequency slots of adjacent cores, and the quality of the optical paths is degraded by the mutual influence of XT. To solve this problem, we propose a core and spectrum assignment method that introduces the concept of prohibited frequency slots to protect the degraded optical paths. First-fit-based spectrum resource allocation algorithms, including our previous study, have the problem that only some frequency slots are used at low loads, and XT occurs even though sufficient frequency slots are available. In this study, we propose a core and spectrum assignment method that introduces the concepts of “adjacency criterion” and “XT budget” to suppress XT at low and middle loads without worsening the path blocking rate at high loads. We demonstrate the effectiveness of the proposed method in terms of the path blocking rate using computer simulations.

141-160hit(12654hit)