The search functionality is under construction.

Keyword Search Result

[Keyword] net(6055hit)

61-80hit(6055hit)

  • Implementing Optical Analog Computing and Electrooptic Hopfield Network by Silicon Photonic Circuits Open Access

    Guangwei CONG  Noritsugu YAMAMOTO  Takashi INOUE  Yuriko MAEGAMI  Morifumi OHNO  Shota KITA  Rai KOU  Shu NAMIKI  Koji YAMADA  

     
    INVITED PAPER

      Publicized:
    2024/01/05
      Vol:
    E107-A No:5
      Page(s):
    700-708

    Wide deployment of artificial intelligence (AI) is inducing exponentially growing energy consumption. Traditional digital platforms are becoming difficult to fulfill such ever-growing demands on energy efficiency as well as computing latency, which necessitates the development of high efficiency analog hardware platforms for AI. Recently, optical and electrooptic hybrid computing is reactivated as a promising analog hardware alternative because it can accelerate the information processing in an energy-efficient way. Integrated photonic circuits offer such an analog hardware solution for implementing photonic AI and machine learning. For this purpose, we proposed a photonic analog of support vector machine and experimentally demonstrated low-latency and low-energy classification computing, which evidences the latency and energy advantages of optical analog computing over traditional digital computing. We also proposed an electrooptic Hopfield network for classifying and recognizing time-series data. This paper will review our work on implementing classification computing and Hopfield network by leveraging silicon photonic circuits.
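
    As a rough illustration of the associative recall that a Hopfield network performs, the following minimal sketch implements a classical software Hopfield network with Hebbian weights. It is an assumption-laden toy, not the electrooptic silicon photonic implementation reviewed in the paper; the function names `train_hebbian` and `recall` and the two stored patterns are hypothetical.

```python
# Classical Hopfield network sketch (illustrative only; not the paper's
# electrooptic circuit). Patterns are bipolar vectors of +1 / -1.


def train_hebbian(patterns):
    """Build a symmetric weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_steps=10):
    """Update all units synchronously until the state stops changing."""
    state = list(state)
    for _ in range(max_steps):
        fields = [sum(w * s for w, s in zip(row, state)) for row in weights]
        new_state = [1 if h >= 0 else -1 for h in fields]
        if new_state == state:
            break
        state = new_state
    return state


# Two orthogonal example patterns (hypothetical data).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
W = train_hebbian(stored)

# The first stored pattern with its first element flipped.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(W, noisy)
```

    In this toy setting the corrupted input settles back onto the stored pattern it most resembles, which is the behavior a Hopfield-type classifier relies on.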

  • Enhancing Speech Quality in Air Traffic Control Communication Using DIUnet_V-Based Speech Enhancement Techniques Open Access

    Haijun LIANG  Yukun LI  Jianguo KONG  Qicong HAN  Chengyu YU  

     
    PAPER-Speech and Hearing

      Publicized:
    2023/12/11
      Vol:
    E107-D No:4
      Page(s):
    551-558

    Air Traffic Control (ATC) communication suffers from issues such as high electromagnetic interference, fast speech rate, and low intelligibility, which pose challenges for downstream tasks like Automatic Speech Recognition (ASR). This article aims to research how to enhance the audio quality and intelligibility of civil aviation speech through speech enhancement methods, thereby improving the accuracy of speech recognition and providing support for the digitalization of civil aviation. We propose a speech enhancement model called DIUnet_V (DenseNet & Inception & U-Net & Volume) that combines both time-frequency and time-domain methods to effectively handle the specific characteristics of civil aviation speech, such as predominant electromagnetic interference and fast speech rate. For model evaluation, we assess the denoising and enhancement effects using three metrics: Signal-to-Noise Ratio (SNR), Mean Opinion Score (MOS), and speech recognition error rate. On a simulated ATC training recording dataset, DIUnet_Volume10 achieved an SNR value of 7.3861, showing a 4.5663 improvement compared to the original U-net model. To address the challenge of the absence of clean speech in the ATC working environment, which makes it difficult to accurately calculate SNR, we propose evaluating the denoising effects indirectly based on the recognition performance of an ATC speech recognition system. On a real ATC speech dataset, the average word error rate decreased by 1.79% absolute and the average sentence error rate decreased by 3% absolute for DIUnet_V processed speech compared to the unprocessed speech in the built speech recognition system.
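
    Two of the metrics named above can be sketched in a few lines. The code below assumes the SNR is expressed in decibels against a clean reference signal and that the word error rate is the word-level edit distance divided by the reference length; the abstract does not spell out either definition, so both are assumptions, and the sample signals and transcripts are invented for illustration.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB, treating (clean - estimate) as the residual noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical example data.
snr = snr_db([1.0, -1.0, 1.0, -1.0], [0.9, -0.9, 0.9, -0.9])
wer = word_error_rate(
    "climb and maintain flight level three five zero",
    "climb maintain flight level three nine zero",
)
```

    The SNR function needs the clean signal, which is exactly what is missing in a live ATC environment; the recognition-based error rate only needs reference transcripts, which is why it can serve as an indirect measure.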

  • Mining User Activity Patterns from Time-Series Data Obtained from UWB Sensors in Indoor Environments Open Access

    Muhammad FAWAD RAHIM  Tessai HAYAMA  

     
    PAPER

      Publicized:
    2023/12/19
      Vol:
    E107-D No:4
      Page(s):
    459-467

    In recent years, location-based technologies for ubiquitous environments have aimed to realize services tailored to each purpose based on information about an individual's current location. To establish such advanced location-based services, an estimation technology that can accurately recognize and predict the movements of people and objects is necessary. Although the global positioning system (GPS) is already the standard for outdoor positioning and many services have been realized with it, several techniques using conventional wireless sensors such as Wi-Fi, RFID, and Bluetooth have been considered for indoor positioning. However, conventional wireless indoor positioning is prone to the effects of noise, and the large range of estimated indoor locations makes it difficult to identify human activities precisely. We propose a method to mine user activity patterns from time-series data of a user's locations in an indoor environment using ultra-wideband (UWB) sensors. A UWB sensor is useful for indoor positioning due to its high noise immunity and measurement accuracy; however, to our knowledge, estimation and prediction of human indoor activities using UWB sensors have not yet been addressed. The proposed method consists of three steps: 1) obtaining time-series data of the user's location using a UWB sensor attached to the user, and then estimating the areas where the user has stayed; 2) associating each area of the user's stay with a nearby landmark of activity and assigning indoor activities; and 3) mining the user's activity patterns based on the user's indoor activities and their transitions. We conducted experiments to evaluate the proposed method by investigating the accuracy of estimating the user's area of stay using a UWB sensor and observing the results of activity pattern mining applied to actual laboratory members over 30 days. The results showed that the proposed method is superior to a comparison method, a time-based clustering algorithm, in estimating the stay areas precisely, and that it is possible to reveal the user's activity patterns appropriately in the actual environment.
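
    Step 1 above, estimating stay areas from a location time series, can be pictured with a generic stay-point detector. The sketch below is a common textbook-style heuristic and is not the paper's method or its comparison method; the function name `detect_stays`, the thresholds, and the sample track are all hypothetical.

```python
import math


def detect_stays(samples, max_radius_m=1.0, min_duration_s=60.0):
    """Find stays in a list of (time_s, x_m, y_m) samples sorted by time.

    A stay is a run of consecutive samples that all lie within
    max_radius_m of the first sample of the run and that spans at
    least min_duration_s. Returns (centroid_x, centroid_y, start, end).
    """
    stays = []
    i = 0
    n = len(samples)
    while i < n:
        t0, x0, y0 = samples[i]
        j = i + 1
        while j < n and math.hypot(samples[j][1] - x0,
                                   samples[j][2] - y0) <= max_radius_m:
            j += 1
        window = samples[i:j]  # samples close to the anchor sample
        t_end = window[-1][0]
        if t_end - t0 >= min_duration_s:
            cx = sum(p[1] for p in window) / len(window)
            cy = sum(p[2] for p in window) / len(window)
            stays.append((cx, cy, t0, t_end))
            i = j      # continue after the stay
        else:
            i += 1     # anchor was a transit sample; move on
    return stays


# Hypothetical track: a desk near (0, 0), a walk, then a spot near (10, 0).
track = [
    (0, 0.0, 0.0), (30, 0.1, 0.0), (60, 0.0, 0.1), (90, 0.1, 0.1),
    (120, 0.0, 0.0), (150, 5.0, 0.0), (180, 10.0, 0.0),
    (210, 10.0, 0.2), (240, 10.1, 0.0), (270, 10.0, 0.0),
]
stays = detect_stays(track)
```

    Each detected stay could then be matched to the nearest activity landmark (step 2), and the resulting activity sequence mined for frequent transitions (step 3).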

  • Pattern-Based Meta Graph Neural Networks for Argument Classifications Open Access

    Shiyao DING  Takayuki ITO  

     
    PAPER

      Publicized:
    2023/12/11
      Vol:
    E107-D No:4
      Page(s):
    451-458

    Despite recent advancements in utilizing meta-learning for addressing the generalization challenges of graph neural networks (GNN), their performance in argumentation mining tasks, such as argument classifications, remains relatively limited. This is primarily due to the under-utilization of potential pattern knowledge intrinsic to argumentation structures. To address this issue, our study proposes a two-stage, pattern-based meta-GNN method in contrast to conventional pattern-free meta-GNN approaches. Initially, our method focuses on learning a high-level pattern representation to effectively capture the pattern knowledge within an argumentation structure and then predicts edge types. It then utilizes a meta-learning framework in the second stage, designed to train a meta-learner based on the predicted edge types. This feature allows for rapid generalization to novel argumentation graphs. Through experiments on real English discussion datasets spanning diverse topics, our results demonstrate that our proposed method substantially outperforms conventional pattern-free GNN approaches, signifying a significant stride forward in this domain.

  • Overfitting Problem of ANN- and VSTF-Based Nonlinear Equalizers Trained on Repeated Random Bit Sequences Open Access

    Kai IKUTA  Jinya NAKAMURA  Moriya NAKAMURA  

     
    PAPER-Fiber-Optic Transmission for Communications

      Vol:
    E107-B No:4
      Page(s):
    349-356

    In this paper, we investigated the overfitting characteristics of nonlinear equalizers based on an artificial neural network (ANN) and the Volterra series transfer function (VSTF), which were designed to compensate for optical nonlinear waveform distortion in optical fiber communication systems. Linear waveform distortion caused by, e.g., chromatic dispersion (CD) is commonly compensated by linear equalizers using digital signal processing (DSP) in digital coherent receivers. However, mitigation of nonlinear waveform distortion is considered to be one of the next important issues. An ANN-based nonlinear equalizer is one possible candidate for solving this problem. However, the risk of overfitting of ANNs is one obstacle in using the technology in practical applications. We evaluated and compared the overfitting of ANN- and conventional VSTF-based nonlinear equalizers used to compensate for optical nonlinear distortion. The equalizers were trained on repeated random bit sequences (RRBSs), while varying the length of the bit sequences. When the number of hidden-layer units of the ANN was as large as 100 or 1000, the overfitting characteristics were comparable to those of the VSTF. However, when the number of hidden-layer units was 10, which is usually enough to compensate for optical nonlinear distortion, the overfitting was weaker than that of the VSTF. Furthermore, we confirmed that even commonly used finite impulse response (FIR) filters showed overfitting to the RRBS when the length of the RRBS was equal to or shorter than the length of the tapped delay line of the filters. Conversely, when the RRBS used for the training was sufficiently longer than the tapped delay line, the overfitting could be suppressed, even when using an ANN-based nonlinear equalizer with 10 hidden-layer units.
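
    One way to build intuition for the role of sequence length is a simple counting argument: a bit sequence that repeats with period P can present at most P distinct windows to a tapped delay line, however long the training run is. The sketch below only illustrates this counting bound and is not the paper's analysis; the helper names, the period values, and the 11-tap window are hypothetical.

```python
import random


def repeated_random_bits(period, total_length, seed=0):
    """Repeat one random block of `period` bits up to `total_length` bits."""
    rng = random.Random(seed)
    block = [rng.randint(0, 1) for _ in range(period)]
    return [block[i % period] for i in range(total_length)]


def distinct_windows(bits, taps):
    """Count the distinct length-`taps` windows a tapped delay line sees."""
    return len({tuple(bits[i:i + taps])
                for i in range(len(bits) - taps + 1)})


taps = 11  # hypothetical tapped-delay-line length
short_seq = repeated_random_bits(period=8, total_length=10_000)
long_seq = repeated_random_bits(period=4096, total_length=10_000)

short_count = distinct_windows(short_seq, taps)  # bounded by the period, 8
long_count = distinct_windows(long_seq, taps)    # bounded by 2 ** taps
```

    With so few distinct input patterns in the short-period case, even a small model can fit the training sequence itself rather than the channel, which is consistent with the overfitting reported for short RRBSs.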

  • Noise-Robust Scream Detection Using Wave-U-Net Open Access

    Noboru HAYASAKA  Riku KASAI  Takuya FUTAGAMI  

     
    LETTER

      Publicized:
    2023/10/05
      Vol:
    E107-A No:4
      Page(s):
    634-637

    In this paper, we propose a noise-robust scream detection method with the aim of expanding the scream detection system, a sound-based security system. The proposed method uses enhanced screams using Wave-U-Net, which was effective as a noise reduction method for noisy screams. However, the enhanced screams showed different frequency components from clean screams and erroneously emphasized frequency components similar to scream in noise. Therefore, Wave-U-Net was applied even in the process of training Gaussian mixture models, which are discriminators. We conducted detection experiments using the proposed method in various noise environments and determined that the false acceptance rate was reduced by an average of 2.1% or more compared with the conventional method.

  • Constraints and Evaluations on Signature Transmission Interval for Aggregate Signatures with Interactive Tracing Functionality Open Access

    Ryu ISHII  Kyosuke YAMASHITA  Zihao SONG  Yusuke SAKAI  Tadanori TERUYA  Takahiro MATSUDA  Goichiro HANAOKA  Kanta MATSUURA  Tsutomu MATSUMOTO  

     
    PAPER

      Publicized:
    2023/10/10
      Vol:
    E107-A No:4
      Page(s):
    619-633

    Fault-tolerant aggregate signature (FT-AS) is a special type of aggregate signature that is equipped with the functionality for tracing signers who generated invalid signatures in the case an aggregate signature is detected as invalid. In existing FT-AS schemes (whose tracing functionality requires multi-rounds), a verifier needs to send a feedback to an aggregator for efficiently tracing the invalid signer(s). However, in practice, if this feedback is not responded to the aggregator in a sufficiently fast and timely manner, the tracing process will fail. Therefore, it is important to estimate whether this feedback can be responded and received in time on a real system. In this work, we measure the total processing time required for the feedback by implementing an existing FT-AS scheme, and evaluate whether the scheme works without problems in real systems. Our experimental results show that the time required for the feedback is 605.3 ms for a typical parameter setting, which indicates that if the acceptable feedback time is significantly larger than a few hundred ms, the existing FT-AS scheme would effectively work in such systems. However, there are situations where such feedback time is not acceptable, in which case the existing FT-AS scheme cannot be used. Therefore, we further propose a novel FT-AS scheme that does not require any feedback. We also implement our new scheme and show that a feedback in this scheme is completely eliminated but the size of its aggregate signature (affecting the communication cost from the aggregator to the verifier) is 144.9 times larger than that of the existing FT-AS scheme (with feedbacks) for a typical parameter setting, and thus has a trade-off between the feedback waiting time and the communication cost from the verifier to the aggregator with the existing FT-AS scheme.

  • Power Analysis of Floating-Point Operations for Leakage Resistance Evaluation of Neural Network Model Parameters

    Hanae NOZAKI  Kazukuni KOBARA  

     
    PAPER

      Publicized:
    2023/09/25
      Vol:
    E107-A No:3
      Page(s):
    331-343

    In the field of machine learning security, as one of the attack surfaces especially for edge devices, the application of side-channel analysis such as correlation power/electromagnetic analysis (CPA/CEMA) is expanding. Aiming to evaluate the leakage resistance of neural network (NN) model parameters, i.e. weights and biases, we conducted a feasibility study of CPA/CEMA on floating-point (FP) operations, which are the basic operations of NNs. This paper proposes approaches to recover weights and biases using CPA/CEMA on multiplication and addition operations, respectively. It is essential to take into account the characteristics of the IEEE 754 representation in order to realize the recovery with high precision and efficiency. We show that CPA/CEMA on FP operations requires different approaches than traditional CPA/CEMA on cryptographic implementations such as the AES.
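
    To see why the IEEE 754 representation matters, it helps to look at the bit fields of a single-precision value. The sketch below splits a float32 into sign, exponent, and mantissa and computes a Hamming-weight leakage prediction for a guessed weight. The Hamming-weight model is a common assumption in correlation power analysis and is used here only for illustration; it is not taken from the paper, and the function names are hypothetical.

```python
import struct


def float32_bits(value):
    """Return the 32-bit IEEE 754 single-precision pattern as an int."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def float32_fields(value):
    """Split a float32 into (sign, biased exponent, mantissa) fields."""
    bits = float32_bits(value)
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # 8 bits, bias 127
    mantissa = bits & 0x7FFFFF      # 23 explicit fraction bits
    return sign, exponent, mantissa


def hamming_weight(bits):
    """Number of set bits in an integer."""
    return bin(bits).count("1")


def predicted_leakage(weight_guess, known_input):
    """Hamming weight of the float32 product (assumed leakage model).

    If both operands are exactly representable as float32, their product
    is exact in double precision, so rounding it once to float32 matches
    a correctly rounded single-precision multiplication.
    """
    return hamming_weight(float32_bits(weight_guess * known_input))


fields_one = float32_fields(1.0)        # sign 0, exponent 127, mantissa 0
fields_small = float32_fields(0.15625)  # 1.25 * 2**-3
leak = predicted_leakage(0.5, 2.0)      # product is 1.0
```

    In an attack setting, such predictions for many weight guesses would be correlated against measured traces; the separate sign, exponent, and mantissa fields suggest why the search differs from attacking a byte-oriented cipher such as AES.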

  • Ensemble Malware Classifier Considering PE Section Information

    Ren TAKEUCHI  Rikima MITSUHASHI  Masakatsu NISHIGAKI  Tetsushi OHKI  

     
    PAPER

      Publicized:
    2023/09/19
      Vol:
    E107-A No:3
      Page(s):
    306-318

    The war between cyber attackers and security analysts is gradually intensifying. Owing to the ease of obtaining and creating support tools, recent malware continues to diversify into variants and new species. This increases the burden on security analysts and hinders quick analysis. Identifying malware families is crucial for efficiently analyzing diversified malware; thus, numerous low-cost, general-purpose, deep-learning-based classification techniques have been proposed in recent years. Among these methods, malware images that represent binary features as images are often used. However, no models or architectures specific to malware classification have been proposed in previous studies. Herein, we conduct a detailed analysis of the behavior and structure of malware and focus on PE sections that capture the unique characteristics of malware. First, we validate the features of each PE section that can distinguish malware families. Then, we identify PE sections that contain adequate features to classify families. Further, we propose an ensemble learning-based classification method that combines features of highly discriminative PE sections to improve classification accuracy. The validation of two datasets confirms that the proposed method improves accuracy over the baseline, thereby emphasizing its importance.

  • Simultaneous Adaptation of Acoustic and Language Models for Emotional Speech Recognition Using Tweet Data

    Tetsuo KOSAKA  Kazuya SAEKI  Yoshitaka AIZAWA  Masaharu KATO  Takashi NOSE  

     
    PAPER

      Publicized:
    2023/12/05
      Vol:
    E107-D No:3
      Page(s):
    363-373

    Emotional speech recognition is generally considered more difficult than non-emotional speech recognition. The acoustic characteristics of emotional speech differ from those of non-emotional speech. Additionally, acoustic characteristics vary significantly depending on the type and intensity of emotions. Regarding linguistic features, emotional and colloquial expressions are also observed in their utterances. To solve these problems, we aim to improve recognition performance by adapting acoustic and language models to emotional speech. We used Japanese Twitter-based Emotional Speech (JTES) as an emotional speech corpus. This corpus consisted of tweets and had an emotional label assigned to each utterance. Corpus adaptation is possible using the utterances contained in this corpus. However, regarding the language model, the amount of adaptation data is insufficient. To solve this problem, we propose an adaptation of the language model by using online tweet data downloaded from the internet. The sentences used for adaptation were extracted from the tweet data based on certain rules. We extracted the data of 25.86 M words and used them for adaptation. In the recognition experiments, the baseline word error rate was 36.11%, whereas that with the acoustic and language model adaptation was 17.77%. The results demonstrated the effectiveness of the proposed method.

  • CMND: Consistent-Aware Multi-Server Network Design Model for Delay-Sensitive Applications

    Akio KAWABATA  Bijoy CHAND CHATTERJEE  Eiji OKI  

     
    PAPER-Network System

      Vol:
    E107-B No:3
      Page(s):
    321-329

    This paper proposes a network design model, considering data consistency for a delay-sensitive distributed processing system. The data consistency is determined by collating the own state and the states of slave servers. If the state is mismatched with other servers, the rollback process is initiated to modify the state to guarantee data consistency. In the proposed model, the selected servers and the master-slave server pairs are determined to minimize the end-to-end delay and the delay for data consistency. We formulate the proposed model as an integer linear programming problem. We evaluate the delay performance and computation time. We evaluate the proposed model in two network models with two, three, and four slave servers. The proposed model reduces the delay for data consistency by up to 31 percent compared to that of a typical model that collates the status of all servers at one master server. The computation time is a few seconds, which is an acceptable time for network design before service launch. These results indicate that the proposed model is effective for delay-sensitive applications.

  • Backdoor Attacks on Graph Neural Networks Trained with Data Augmentation

    Shingo YASHIKI  Chako TAKAHASHI  Koutarou SUZUKI  

     
    LETTER

      Publicized:
    2023/09/05
      Vol:
    E107-A No:3
      Page(s):
    355-358

    This paper investigates the effects of backdoor attacks on graph neural networks (GNNs) trained through simple data augmentation by modifying the edges of the graph in graph classification. The numerical results show that GNNs trained with data augmentation remain vulnerable to backdoor attacks and may even be more vulnerable to such attacks than GNNs without data augmentation.

  • BRsyn-Caps: Chinese Text Classification Using Capsule Network Based on Bert and Dependency Syntax

    Jie LUO  Chengwan HE  Hongwei LUO  

     
    PAPER-Natural Language Processing

      Publicized:
    2023/11/06
      Vol:
    E107-D No:2
      Page(s):
    212-219

    Text classification is a fundamental task in natural language processing, which finds extensive applications in various domains, such as spam detection and sentiment analysis. Syntactic information can be effectively utilized to improve the performance of neural network models in understanding the semantics of text. The Chinese text exhibits a high degree of syntactic complexity, with individual words often possessing multiple parts of speech. In this paper, we propose BRsyn-caps, a capsule network-based Chinese text classification model that leverages both Bert and dependency syntax. Our proposed approach integrates semantic information through Bert pre-training model for obtaining word representations, extracts contextual information through Long Short-term memory neural network (LSTM), encodes syntactic dependency trees through graph attention neural network, and utilizes capsule network to effectively integrate features for text classification. Additionally, we propose a character-level syntactic dependency tree adjacency matrix construction algorithm, which can introduce syntactic information into character-level representation. Experiments on five datasets demonstrate that BRsyn-caps can effectively integrate semantic, sequential, and syntactic information in text, proving the effectiveness of our proposed method for Chinese text classification.

  • Content-Adaptive Optimization Framework for Universal Deep Image Compression

    Koki TSUBOTA  Kiyoharu AIZAWA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2023/10/24
      Vol:
    E107-D No:2
      Page(s):
    201-211

    While deep image compression performs better than traditional codecs like JPEG on natural images, it faces a challenge as a learning-based approach: compression performance drastically decreases for out-of-domain images. To investigate this problem, we introduce a novel task that we call universal deep image compression, which involves compressing images in arbitrary domains, such as natural images, line drawings, and comics. Furthermore, we propose a content-adaptive optimization framework to tackle this task. This framework adapts a pre-trained compression model to each target image during testing for addressing the domain gap between pre-training and testing. For each input image, we insert adapters into the decoder of the model and optimize the latent representation extracted by the encoder and the adapter parameters in terms of rate-distortion, with the adapter parameters transmitted per image. To achieve the evaluation of the proposed universal deep compression, we constructed a benchmark dataset containing uncompressed images of four domains: natural images, line drawings, comics, and vector arts. We compare our proposed method with non-adaptive and existing adaptive compression methods, and the results show that our method outperforms them. Our code and dataset are publicly available at https://github.com/kktsubota/universal-dic.

  • Capacitive Wireless Power Transfer System with Misalignment Tolerance in Flowing Freshwater Environments

    Yasumasa NAKA  Akihiko ISHIWATA  Masaya TAMURA  

     
    PAPER-Electromagnetic Theory

      Publicized:
    2023/08/01
      Vol:
    E107-C No:2
      Page(s):
    47-56

    The misalignment of a coupler is a significant issue for capacitive wireless power transfer (WPT). This paper presents a capacitive WPT system specifically designed for underwater drones operating in flowing freshwater environments. The primary design features include a capacitive coupler with an opposite relative position between feeding and receiving points on the coupler electrode, two phase compensation circuits, and a load-independent inverter. A stable and energy-efficient power transmission is achieved by maintaining a 90° phase difference on the coupler electrode in dielectrics with a large unloaded quality factor (Q factor), such as freshwater. Although a 622-mm coupler electrode is required at 13.56 MHz, the phase compensation circuits can reduce this length to, for example, 250 mm, which is mountable on small underwater drones. Furthermore, the electricity waste is automatically reduced using the constant-current (CC) output inverter in the event of misalignment where efficiency drops occur. Finally, their functions are simulated and demonstrated at various receiver positions and transfer distances in tap water.

  • Interdigital and Multi-Via Structures for Mushroom-Type Metasurface Reflectors

    Taisei URAKAMI  Tamami MARUYAMA  Shimpei NISHIYAMA  Manato KUSAMIZU  Akira ONO  Takahiro SHIOZAWA  

     
    PAPER-Antennas and Propagation

      Vol:
    E107-B No:2
      Page(s):
    309-320

    The novel patch element shapes with the interdigital and multi-via structures for mushroom-type metasurface reflectors are proposed for controlling the reflection phases. The interdigital structure provides a wide reflection phase range by changing the depth of the interdigital fingers. In addition, the multi-via structure provides the higher positive reflection phases such as near +180°. The sufficient reflection phase range of 360° and the low polarization dependent properties could be confirmed by the electromagnetic field simulation. The metasurface reflector for the normal incident plane wave was designed. The desired reflection angles and sharp far field patterns of the reflected beams could be confirmed in the simulation results. The prototype reflectors for the experiments should be designed in the same way as the primary reflector design of the reflector antenna. Specifically, the reflector design method based on the ray tracing method using the incident wave phase was proposed for the prototype. The experimental radiation pattern for the reflector antenna composed of the transmitting antenna (TX) and the prototype metasurface reflector was similar to the simulated radiation pattern. The effectiveness of the proposed structures and their design methods could be confirmed by these simulation and experiment results.

  • An Adaptive Energy-Efficient Uneven Clustering Routing Protocol for WSNs

    Mingyu LI  Jihang YIN  Yonggang XU  Gang HUA  Nian XU  

     
    PAPER-Network

      Vol:
    E107-B No:2
      Page(s):
    296-308

    Aiming at the problem of “energy hole” caused by random distribution of nodes in large-scale wireless sensor networks (WSNs), this paper proposes an adaptive energy-efficient balanced uneven clustering routing protocol (AEBUC) for WSNs. The competition radius is adaptively adjusted based on the node density and the distance from candidate cluster head (CH) to base station (BS) to achieve scale-controlled adaptive optimal clustering; in candidate CHs, the energy relative density and candidate CH relative density are comprehensively considered to achieve dynamic CH selection. In the inter-cluster communication, based on the principle of energy balance, the relay communication cost function is established and combined with the minimum spanning tree method to realize the optimized inter-cluster multi-hop routing, forming an efficient communication routing tree. The experimental results show that the protocol effectively saves network energy, significantly extends network lifetime, and better solves the “energy hole” problem.
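
    The inter-cluster routing tree can be pictured with a minimum spanning tree rooted at the base station. The sketch below uses Prim's algorithm with squared Euclidean distance as a stand-in relay cost; the actual AEBUC cost function is not given in the abstract, so that choice, the function names, and the node coordinates are assumptions for illustration.

```python
def relay_cost(a, b):
    """Stand-in relay cost: squared Euclidean distance (an assumption)."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def build_routing_tree(base_station, cluster_heads):
    """Grow a minimum spanning tree from the base station (Prim's algorithm).

    cluster_heads maps a name to an (x, y) position; the name "BS" is
    reserved for the base station. Returns {cluster_head: next_hop}.
    """
    positions = dict(cluster_heads)
    positions["BS"] = base_station
    next_hop = {}
    in_tree = ["BS"]
    remaining = set(cluster_heads)
    while remaining:
        # Cheapest link from any node outside the tree to any node inside it.
        _, child, parent = min(
            (relay_cost(positions[c], positions[p]), c, p)
            for c in remaining
            for p in in_tree
        )
        next_hop[child] = parent
        in_tree.append(child)
        remaining.remove(child)
    return next_hop


# Hypothetical layout: base station at the origin and three cluster heads.
heads = {"A": (10, 0), "B": (20, 0), "C": (10, 10)}
routes = build_routing_tree((0, 0), heads)
```

    Here only the cluster head closest to the base station transmits to it directly, while the others relay through that head, which is the multi-hop pattern a routing tree of this kind produces.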

  • Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention

    Peng GAO  Xin-Yue ZHANG  Xiao-Li YANG  Jian-Cheng NI  Fei WANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2023/10/20
      Vol:
    E107-D No:1
      Page(s):
    161-164

    Although Siamese trackers have attracted much attention in recent years due to their scalability and efficiency, researchers have ignored the background appearance, which makes the trackers inapplicable to recognizing arbitrary target objects with various variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker, where the shifted windows multi-head self-attention is used to learn the characteristics of a specific given target object for visual tracking. To validate the effectiveness of our proposed tracker, we use the Swin Transformer as the backbone network and introduce an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.

  • A CNN-Based Multi-Scale Pooling Strategy for Acoustic Scene Classification

    Rong HUANG  Yue XIE  

     
    LETTER-Speech and Hearing

      Publicized:
    2023/10/17
      Vol:
    E107-D No:1
      Page(s):
    153-156

    Acoustic scene classification (ASC) is a fundamental domain within the realm of artificial intelligence classification tasks. ASC-based tasks commonly employ models based on convolutional neural networks (CNNs) that utilize log-Mel spectrograms as input for gathering acoustic features. In this paper, we designed a CNN-based multi-scale pooling (MSP) strategy for ASC. The log-Mel spectrograms are utilized as the input to CNN, which is partitioned into four frequency axis segments. Furthermore, we devised four CNN channels to acquire inputs from distinct frequency ranges. The high-level features extracted from outputs in various frequency bands are integrated through frequency pyramid average pooling layers at multiple levels. Subsequently, a softmax classifier is employed to classify different scenes. Our study demonstrates that the implementation of our designed model leads to a significant enhancement in the model's performance, as evidenced by the testing of two acoustic datasets.

  • Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection

    Shinji UCHINOURA  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/10/23
      Vol:
    E107-D No:1
      Page(s):
    115-124

    We investigated the influence of horizontal shifts of the input images on one-stage object detection methods. We found that the object detector class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by the grid-level shifts and are used in training. The effectiveness of the proposed approaches is demonstrated using the COCO validation set after applying the proposed method to the FCOS architecture.
