The search functionality is under construction.

Keyword Search Result

[Keyword] Ti (30728 hits)

Results 561-580 of 30728 hits

  • Surface Defect Image Classification of Lithium Battery Pole Piece Based on Deep Learning

    Weisheng MAO  Linsheng LI  Yifan TAO  Wenyi ZHOU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/06/12
      Vol:
    E106-D No:9
      Page(s):
    1546-1555

    To address the low accuracy of traditional methods in classifying surface defects of lithium battery pole pieces, this paper proposes a deep-learning-based image classification algorithm for such defects. First, wavelet thresholding and histogram equalization are used to preprocess the defect image, weakening the influence of noise in non-defect regions and enhancing defect features. Second, a VGG-InceptionV2 network with better performance is proposed by adding the InceptionV2 structure to the improved VGG network structure. Then the original data set is expanded by rotation, flipping, and contrast adjustment, and the optimal values of the model hyperparameters are determined by experiments. Finally, the model is compared with VGG16 and GoogLeNet on the recognition of defect types. The results show that the model achieves an accuracy of 98.75% on surface defects of lithium battery pole pieces with only 1.7M parameters, which is of practical significance for classifying such defects in industry.
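
The data set expansion mentioned in this abstract (rotation, flipping, contrast adjustment) can be illustrated with a minimal pure-Python sketch. The function names, the 90-degree rotation angle, the contrast factor of 1.5, and the 8-bit grayscale representation are all assumptions made for illustration; the abstract does not state the authors' actual settings.

```python
# Illustrative augmentation on an 8-bit grayscale image stored as a list of rows.
# All parameter values here are assumptions, not taken from the paper.

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def adjust_contrast(img, factor, mid=128):
    """Scale each pixel's distance from mid-gray, clamping to 0..255."""
    return [[min(255, max(0, round(mid + factor * (p - mid)))) for p in row]
            for row in img]

def augment(img):
    """Return the original image plus three augmented variants."""
    return [img, rotate90(img), flip_horizontal(img), adjust_contrast(img, 1.5)]
```

Each input image yields four samples here; a real pipeline would typically randomize the angle and contrast factor.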

  • Shadow Detection Based on Luminance-LiDAR Intensity Uncorrelation

    Shogo SATO  Yasuhiro YAO  Taiga YOSHIDA  Shingo ANDO  Jun SHIMAMURA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/06/20
      Vol:
    E106-D No:9
      Page(s):
    1556-1563

    In recent years, there has been a growing demand for urban digitization using cameras and light detection and ranging (LiDAR). Shadows are the condition that affects measurement the most; therefore, shadow detection technology is essential. In this study, we propose a shadow detection method utilizing LiDAR intensity, which depends on the surface properties of objects but not on irradiation from other light sources. Unlike conventional LiDAR-intensity-aided shadow detection methods, our method embeds the uncorrelation between luminance and LiDAR intensity at each position into the optimization. The energy, which is defined by the uncorrelation between luminance and LiDAR intensity at each position, is minimized by graph-cut segmentation to detect shadows. In evaluations on the KITTI and Waymo datasets, our shadow-detection method outperformed previous methods in terms of multiple evaluation indices.

  • A Lightweight and Efficient Infrared Pedestrian Semantic Segmentation Method

    Shangdong LIU  Chaojun MEI  Shuai YOU  Xiaoliang YAO  Fei WU  Yimu JI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/06/13
      Vol:
    E106-D No:9
      Page(s):
    1564-1571

    Thermal imaging pedestrian segmentation systems perform well under different illumination conditions, but they have some drawbacks (e.g., weak pedestrian texture information, blurred object boundaries). Meanwhile, high-performance large models have higher latency on edge devices with limited computing performance. To solve these problems, we propose a real-time thermal infrared pedestrian segmentation method. The feature extraction layers of our method consist of two paths. On the spatial path, we use lossless spatial downsampling to obtain boundary texture details. On the context path, we use atrous convolutions to enlarge the receptive field and obtain more contextual semantic information. Then, a parameter-free attention mechanism is introduced at the end of each path for effective feature selection. The Feature Fusion Module (FFM) is added to fuse the semantic information of the two paths after selection. Finally, we accelerate inference through multi-threading techniques on the edge computing device. In addition, we create a high-quality infrared pedestrian segmentation dataset to facilitate research. Comparative experiments against other methods on the self-built dataset and two public datasets indicate that our method is effective. Our code is available at https://github.com/mcjcs001/LEIPNet.

  • Siamese Transformer for Saliency Prediction Based on Multi-Prior Enhancement and Cross-Modal Attention Collaboration

    Fazhan YANG  Xingge GUO  Song LIANG  Peipei ZHAO  Shanhua LI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/06/20
      Vol:
    E106-D No:9
      Page(s):
    1572-1583

    Visual saliency prediction has improved dramatically since the advent of convolutional neural networks (CNNs). Although CNNs achieve excellent performance, they still cannot learn global and long-range contextual information well and lack interpretability due to the locality of convolution operations. We propose a saliency prediction model based on multi-prior enhancement and cross-modal attention collaboration (ME-CAS). Concretely, we design a transformer-based Siamese network architecture as the backbone for feature extraction. One of the transformer branches captures the context information of the image under the self-attention mechanism to obtain a global saliency map. At the same time, we build a prior learning module to learn the human visual center bias prior, contrast prior, and frequency prior. The multi-prior is input to the other Siamese branch to learn the detailed features of the underlying visual features and obtain the saliency map of local information. Finally, we use an attention calibration module to guide the cross-modal collaborative learning of global and local information and generate the final saliency map. Extensive experimental results demonstrate that the proposed ME-CAS achieves superior results on public benchmarks compared with competing saliency prediction models. Moreover, the multi-prior learning modules enhance both the expression of salient image details and model interpretability.

  • Discriminative Question Answering via Cascade Prompt Learning and Sentence Level Attention Mechanism

    Xiaoguang YUAN  Chaofan DAI  Zongkai TIAN  Xinyu FAN  Yingyi SONG  Zengwen YU  Peng WANG  Wenjun KE  

     
    PAPER-Natural Language Processing

      Publicized:
    2023/06/02
      Vol:
    E106-D No:9
      Page(s):
    1584-1599

    Question answering (QA) systems are designed to answer questions based on given information or with the help of external information. Recent advances in QA systems are overwhelmingly driven by deep learning techniques, which have been employed in a wide range of fields such as finance, sports and biomedicine. For generative QA in open-domain QA, although deep learning can leverage massive data to learn meaningful feature representations and generate free text as answers, it remains difficult to limit the length and content of answers. To alleviate this problem, we focus on YNQA, a variant of generative QA, and propose a model called CasATT (cascade prompt learning framework with the sentence-level attention mechanism). In CasATT, we excavate text semantic information from the document level to the sentence level, mine evidence accurately from large-scale documents by retrieval and ranking, and answer questions with ranked candidates by discriminative question answering. Our experiments on several datasets demonstrate the superior performance of CasATT over state-of-the-art baselines, with accuracy reaching 93.1% on the IR&QA Competition dataset and 90.5% on the BoolQ dataset.

  • A Method to Detect Chorus Sections in Lyrics Text

    Kento WATANABE  Masataka GOTO  

     
    PAPER-Music Information Processing

      Publicized:
    2023/06/02
      Vol:
    E106-D No:9
      Page(s):
    1600-1609

    This paper addresses the novel task of detecting chorus sections in English and Japanese lyrics text. Although chorus-section detection using audio signals has been studied, whether chorus sections can be detected from text-only lyrics is an open issue. Another open issue is whether patterns of repeating lyric lines such as those appearing in chorus sections depend on language. To investigate these issues, we propose a neural-network-based model for sequence labeling. It can learn phrase repetition and linguistic features to detect chorus sections in lyrics text. It is, however, difficult to train this model since there was no dataset of lyrics with chorus-section annotations as there was no prior work on this task. We therefore generate a large amount of training data with such annotations by leveraging pairs of musical audio signals and their corresponding manually time-aligned lyrics; we first automatically detect chorus sections from the audio signals and then use their temporal positions to transfer them to the line-level chorus-section annotations for the lyrics. Experimental results show that the proposed model with the generated data contributes to detecting the chorus sections, that the model trained on Japanese lyrics can detect chorus sections surprisingly well in English lyrics, and that patterns of repeating lyric lines are language-independent.

  • Reconfigurable Pedestrian Detection System Using Deep Learning for Video Surveillance

    M.K. JEEVARAJAN  P. NIRMAL KUMAR  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2023/06/09
      Vol:
    E106-D No:9
      Page(s):
    1610-1614

    We present a reconfigurable deep learning pedestrian detection system for surveillance that detects people with shadows under different lighting and heavily occluded conditions. This work proposes a region-based CNN, combined with CMOS and thermal cameras, to obtain human features even under poor lighting conditions. The main advantage of a reconfigurable system over processor-based systems is its high performance and parallelism when processing large amounts of data such as video frames. We discuss the details of the hardware implementation of the proposed real-time pedestrian detection algorithm on a Zynq FPGA. Simulation results show that the proposed integrated approach of R-CNN architecture with cameras provides better performance in terms of accuracy, precision, and F1-score. The performance on the Zynq FPGA was compared to other works, which showed that the proposed architecture is a good trade-off in terms of quality, accuracy, speed, and resource utilization.

  • Multiple Layout Design Generation via a GAN-Based Method with Conditional Convolution and Attention

    Xing ZHU  Yuxuan LIU  Lingyu LIANG  Tao WANG  Zuoyong LI  Qiaoming DENG  Yubo LIU  

     
    LETTER-Computer Graphics

      Publicized:
    2023/06/12
      Vol:
    E106-D No:9
      Page(s):
    1615-1619

    Recently, many AI-aided layout design systems based on deep learning have been developed to reduce tedious manual intervention. However, most methods focus on a specific generation task. This paper explores the challenging problem of multiple layout design generation (LDG), which generates a floor plan or urban plan from a boundary input under a unified framework. One of the main challenges of multiple LDG is to obtain reasonable topological structures of layout generation with irregular boundaries and layout elements for different types of design. This paper formulates the multiple LDG task as an image-to-image translation problem, and proposes a conditional generative adversarial network (GAN), called LDGAN, with adaptive modules. The framework of LDGAN is based on a generator-discriminator architecture, where the generator is integrated with conditional convolution constrained by the boundary input and the attention module with channel and spatial features. Qualitative and quantitative experiments were conducted on the SCUT-AutoALP and RPLAN datasets, and the comparison with state-of-the-art methods illustrates the effectiveness and superiority of the proposed LDGAN.

  • A Unified Design of Generalized Moreau Enhancement Matrix for Sparsity Aware LiGME Models

    Yang CHEN  Masao YAMAGISHI  Isao YAMADA  

     
    PAPER-Digital Signal Processing

      Publicized:
    2023/02/14
      Vol:
    E106-A No:8
      Page(s):
    1025-1036

    In this paper, we propose a unified algebraic design of the generalized Moreau enhancement matrix (GME matrix) for the Linearly involved Generalized-Moreau-Enhanced (LiGME) model. The LiGME model has been established as a framework to construct linearly involved nonconvex regularizers for sparsity (or low-rank) aware estimation, where the design of the GME matrix is key to guaranteeing the overall convexity of the model. The proposed design is applicable to general linear operators involved in the regularizer of the LiGME model, and does not require any eigendecomposition or iterative computation. We also present an application of the LiGME model with the proposed GME matrix to a group sparsity aware least squares estimation problem. Numerical experiments demonstrate the effectiveness of the proposed GME matrix in the LiGME model.

  • Dual Cuckoo Filter with a Low False Positive Rate for Deep Packet Inspection

    Yixuan ZHANG  Meiting XUE  Huan ZHANG  Shubiao LIU  Bei ZHAO  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2023/01/26
      Vol:
    E106-A No:8
      Page(s):
    1037-1042

    Network traffic control and classification have become increasingly dependent on deep packet inspection (DPI) approaches, which are the most precise techniques for intrusion detection and prevention. However, increasing traffic volumes and link speeds exert considerable pressure on DPI techniques to process packets with high performance in restricted available memory. To overcome this problem, we propose the dual cuckoo filter (DCF), a data structure based on the cuckoo filter (CF). The CF can be extended to a parallel mode called the parallel cuckoo filter (PCF). The proposed data structure employs an extra hash function to obtain two potential indices of entries. The DCF magnifies the superiority of the CF with no additional memory. Moreover, it can be extended to the parallel mode, resulting in a data structure referred to as the parallel dual cuckoo filter (PDCF). The implementation results show that using the DCF and PDCF as identification tools in a DPI system results in time improvements of up to 2% and 30% over the CF and PCF, respectively.
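
As background for the "two potential indices" mentioned in this abstract, the following minimal sketch shows how a standard cuckoo filter derives a fingerprint and two candidate bucket indices via partial-key cuckoo hashing. It is not the DCF itself, whose extra hash function is not specified in the abstract; the bucket count, fingerprint width, and use of SHA-256 are assumptions chosen for illustration.

```python
import hashlib

NUM_BUCKETS = 1 << 10   # assumed size; a power of two keeps the XOR result in range
FP_BITS = 8             # assumed fingerprint width

def _h(data: bytes) -> int:
    """64-bit hash derived from SHA-256 (an illustrative choice of hash)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def fingerprint(item: bytes) -> int:
    """Short fingerprint of the item; 0 is reserved to mean 'empty slot'."""
    fp = _h(b"fp" + item) & ((1 << FP_BITS) - 1)
    return fp or 1

def alternate_index(i: int, fp: int) -> int:
    """Other candidate bucket, computable from the fingerprint alone."""
    return (i ^ _h(fp.to_bytes(1, "big"))) % NUM_BUCKETS

def candidate_indices(item: bytes):
    """Return (fingerprint, first bucket index, second bucket index)."""
    fp = fingerprint(item)
    i1 = _h(item) % NUM_BUCKETS
    return fp, i1, alternate_index(i1, fp)
```

Because the second index is the first XORed with a hash of the fingerprint, an entry can be relocated between its two buckets without access to the original item, which is what makes cuckoo eviction possible in a filter that stores only fingerprints.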

  • LFWS: Long-Operation First Warp Scheduling Algorithm to Effectively Hide the Latency for GPUs

    Song LIU  Jie MA  Chenyu ZHAO  Xinhe WAN  Weiguo WU  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2023/02/10
      Vol:
    E106-A No:8
      Page(s):
    1043-1050

    GPUs have become the dominant computing units for meeting high-performance needs in various computational fields. However, long operation latency causes underutilization of on-chip computing resources, resulting in performance degradation when running parallel tasks on GPUs. A good warp scheduling strategy is an effective way to hide latency and improve resource utilization. However, most current warp scheduling algorithms on GPUs ignore the ability of long operations to hide latency. In this paper, we propose a long-operation-first warp scheduling algorithm, LFWS, for GPU platforms. The LFWS filters warps in the ready state into a ready queue and updates the queue promptly according to changes in warp status. The LFWS divides the warps in the ready queue into long and short operation groups based on the type of operations in their instruction buffers, and it gives higher priority to the long-operation warps in the ready queue. This effectively uses the long operations to hide some of each other's latency and enhances the system's ability to hide latency. To verify the effectiveness of the LFWS, we implement the algorithm on the GPGPU-Sim simulation platform. Experiments are conducted over various CUDA applications to evaluate the performance of the LFWS algorithm against five other warp scheduling algorithms. The results show that the LFWS algorithm achieves an average performance improvement of 8.01% and 5.09%, respectively, over three traditional and two novel warp scheduling algorithms, effectively improving computational resource utilization on GPUs.
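
The grouping step described in this abstract (ready warps split into long and short operation groups, with long-operation warps given priority) can be sketched roughly as follows. The `Warp` fields, the set of opcodes treated as long operations, and the tie-breaking by original order are hypothetical choices for illustration; the abstract does not define them.

```python
from dataclasses import dataclass

# Hypothetical set of opcodes treated as "long" operations (an assumption).
LONG_OPS = {"ld.global", "st.global", "div", "sqrt"}

@dataclass
class Warp:
    warp_id: int
    ready: bool
    next_op: str  # opcode at the head of the warp's instruction buffer

def schedule_order(warps):
    """Return ready warps with the long-operation group ahead of the short group."""
    ready_queue = [w for w in warps if w.ready]
    long_group = [w for w in ready_queue if w.next_op in LONG_OPS]
    short_group = [w for w in ready_queue if w.next_op not in LONG_OPS]
    return long_group + short_group
```

Issuing long operations first gives them more cycles to overlap with the short operations that follow, which is the latency-hiding effect the abstract describes.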

  • Construction of Singleton-Type Optimal LRCs from Existing LRCs and Near-MDS Codes

    Qiang FU  Buhong WANG  Ruihu LI  Ruipan YANG  

     
    PAPER-Coding Theory

      Publicized:
    2023/01/31
      Vol:
    E106-A No:8
      Page(s):
    1051-1056

    Modern large-scale distributed storage systems play a central role in data centers and cloud storage, where node failures are common. The data lost on a failed node must be recovered efficiently. Locally repairable codes (LRCs) are designed to solve this problem. The locality of an LRC is the number of nodes that participate in recovering the data lost in a node failure, which characterizes the repair efficiency. An LRC is called optimal if its minimum distance attains the Singleton-type upper bound [1]. In this paper, using basic techniques of linear algebra over finite fields, infinitely many optimal LRCs over extension fields are derived from a given optimal LRC over a base field (or small field). Next, this paper investigates the relation between near-MDS codes with some constraints and LRCs, and proposes an algorithm to determine the locality of the dual of a given linear code. Finally, optimal LRCs obtained from near-MDS codes and the proposed algorithm are presented.

  • An Integrated Convolutional Neural Network with a Fusion Attention Mechanism for Acoustic Scene Classification

    Pengxu JIANG  Yue XIE  Cairong ZOU  Li ZHAO  Qingyun WANG  

     
    LETTER-Engineering Acoustics

      Publicized:
    2023/02/06
      Vol:
    E106-A No:8
      Page(s):
    1057-1061

    In human-computer interaction, acoustic scene classification (ASC) is one of the relevant research domains. In real life, the recorded audio may include a lot of noise and quiet clips, making it hard for earlier ASC-based research to isolate the crucial scene information in sound. Furthermore, scene information may be scattered across numerous audio frames; hence, selecting scene-related frames is crucial for ASC. In this context, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. Firstly, segmented mel-spectrograms as the input of ICNN can assist the model in learning the short-term time-frequency correlation information. Then, the designed ICNN model is employed to learn these segment-level features. In addition, the proposed global attention layer may gather global information by integrating these segment features. Finally, the developed fusion attention layer is utilized to fuse all segment-level features while the classifier classifies various situations. Experimental findings using ASC datasets from DCASE 2018 and 2019 indicate the efficacy of the suggested method.

  • Low-Cost Learning-Based Path Loss Estimation Using Correlation Graph CNN

    Keita IMAIZUMI  Koichi ICHIGE  Tatsuya NAGAO  Takahiro HAYASHI  

     
    LETTER-Communication Theory and Signals

      Publicized:
    2023/01/26
      Vol:
    E106-A No:8
      Page(s):
    1072-1076

    In this paper, we propose a method for predicting radio wave propagation using a correlation graph convolutional neural network (C-Graph CNN). We examine what kind of parameters are suitable to be used as system parameters in C-Graph CNN. Performance of the proposed method is evaluated by the path loss estimation accuracy and the computational cost through simulation.

  • New Bounds on the Partial Hamming Correlation of Wide-Gap Frequency-Hopping Sequences with Frequency Shift

    Qianhui WEI  Zengqing LI  Hongyu HAN  Hanzhou WU  

     
    LETTER-Spread Spectrum Technologies and Applications

      Publicized:
    2023/01/23
      Vol:
    E106-A No:8
      Page(s):
    1077-1080

    In frequency-hopping communication, time delay and Doppler shift cause interference. As interference becomes increasingly complicated, this paper discusses the time-frequency two-dimensional (TFTD) partial Hamming correlation (PHC) properties of wide-gap frequency-hopping sequences (WGFHSs) with frequency shift. A bound on the maximum TFTD partial Hamming auto-correlation (PHAC) and two bounds on the maximum TFTD PHC of WGFHSs are obtained. The Li-Fan-Yang bounds are particular cases of the new bounds when the frequency shift is zero.

  • Signal Detection for OTFS System Based on Improved Particle Swarm Optimization

    Jurong BAI  Lin LAN  Zhaoyang SONG  Huimin DU  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2023/02/16
      Vol:
    E106-B No:8
      Page(s):
    614-621

    The orthogonal time frequency space (OTFS) technique proposed in recent years has excellent robustness to Doppler frequency shift and time delay, enabling its application in high-speed communication scenarios. In this article, a particle swarm optimization (PSO) signal detection algorithm for OTFS systems is proposed. An adaptive mechanism is designed for the individual learning factor and global learning factor in the velocity formula of the algorithm, and the position update method of the particles is improved, so as to increase the convergence accuracy and prevent the particles from falling into local optima. The simulation results show that the improved PSO algorithm achieves a lower bit error rate (BER) and higher convergence accuracy than the traditional PSO algorithm, and performs similarly to the ideal maximum likelihood (ML) detection algorithm at lower complexity. Under high Doppler shift, OTFS with the improved PSO algorithm outperforms orthogonal frequency division multiplexing (OFDM).
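
For readers unfamiliar with the learning factors mentioned in this abstract, the sketch below shows a textbook PSO velocity and position update in which the individual factor `c1` and global factor `c2` vary over the iterations. The linear schedule, the inertia weight, the continuous search space, and the sphere test function are all assumptions for illustration; the abstract does not give the authors' adaptive rule or position update, and their detector works on OTFS symbols rather than this toy problem.

```python
import random

def sphere(x):
    """Toy cost function with its minimum at the origin."""
    return sum(v * v for v in x)

def pso(cost, dim, n_particles=20, iters=50, w=0.7, seed=0):
    """Minimize cost; returns (initial best cost, final best cost, best position)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    initial_cost = gbest_cost
    for t in range(iters):
        frac = t / max(iters - 1, 1)
        c1 = 2.5 - 2.0 * frac  # individual learning factor (assumed schedule)
        c2 = 0.5 + 2.0 * frac  # global learning factor (assumed schedule)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return initial_cost, gbest_cost, gbest
```

A large `c1` early on favors each particle's own exploration, while a growing `c2` later pulls the swarm toward the best known position; this is one common way to trade exploration against convergence.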

  • Intrusion Detection Model of Internet of Things Based on LightGBM (Open Access)

    Guosheng ZHAO  Yang WANG  Jian WANG  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2023/02/20
      Vol:
    E106-B No:8
      Page(s):
    622-634

    Internet of Things (IoT) devices are widely used in various fields. However, their limited computing resources make them extremely vulnerable and difficult to protect effectively. Traditional intrusion detection systems (IDS) focus on high accuracy and a low false alarm rate (FAR), which often makes their spatiotemporal complexity too high for deployment on IoT devices. In response to these problems, this paper proposes an intrusion detection model for IoT based on the light gradient boosting machine (LightGBM). First, a one-dimensional convolutional neural network (CNN) is used to extract features from network traffic to reduce the feature dimensions. Then, LightGBM is used for classification to detect the type to which the network traffic belongs. LightGBM inherits the advantages of the gradient boosting tree while being more lightweight, and it has a faster decision tree construction process. Experiments on the TON-IoT and BoT-IoT datasets show that the proposed model performs better and is more lightweight than the comparison models. The proposed model shortens the prediction time by 90.66% and is better than the comparison models in accuracy and other performance metrics. The proposed model has strong detection capability for denial of service (DoS) and distributed denial of service (DDoS) attacks. Experimental results on a testbed built with IoT devices such as Raspberry Pi show that the proposed model can perform effective, real-time intrusion detection on IoT devices.

  • Threshold Based D-SCFlip Decoding of Polar Codes

    Desheng WANG  Jihang YIN  Yonggang XU  Xuan YANG  Gang HUA  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2023/02/06
      Vol:
    E106-B No:8
      Page(s):
    635-644

    The decoders, which improve the error-correction performance by finding and correcting the error bits caused by channel noise, are a hotspot for polar codes. In this paper, we present a threshold based D-SCFlip (TD-SCFlip) decoder with two improvements based on the D-SCFlip decoder. First, we propose the LLR fidelity criterion to define the LLR threshold and investigate confidence probability to calculate the LLR threshold indirectly. The information bits whose LLR values are smaller than the LLR threshold will be excluded from the range of candidate bits, which reduces the complexity of constructing the flip-bits list without the loss of error-correction performance. Second, we improve the calculation method for flip-bits metric with two perturbation parameters, which locates the channel-induced error bits faster, thus improving the error-correction performance. Then, TD-SCFlip-ω decoder is also proposed, which is limited to correcting up to ω bits in each extra decoding attempt. Simulation results show that the TD-SCFlip decoding is slightly better than the D-SCFlip decoding in terms of error-correction performance and decoding complexity, while the error-correction performance of TD-SCFlip-ω decoding is comparable to that of D-SCFlip-ω decoding but with lower decoding complexity.

  • Development of a Simple and Lightweight Phantom for Evaluating Human Body Avoidance Technology in Microwave Wireless Power Transfer (Open Access)

    Kazuki SATO  Kazuyuki SAITO  

     
    PAPER-Energy in Electronics Communications

      Publicized:
    2023/02/15
      Vol:
    E106-B No:8
      Page(s):
    645-651

    In recent years, microwave wireless power transfer (WPT) has attracted considerable attention due to the increasing demand for various sensors and Internet of Things (IoT) applications. Microwave WPT requires technology that can detect and avoid human bodies in the transmission path. Using a phantom is essential for developing such technology in terms of standardization and human body protection from electromagnetic radiation. In this study, a simple and lightweight phantom was developed focusing on its radar cross-section (RCS) to evaluate human body avoidance technology for use in microwave WPT systems. The developed phantom's RCS is comparable to that of the human body.

  • Level Allocation of Four-Level Pulse-Amplitude Modulation Signal in Optically Pre-Amplified Receiver Systems

    Hiroki KAWAHARA  Koji IGARASHI  Kyo INOUE  

     
    PAPER-Fiber-Optic Transmission for Communications

      Publicized:
    2023/02/03
      Vol:
    E106-B No:8
      Page(s):
    652-659

    This study numerically investigates the symbol-level allocation of four-level pulse-amplitude modulation (PAM4) signals for optically pre-amplified receiver systems. Three level-allocation schemes are examined: intensity-equispaced, amplitude-equispaced, and numerically optimized. Numerical simulations are conducted to comprehensively compare the receiver sensitivities for these level-allocation schemes under various system conditions. The results show that the superiority or inferiority between the level allocations is significantly dependent on the system conditions of the bandwidth of amplified spontaneous emission light, modulation bandwidth, and signal extinction ratio (ER). The mechanisms underlying these dependencies are also discussed.
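
Two of the level-allocation schemes named in this abstract can be made concrete with a short sketch. It assumes that "intensity-equispaced" means the four optical power levels are equally spaced, that "amplitude-equispaced" means the square roots of the power levels are equally spaced, and that the extinction ratio is the linear ratio of the highest to the lowest power level with the highest normalized to 1. These readings are assumptions, since the abstract does not define the terms; the numerically optimized scheme is not reproduced.

```python
import math

def intensity_equispaced(er):
    """Four power levels equally spaced between 1/er and 1 (er is a linear ratio)."""
    p0, p3 = 1.0 / er, 1.0
    return [p0 + k * (p3 - p0) / 3 for k in range(4)]

def amplitude_equispaced(er):
    """Four power levels whose field amplitudes (square roots) are equally spaced."""
    a0, a3 = math.sqrt(1.0 / er), 1.0
    return [(a0 + k * (a3 - a0) / 3) ** 2 for k in range(4)]
```

Under these assumptions both schemes share the same outer levels, while the amplitude-equispaced scheme places the two inner levels at lower powers than the intensity-equispaced one.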
