The search functionality is under construction.

Keyword Search Result

[Keyword] convolutional neural networks(32hit)

1-20hit(32hit)

  • Channel Pruning via Improved Grey Wolf Optimizer Pruner Open Access

    Xueying WANG  Yuan HUANG  Xin LONG  Ziji MA  

     
    LETTER-Fundamentals of Information Systems

      Pubricized:
    2024/03/07
      Vol:
    E107-D No:7
      Page(s):
    894-897

    In recent years, the increasing complexity of deep network structures has hindered their application in small resource constrained hardware. Therefore, we urgently need to compress and accelerate deep network models. Channel pruning is an effective method to compress deep neural networks. However, most existing channel pruning methods are prone to falling into local optima. In this paper, we propose a channel pruning method via Improved Grey Wolf Optimizer Pruner which called IGWO-Pruner to prune redundant channels of convolutional neural networks. It identifies pruning ratio of each layer by using Improved Grey Wolf algorithm, and then fine-tuning the new pruned network model. In experimental section, we evaluate the proposed method in CIFAR datasets and ILSVRC-2012 with several classical networks, including VGGNet, GoogLeNet and ResNet-18/34/56/152, and experimental results demonstrate the proposed method is able to prune a large number of redundant channels and parameters with rare performance loss.

  • Dual-Path Convolutional Neural Network Based on Band Interaction Block for Acoustic Scene Classification Open Access

    Pengxu JIANG  Yang YANG  Yue XIE  Cairong ZOU  Qingyun WANG  

     
    LETTER-Engineering Acoustics

      Pubricized:
    2023/10/04
      Vol:
    E107-A No:7
      Page(s):
    1040-1044

    Convolutional neural network (CNN) is widely used in acoustic scene classification (ASC) tasks. In most cases, local convolution is utilized to gather time-frequency information between spectrum nodes. It is challenging to adequately express the non-local link between frequency domains in a finite convolution region. In this paper, we propose a dual-path convolutional neural network based on band interaction block (DCNN-bi) for ASC, with mel-spectrogram as the model’s input. We build two parallel CNN paths to learn the high-frequency and low-frequency components of the input feature. Additionally, we have created three band interaction blocks (bi-blocks) to explore the pertinent nodes between various frequency bands, which are connected between two paths. Combining the time-frequency information from two paths, the bi-blocks with three distinct designs acquire non-local information and send it back to the respective paths. The experimental results indicate that the utilization of the bi-block has the potential to improve the initial performance of the CNN substantially. Specifically, when applied to the DCASE 2018 and DCASE 2020 datasets, the CNN exhibited performance improvements of 1.79% and 3.06%, respectively.

  • CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification

    Aorui GOU  Jingjing LIU  Xiaoxiang CHEN  Xiaoyang ZENG  Yibo FAN  

     
    PAPER-Image

      Pubricized:
    2023/07/06
      Vol:
    E107-A No:1
      Page(s):
    141-156

    Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable performance in detection and classification tasks. Nevertheless, their feature extraction cannot consider both local and global information, so the detection and classification performance can be further improved. In addition, more and more deep learning networks are designed as more and more complex, and the amount of computation and storage space required is also significantly increased. This paper proposes a combination of CNN and transformer, and designs a local feature enhancement module and global context modeling module to enhance the cascade network. While the local feature enhancement module increases the range of feature extraction, the global context modeling is used to capture the feature maps' global information. To decrease the model complexity, a shared sublayer is designed to realize the sharing of weight parameters between the adjacent convolutional layers or cross convolutional layers, thereby reducing the number of convolutional weight parameters. Moreover, to effectively improve the detection performance of neural networks without increasing network parameters, the optimal transport assignment approach is proposed to resolve the problem of label assignment. The classification loss and regression loss are the summations of the cost between the demander and supplier. The experiment results demonstrate that the proposed Combination of CNN and Transformer with Shared Sublayer (CCTSS) performs better than the state-of-the-art methods in various datasets and applications.

  • Ensemble Learning in CNN Augmented with Fully Connected Subnetworks

    Daiki HIRATA  Norikazu TAKAHASHI  

     
    LETTER-Biocybernetics, Neurocomputing

      Pubricized:
    2023/04/05
      Vol:
    E106-D No:7
      Page(s):
    1258-1261

    Convolutional Neural Networks (CNNs) have shown remarkable performance in image recognition tasks. In this letter, we propose a new CNN model called the EnsNet which is composed of one base CNN and multiple Fully Connected SubNetworks (FCSNs). In this model, the set of feature maps generated by the last convolutional layer in the base CNN is divided along channels into disjoint subsets, and these subsets are assigned to the FCSNs. Each of the FCSNs is trained independent of others so that it can predict the class label of each feature map in the subset assigned to it. The output of the overall model is determined by majority vote of the base CNN and the FCSNs. Experimental results using the MNIST, Fashion-MNIST and CIFAR-10 datasets show that the proposed approach further improves the performance of CNNs. In particular, an EnsNet achieves a state-of-the-art error rate of 0.16% on MNIST.

  • Epileptic Seizure Prediction Using Convolutional Neural Networks and Fusion Features on Scalp EEG Signals

    Qixin LAN  Bin YAO  Tao QING  

     
    LETTER-Smart Healthcare

      Pubricized:
    2022/05/27
      Vol:
    E106-D No:5
      Page(s):
    821-823

    Epileptic seizure prediction is an important research topic in the clinical epilepsy treatment, which can provide opportunities to take precautionary measures for epilepsy patients and medical staff. EEG is an commonly used tool for studying brain activity, which records the electrical discharge of brain. Many studies based on machine learning algorithms have been proposed to solve the task using EEG signal. In this study, we propose a novel seizure prediction models based on convolutional neural networks and scalp EEG for a binary classification between preictal and interictal states. The short-time Fourier transform has been used to translate raw EEG signals into STFT sepctrums, which is applied as input of the models. The fusion features have been obtained through the side-output constructions and used to train and test our models. The test results show that our models can achieve comparable results in both sensitivity and FPR upon fusion features. The proposed patient-specific model can be used in seizure prediction system for EEG classification.

  • 3D Multiple-Contextual ROI-Attention Network for Efficient and Accurate Volumetric Medical Image Segmentation

    He LI  Yutaro IWAMOTO  Xianhua HAN  Lanfen LIN  Akira FURUKAWA  Shuzo KANASAKI  Yen-Wei CHEN  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/02/21
      Vol:
    E106-D No:5
      Page(s):
    1027-1037

    Convolutional neural networks (CNNs) have become popular in medical image segmentation. The widely used deep CNNs are customized to extract multiple representative features for two-dimensional (2D) data, generally called 2D networks. However, 2D networks are inefficient in extracting three-dimensional (3D) spatial features from volumetric images. Although most 2D segmentation networks can be extended to 3D networks, the naively extended 3D methods are resource-intensive. In this paper, we propose an efficient and accurate network for fully automatic 3D segmentation. Specifically, we designed a 3D multiple-contextual extractor to capture rich global contextual dependencies from different feature levels. Then we leveraged an ROI-estimation strategy to crop the ROI bounding box. Meanwhile, we used a 3D ROI-attention module to improve the accuracy of in-region segmentation in the decoder path. Moreover, we used a hybrid Dice loss function to address the issues of class imbalance and blurry contour in medical images. By incorporating the above strategies, we realized a practical end-to-end 3D medical image segmentation with high efficiency and accuracy. To validate the 3D segmentation performance of our proposed method, we conducted extensive experiments on two datasets and demonstrated favorable results over the state-of-the-art methods.

  • Convolutional Neural Networks Based Dictionary Pair Learning for Visual Tracking

    Chenchen MENG  Jun WANG  Chengzhi DENG  Yuanyun WANG  Shengqian WANG  

     
    PAPER-Vision

      Pubricized:
    2022/02/21
      Vol:
    E105-A No:8
      Page(s):
    1147-1156

    Feature representation is a key component of most visual tracking algorithms. It is difficult to deal with complex appearance changes with low-level hand-crafted features due to weak representation capacities of such features. In this paper, we propose a novel tracking algorithm through combining a joint dictionary pair learning with convolutional neural networks (CNN). We utilize CNN model that is trained on ImageNet-Vid to extract target features. The CNN includes three convolutional layers and two fully connected layers. A dictionary pair learning follows the second fully connected layer. The joint dictionary pair is learned upon extracted deep features by the trained CNN model. The temporal variations of target appearances are learned in the dictionary learning. We use the learned dictionaries to encode target candidates. A linear combination of atoms in the learned dictionary is used to represent target candidates. Extensive experimental evaluations on OTB2015 demonstrate the superior performances against SOTA trackers.

  • Image Adjustment for Multi-Exposure Images Based on Convolutional Neural Networks

    Isana FUNAHASHI  Taichi YOSHIDA  Xi ZHANG  Masahiro IWAHASHI  

     
    PAPER-Image Processing and Video Processing

      Pubricized:
    2021/10/21
      Vol:
    E105-D No:1
      Page(s):
    123-133

    In this paper, we propose an image adjustment method for multi-exposure images based on convolutional neural networks (CNNs). We call image regions without information due to saturation and object moving in multi-exposure images lacking areas in this paper. Lacking areas cause the ghosting artifact in fused images from sets of multi-exposure images by conventional fusion methods, which tackle the artifact. To avoid this problem, the proposed method estimates the information of lacking areas via adaptive inpainting. The proposed CNN consists of three networks, warp and refinement, detection, and inpainting networks. The second and third networks detect lacking areas and estimate their pixel values, respectively. In the experiments, it is observed that a simple fusion method with the proposed method outperforms state-of-the-art fusion methods in the peak signal-to-noise ratio. Moreover, the proposed method is applied for various fusion methods as pre-processing, and results show obviously reducing artifacts.

  • Radar Emitter Identification Based on Auto-Correlation Function and Bispectrum via Convolutional Neural Network

    Zhiling XIAO  Zhenya YAN  

     
    PAPER-Fundamental Theories for Communications

      Pubricized:
    2021/06/10
      Vol:
    E104-B No:12
      Page(s):
    1506-1513

    This article proposes to apply the auto-correlation function (ACF), bispectrum analysis, and convolutional neural networks (CNN) to implement radar emitter identification (REI) based on intrapulse features. In this work, we combine ACF with bispectrum for signal feature extraction. We first calculate the ACF of each emitter signal, and then the bispectrum of the ACF and obtain the spectrograms. The spectrum images are taken as the feature maps of the radar emitters and fed into the CNN classifier to realize automatic identification. We simulate signal samples of different modulation types in experiments. We also consider the feature extraction method directly using bispectrum analysis for comparison. The simulation results demonstrate that by combining ACF with bispectrum analysis, the proposed scheme can attain stronger robustness to noise, the spectrograms of our approach have more pronounced features, and our approach can achieve better identification performance at low signal-to-noise ratios.

  • Evaluation Metrics for the Cost of Data Movement in Deep Neural Network Acceleration

    Hongjie XU  Jun SHIOMI  Hidetoshi ONODERA  

     
    PAPER

      Pubricized:
    2021/06/01
      Vol:
    E104-A No:11
      Page(s):
    1488-1498

    Hardware accelerators are designed to support a specialized processing dataflow for everchanging deep neural networks (DNNs) under various processing environments. This paper introduces two hardware properties to describe the cost of data movement in each memory hierarchy. Based on the hardware properties, this paper proposes a set of evaluation metrics that are able to evaluate the number of memory accesses and the required memory capacity according to the specialized processing dataflow. Proposed metrics are able to analytically predict energy, throughput, and area of a hardware design without detailed implementation. Once a processing dataflow and constraints of hardware resources are determined, the proposed evaluation metrics quickly quantify the expected hardware benefits, thereby reducing design time.

  • Gated Convolutional Neural Networks with Sentence-Related Selection for Distantly Supervised Relation Extraction

    Yufeng CHEN  Siqi LI  Xingya LI  Jinan XU  Jian LIU  

     
    PAPER-Natural Language Processing

      Pubricized:
    2021/06/01
      Vol:
    E104-D No:9
      Page(s):
    1486-1495

    Relation extraction is one of the key basic tasks in natural language processing in which distant supervision is widely used for obtaining large-scale labeled data without expensive labor cost. However, the automatically generated data contains massive noise because of the wrong labeling problem in distant supervision. To address this problem, the existing research work mainly focuses on removing sentence-level noise with various sentence selection strategies, which however could be incompetent for disposing word-level noise. In this paper, we propose a novel neural framework considering both intra-sentence and inter-sentence relevance to deal with word-level and sentence-level noise from distant supervision, which is denoted as Sentence-Related Gated Piecewise Convolutional Neural Networks (SR-GPCNN). Specifically, 1) a gate mechanism with multi-head self-attention is adopted to reduce word-level noise inside sentences; 2) a soft-label strategy is utilized to alleviate wrong-labeling propagation problem; and 3) a sentence-related selection model is designed to filter sentence-level noise further. The extensive experimental results on NYT dataset demonstrate that our approach filters word-level and sentence-level noise effectively, thus significantly outperforms all the baseline models in terms of both AUC and top-n precision metrics.

  • CJAM: Convolutional Neural Network Joint Attention Mechanism in Gait Recognition

    Pengtao JIA  Qi ZHAO  Boze LI  Jing ZHANG  

     
    PAPER

      Pubricized:
    2021/04/28
      Vol:
    E104-D No:8
      Page(s):
    1239-1249

    Gait recognition distinguishes one individual from others according to the natural patterns of human gaits. Gait recognition is a challenging signal processing technology for biometric identification due to the ambiguity of contours and the complex feature extraction procedure. In this work, we proposed a new model - the convolutional neural network (CNN) joint attention mechanism (CJAM) - to classify the gait sequences and conduct person identification using the CASIA-A and CASIA-B gait datasets. The CNN model has the ability to extract gait features, and the attention mechanism continuously focuses on the most discriminative area to achieve person identification. We present a comprehensive transformation from gait image preprocessing to final identification. The results from 12 experiments show that the new attention model leads to a lower error rate than others. The CJAM model improved the 3D-CNN, CNN-LSTM (long short-term memory), and the simple CNN by 8.44%, 2.94% and 1.45%, respectively.

  • Low-Complexity Training for Binary Convolutional Neural Networks Based on Clipping-Aware Weight Update

    Changho RYU  Tae-Hwan KIM  

     
    LETTER-Biocybernetics, Neurocomputing

      Pubricized:
    2021/03/17
      Vol:
    E104-D No:6
      Page(s):
    919-922

    This letter presents an efficient technique to reduce the computational complexity involved in training binary convolutional neural networks (BCNN). The BCNN training shall be conducted focusing on the optimization of the sign of each weight element rather than the exact value itself in convention; in which, the sign of an element is not likely to be flipped anymore after it has been updated to have such a large magnitude to be clipped out. The proposed technique does not update such elements that have been clipped out and eliminates the computations involved in their optimization accordingly. The complexity reduction by the proposed technique is as high as 25.52% in training the BCNN model for the CIFAR-10 classification task, while the accuracy is maintained without severe degradation.

  • Heatmapping of Group People Involved in the Group Activity

    Kohei SENDO  Norimichi UKITA  

     
    PAPER

      Pubricized:
    2020/03/18
      Vol:
    E103-D No:6
      Page(s):
    1209-1216

    This paper proposes a method for heatmapping people who are involved in a group activity. Such people grouping is useful for understanding group activities. In prior work, people grouping is performed based on simple inflexible rules and schemes (e.g., based on proximity among people and with models representing only a constant number of people). In addition, several previous grouping methods require the results of action recognition for individual people, which may include erroneous results. On the other hand, our proposed heatmapping method can group any number of people who dynamically change their deployment. Our method can work independently of individual action recognition. A deep network for our proposed method consists of two input streams (i.e., RGB and human bounding-box images). This network outputs a heatmap representing pixelwise confidence values of the people grouping. Extensive exploration of appropriate parameters was conducted in order to optimize the input bounding-box images. As a result, we demonstrate the effectiveness of the proposed method for heatmapping people involved in group activities.

  • Loss-Driven Channel Pruning of Convolutional Neural Networks

    Xin LONG  Xiangrong ZENG  Chen CHEN  Huaxin XIAO  Maojun ZHANG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2020/02/17
      Vol:
    E103-D No:5
      Page(s):
    1190-1194

    The increase in computation cost and storage of convolutional neural networks (CNNs) severely hinders their applications on limited-resources devices in recent years. As a result, there is impending necessity to accelerate the networks by certain methods. In this paper, we propose a loss-driven method to prune redundant channels of CNNs. It identifies unimportant channels by using Taylor expansion technique regarding to scaling and shifting factors, and prunes those channels by fixed percentile threshold. By doing so, we obtain a compact network with less parameters and FLOPs consumption. In experimental section, we evaluate the proposed method in CIFAR datasets with several popular networks, including VGG-19, DenseNet-40 and ResNet-164, and experimental results demonstrate the proposed method is able to prune over 70% channels and parameters with no performance loss. Moreover, iterative pruning could be used to obtain more compact network.

  • Fast Inference of Binarized Convolutional Neural Networks Exploiting Max Pooling with Modified Block Structure

    Ji-Hoon SHIN  Tae-Hwan KIM  

     
    LETTER-Software System

      Pubricized:
    2019/12/03
      Vol:
    E103-D No:3
      Page(s):
    706-710

    This letter presents a novel technique to achieve a fast inference of the binarized convolutional neural networks (BCNN). The proposed technique modifies the structure of the constituent blocks of the BCNN model so that the input elements for the max-pooling operation are binary. In this structure, if any of the input elements is +1, the result of the pooling can be produced immediately; the proposed technique eliminates such computations that are involved to obtain the remaining input elements, so as to reduce the inference time effectively. The proposed technique reduces the inference time by up to 34.11%, while maintaining the classification accuracy.

  • A New GAN-Based Anomaly Detection (GBAD) Approach for Multi-Threat Object Classification on Large-Scale X-Ray Security Images

    Joanna Kazzandra DUMAGPI  Woo-Young JUNG  Yong-Jin JEONG  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/10/23
      Vol:
    E103-D No:2
      Page(s):
    454-458

    Threat object recognition in x-ray security images is one of the important practical applications of computer vision. However, research in this field has been limited by the lack of available dataset that would mirror the practical setting for such applications. In this paper, we present a novel GAN-based anomaly detection (GBAD) approach as a solution to the extreme class-imbalance problem in multi-label classification. This method helps in suppressing the surge in false positives induced by training a CNN on a non-practical dataset. We evaluate our method on a large-scale x-ray image database to closely emulate practical scenarios in port security inspection systems. Experiments demonstrate improvement against the existing algorithm.

  • Convolutional Neural Networks for Pilot-Induced Cyclostationarity Based OFDM Signals Spectrum Sensing in Full-Duplex Cognitive Radio

    Hang LIU  Xu ZHU  Takeo FUJII  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

      Pubricized:
    2019/07/16
      Vol:
    E103-B No:1
      Page(s):
    91-102

    The spectrum sensing of the orthogonal frequency division multiplexing (OFDM) system in cognitive radio (CR) has always been challenging, especially for user terminals that utilize the full-duplex (FD) mode. We herein propose an advanced FD spectrum-sensing scheme that can be successfully performed even when severe self-interference is encountered from the user terminal. Based on the “classification-converted sensing” framework, the cyclostationary periodogram generated by OFDM pilots is exhibited in the form of images. These images are subsequently plugged into convolutional neural networks (CNNs) for classifications owing to the CNN's strength in image recognition. More importantly, to realize spectrum sensing against residual self-interference, noise pollution, and channel fading, we used adversarial training, where a CR-specific, modified training database was proposed. We analyzed the performances exhibited by the different architectures of the CNN and the different resolutions of the input image to balance the detection performance with computing capability. We proposed a design plan of the signal structure for the CR transmitting terminal that can fit into the proposed spectrum-sensing scheme while benefiting from its own transmission. The simulation results prove that our method has excellent sensing capability for the FD system; furthermore, our method achieves a higher detection accuracy than the conventional method.

  • Generating Accurate Candidate Windows by Effective Receptive Field

    Baojun ZHAO  Boya ZHAO  Linbo TANG  Baoxian WANG  

     
    LETTER-Image

      Vol:
    E102-A No:12
      Page(s):
    1925-1927

    Towards involving the convolutional neural networks into the object detection field, many computer vision tasks have achieved favorable successes. In order to adapt targets with various scales, deep feature pyramid is widely used, since the traditional object detection methods detect different objects in Gaussian image pyramid. However, due to the mismatching between the anchors and the feature distributions of targets, the accurate detection for targets with various scales is still a challenge. Considering the differences between the theoretical receptive field and effective receptive field, we propose a novel anchor generation method, which takes the effective receptive field as the standard. The proposed method is evaluated on the PASCAL VOC dataset and shows the favorable results.

  • SDChannelNets: Extremely Small and Efficient Convolutional Neural Networks

    JianNan ZHANG  JiJun ZHOU  JianFeng WU  ShengYing YANG  

     
    LETTER-Biocybernetics, Neurocomputing

      Pubricized:
    2019/09/10
      Vol:
    E102-D No:12
      Page(s):
    2646-2650

    Convolutional neural networks (CNNS) have a strong ability to understand and judge images. However, the enormous parameters and computation of CNNS have limited its application in resource-limited devices. In this letter, we used the idea of parameter sharing and dense connection to compress the parameters in the convolution kernel channel direction, thus greatly reducing the number of model parameters. On this basis, we designed Shared and Dense Channel-wise Convolutional Networks (SDChannelNets), mainly composed of Depth-wise Separable SD-Channel-wise Convolution layer. The advantage of SDChannelNets is that the number of model parameters is greatly reduced without or with little loss of accuracy. We also introduced a hyperparameter that can effectively balance the number of parameters and the accuracy of a model. We evaluated the model proposed by us through two popular image recognition tasks (CIFAR-10 and CIFAR-100). The results showed that SDChannelNets had similar accuracy to other CNNs, but the number of parameters was greatly reduced.

1-20hit(32hit)