The search functionality is under construction.

Keyword Search Result

[Keyword] pruning(38hit)

1-20hit(38hit)

  • Channel Pruning via Improved Grey Wolf Optimizer Pruner Open Access

    Xueying WANG  Yuan HUANG  Xin LONG  Ziji MA  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2024/03/07
      Vol:
    E107-D No:7
      Page(s):
    894-897

    In recent years, the increasing complexity of deep network structures has hindered their application on small, resource-constrained hardware. Therefore, deep network models urgently need to be compressed and accelerated. Channel pruning is an effective method for compressing deep neural networks. However, most existing channel pruning methods are prone to falling into local optima. In this paper, we propose a channel pruning method based on an Improved Grey Wolf Optimizer Pruner, called IGWO-Pruner, to prune redundant channels of convolutional neural networks. It identifies the pruning ratio of each layer using the Improved Grey Wolf algorithm and then fine-tunes the pruned network model. In the experimental section, we evaluate the proposed method on the CIFAR datasets and ILSVRC-2012 with several classical networks, including VGGNet, GoogLeNet and ResNet-18/34/56/152. The experimental results demonstrate that the proposed method is able to prune a large number of redundant channels and parameters with little performance loss.

  • Dataset Distillation Using Parameter Pruning Open Access

    Guang LI  Ren TOGO  Takahiro OGAWA  Miki HASEYAMA  

     
    LETTER-Image

      Publicized:
    2023/09/06
      Vol:
    E107-A No:6
      Page(s):
    936-940

    In this study, we propose a novel dataset distillation method based on parameter pruning. The proposed method can synthesize more robust distilled datasets and improve distillation performance by pruning difficult-to-match parameters during the distillation process. Experimental results on two benchmark datasets show the superiority of the proposed method.

  • Continuous Similarity Search for Dynamic Text Streams

    Yuma TSUCHIDA  Kohei KUBO  Hisashi KOGA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2023/09/21
      Vol:
    E106-D No:12
      Page(s):
    2026-2035

    Similarity search for data streams has attracted much attention for information recommendation. In this context, recent leading works regard the latest W items in a data stream as an evolving set and reduce similarity search for data streams to set similarity search. Whereas they consider standard sets composed of items, this paper uniquely studies similarity search for text streams and treats evolving sets whose elements are texts. Specifically, we formulate a new continuous range search problem named the CTS problem (Continuous similarity search for Text Sets). The task of the CTS problem is to find all the text streams in the database whose similarity to the query becomes larger than a threshold ε. It abstracts a scenario in which a user-based recommendation system searches for similar users on social networking services. The CTS problem is important because it allows both the query and the database to change dynamically. We develop a fast pruning-based algorithm for the CTS problem. Moreover, we discuss how to speed it up with an inverted index.
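
The sliding-window setting described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it treats each stream item as an atomic token (the simpler "standard set" case the abstract contrasts with), assumes Jaccard similarity, and prunes with a generic size-based upper bound. The names `WindowedStream`, `jaccard` and `range_search` are hypothetical.

```python
from collections import deque


class WindowedStream:
    """Keeps only the latest W items of a stream (hypothetical helper)."""

    def __init__(self, window_size):
        self.items = deque(maxlen=window_size)

    def push(self, item):
        # Appending to a full deque silently drops the oldest item.
        self.items.append(item)

    def as_set(self):
        return set(self.items)


def jaccard(a, b):
    """Jaccard similarity of two sets; assumed here, not taken from the paper."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def range_search(query, database, epsilon):
    """Return names of streams whose similarity to the query exceeds epsilon."""
    q = query.as_set()
    hits = []
    for name, stream in database.items():
        s = stream.as_set()
        if not q or not s:
            continue
        # Pruning step: Jaccard can never exceed min(|A|,|B|) / max(|A|,|B|),
        # so a candidate whose bound is <= epsilon is skipped unevaluated.
        if min(len(q), len(s)) / max(len(q), len(s)) <= epsilon:
            continue
        if jaccard(q, s) > epsilon:
            hits.append(name)
    return hits


query = WindowedStream(3)
for item in ["a", "b", "c", "d"]:  # window ends up holding b, c, d
    query.push(item)

database = {name: WindowedStream(3) for name in ("s1", "s2", "s3")}
for item in ["b", "c", "d"]:
    database["s1"].push(item)
for item in ["x", "y", "z"]:
    database["s2"].push(item)
for item in ["b", "c", "x"]:
    database["s3"].push(item)

result = range_search(query, database, epsilon=0.4)
```

Because both the query and the database windows change as items arrive, `range_search` would be re-run (or incrementally updated) after each `push`; the sketch recomputes from scratch for clarity.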

  • File Tracking and Visualization Methods Using a Network Graph to Prevent Information Leakage

    Tomohiko YANO  Hiroki KUZUNO  Kenichi MAGATA  

     
    PAPER

      Publicized:
    2023/06/20
      Vol:
    E106-D No:9
      Page(s):
    1339-1353

    Information leakage is a significant threat to organizations, and effective measures are required to protect information assets. As confidential files can be leaked through various paths, a countermeasure is necessary to prevent information leakage from various paths, from simple drag-and-drop movements to complex transformations such as encryption and encoding. However, existing methods are difficult to take countermeasures depending on the information leakage paths. Furthermore, it is also necessary to create a visualization format that can find information leakage easily and a method that can remove unnecessary parts while leaving the necessary parts of information leakage to improve visibility. This paper proposes a new information leakage countermeasure method that incorporates file tracking and visualization. The file tracking component recursively extracts all events related to confidential files. Therefore, tracking is possible even when data have transformed significantly from the original file. The visualization component represents the results of file tracking as a network graph. This allows security administrators to find information leakage even if a file is transformed through multiple events. Furthermore, by pruning the network graph using the frequency of past events, the indicators of information leakage can be more easily found by security administrators. In experiments conducted, network graphs were generated for two information leakage scenarios in which files were moved and copied. The visualization results were obtained according to the scenarios, and the network graph was pruned to reduce vertices by 17.6% and edges by 10.9%.

  • Compression of Vehicle and Pedestrian Detection Network Based on YOLOv3 Model

    Lie GUO  Yibing ZHAO  Jiandong GAO  

     
    PAPER-Intelligent Transportation Systems

      Publicized:
    2022/06/22
      Vol:
    E106-D No:5
      Page(s):
    735-745

    Commonly used object detection algorithms based on convolutional neural networks struggle to meet real-time requirements on embedded platforms because of their large model size, heavy computation, and long inference time. Model compression is therefore needed to reduce the amount of network computation and increase inference speed. This paper compresses a vehicle and pedestrian detection network by pruning and removing redundant parameters. The vehicle and pedestrian detection network is trained based on the YOLOv3 model, using K-means++ to cluster the anchor boxes. Because the number of targets in the dataset is unbalanced, the detection accuracy is improved by changing the proportion of categorical losses and regression losses for each category in the loss function. A layer and channel pruning algorithm is proposed by combining global channel pruning thresholds and the L1 norm, which can reduce the time cost of the network layer transfer process and the amount of computation. Network layer fusion based on TensorRT is performed, and inference uses half-precision floating point to improve speed. Results show that the compressed vehicle and pedestrian detection network, with 84% of its channels and 15 shortcut modules pruned, reduces the size by 32% and the amount of calculation by 17%. The network inference time decreases to 21 ms, which is 1.48 times faster than the network with 84% of its channels pruned.

  • Lookahead Search-Based Low-Complexity Multi-Type Tree Pruning Method for Versatile Video Coding (VVC) Intra Coding

    Qi TENG  Guowei TENG  Xiang LI  Ran MA  Ping AN  Zhenglong YANG  

     
    PAPER-Coding Theory

      Publicized:
    2022/08/24
      Vol:
    E106-A No:3
      Page(s):
    606-615

    The latest versatile video coding (VVC) introduces some novel techniques such as quadtree with nested multi-type tree (QTMT), multiple transform selection (MTS) and multiple reference line (MRL). These tools improve compression efficiency compared with the previous standard H.265/HEVC, but they suffer from very high computational complexity. One of the most time-consuming parts of VVC intra coding is the coding tree unit (CTU) structure decision. In this paper, we propose a low-complexity multi-type tree (MT) pruning method for VVC intra coding. This method consists of lookahead search and MT pruning. The lookahead search process is performed to derive the approximate rate-distortion (RD) cost of each MT node at depth 2 or 3. Subsequently, the improbable MT nodes are pruned by different strategies under different cost errors. These strategies are designed according to the priority of the node. Experimental results show that the overall proposed algorithm can achieve 47.15% time saving with only 0.93% Bjøntegaard delta bit rate (BDBR) increase over natural scene sequences, and 45.39% time saving with 1.55% BDBR increase over screen content sequences, compared with the VVC reference software VTM 10.0. Such results demonstrate that our method achieves a good trade-off between computational complexity and compression quality compared to recent methods.

  • Pruning Ratio Optimization with Layer-Wise Pruning Method for Accelerating Convolutional Neural Networks

    Koji KAMMA  Sarimu INOUE  Toshikazu WADA  

     
    PAPER-Biocybernetics, Neurocomputing

      Publicized:
    2021/09/29
      Vol:
    E105-D No:1
      Page(s):
    161-169

    Pruning is an effective technique to reduce computational complexity of Convolutional Neural Networks (CNNs) by removing redundant neurons (or weights). There are two types of pruning methods: holistic pruning and layer-wise pruning. The former selects the least important neuron from the entire model and prunes it. The latter conducts pruning layer by layer. Recently, it has turned out that some layer-wise methods are effective for reducing computational complexity of pruned models while preserving their accuracy. The difficulty of layer-wise pruning is how to adjust pruning ratio (the ratio of neurons to be pruned) in each layer. Because CNNs typically have lots of layers composed of lots of neurons, it is inefficient to tune pruning ratios by human hands. In this paper, we present Pruning Ratio Optimizer (PRO), a method that can be combined with layer-wise pruning methods for optimizing pruning ratios. The idea of PRO is to adjust pruning ratios based on how much pruning in each layer has an impact on the outputs in the final layer. In the experiments, we could verify the effectiveness of PRO.

  • REAP: A Method for Pruning Convolutional Neural Networks with Performance Preservation

    Koji KAMMA  Toshikazu WADA  

     
    PAPER-Biocybernetics, Neurocomputing

      Publicized:
    2020/10/02
      Vol:
    E104-D No:1
      Page(s):
    194-202

    This paper presents a pruning method, Reconstruction Error Aware Pruning (REAP), to reduce the redundancy of convolutional neural network models and accelerate their inference. REAP consists of the following steps: 1) Prune the channels whose outputs are redundant and can be reconstructed from the outputs of other channels in each convolutional layer; 2) Update the weights of the remaining channels by the least squares method so as to compensate for the error caused by pruning. This is how we compress and accelerate models that are initially large and slow, with little degradation. The ability of REAP to maintain model performance saves much of the time and labor needed for retraining the pruned models. The challenge of REAP is the computational cost of selecting the channels to be pruned, which requires solving a huge number of least squares problems. We have developed an efficient algorithm based on a biorthogonal system to obtain the solutions of those least squares problems. In the experiments, we show that REAP can conduct pruning with a smaller sacrifice of model performance than several existing methods, including the previous state-of-the-art one.

  • SENTEI: Filter-Wise Pruning with Distillation towards Efficient Sparse Convolutional Neural Network Accelerators

    Masayuki SHIMODA  Youki SADA  Ryosuke KURAMOCHI  Shimpei SATO  Hiroki NAKAHARA  

     
    PAPER-Computer System

      Publicized:
    2020/08/03
      Vol:
    E103-D No:12
      Page(s):
    2463-2470

    In the realization of convolutional neural networks (CNNs) in resource-constrained embedded hardware, the memory footprint of weights is one of the primary problems. Pruning techniques are often used to reduce the number of weights. However, the distribution of nonzero weights is highly skewed, which makes it more difficult to utilize the underlying parallelism. To address this problem, we present SENTEI*, filter-wise pruning with distillation, to realize hardware-aware network architecture with comparable accuracy. The filter-wise pruning eliminates weights such that each filter has the same number of nonzero weights, and retraining with distillation retains the accuracy. Further, we develop a zero-weight skipping inter-layer pipelined accelerator on an FPGA. The equalization enables inter-filter parallelism, where a processing block for a layer executes filters concurrently with straightforward architecture. Our evaluation of semantic-segmentation tasks indicates that the resulting mIoU only decreased by 0.4 points. Additionally, the speedup and power efficiency of our FPGA implementation were 33.2× and 87.9× higher than those of the mobile GPU. Therefore, our technique realizes hardware-aware network with comparable accuracy.
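
The equalization idea in this abstract (every filter keeps the same number of nonzero weights) can be sketched as follows. This is a minimal illustration, not the SENTEI implementation: it assumes weights are ranked by magnitude, which the abstract does not state, and it omits retraining with distillation. The function names are hypothetical.

```python
def prune_filter(weights, keep):
    """Zero out all but the `keep` largest-magnitude weights of one filter.

    Magnitude ranking is an assumption made for this sketch.
    """
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def filter_wise_prune(filters, keep):
    """Apply the same per-filter budget to every filter in a layer."""
    return [prune_filter(f, keep) for f in filters]


# Two flattened filters with very different weight scales.
layer = [
    [0.9, -0.1, 0.05, -0.7],
    [0.01, 0.02, -0.03, 0.04],
]
pruned = filter_wise_prune(layer, keep=2)
nonzeros = [sum(1 for w in f if w != 0.0) for f in pruned]
```

With a single global magnitude threshold, the second filter above would lose all of its weights while the first would keep most of them. The per-filter budget is what gives each filter an equal workload, which is the property the abstract ties to inter-filter parallelism.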

  • Loss-Driven Channel Pruning of Convolutional Neural Networks

    Xin LONG  Xiangrong ZENG  Chen CHEN  Huaxin XIAO  Maojun ZHANG  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2020/02/17
      Vol:
    E103-D No:5
      Page(s):
    1190-1194

    The increase in the computation cost and storage of convolutional neural networks (CNNs) in recent years severely hinders their application on resource-limited devices. As a result, there is a pressing need to accelerate the networks. In this paper, we propose a loss-driven method to prune redundant channels of CNNs. It identifies unimportant channels by using a Taylor expansion technique with respect to the scaling and shifting factors, and prunes those channels using a fixed percentile threshold. By doing so, we obtain a compact network with fewer parameters and FLOPs. In the experimental section, we evaluate the proposed method on the CIFAR datasets with several popular networks, including VGG-19, DenseNet-40 and ResNet-164. The experimental results demonstrate that the proposed method is able to prune over 70% of channels and parameters with no performance loss. Moreover, iterative pruning could be used to obtain an even more compact network.
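
As a rough illustration of the two ingredients named in this abstract, the sketch below scores each channel with a first-order Taylor term in the scaling factor (gamma) and shifting factor (beta), then prunes everything at or below a fixed percentile. The exact importance criterion is an assumption for illustration; the abstract does not give the formula, and the function names are hypothetical.

```python
def taylor_importance(gammas, betas, grad_gammas, grad_betas):
    """First-order estimate of the loss change if a channel were removed.

    Assumed form: |dL/dgamma * gamma + dL/dbeta * beta| per channel.
    """
    return [
        abs(gg * g + gb * b)
        for g, b, gg, gb in zip(gammas, betas, grad_gammas, grad_betas)
    ]


def keep_mask(scores, percentile):
    """True for channels to keep; the lowest `percentile` percent are pruned.

    Channels that tie with the threshold score are pruned as well.
    """
    k = int(len(scores) * percentile / 100)
    if k == 0:
        return [True] * len(scores)
    threshold = sorted(scores)[k - 1]
    return [s > threshold for s in scores]


# Four channels of one batch-normalization layer (made-up values).
gammas = [1.0, 0.5, 0.01, 2.0]
betas = [0.0, 0.0, 0.0, 0.0]
grad_gammas = [0.4, 0.2, 0.1, 0.3]
grad_betas = [0.0, 0.0, 0.0, 0.0]

scores = taylor_importance(gammas, betas, grad_gammas, grad_betas)
mask = keep_mask(scores, percentile=50)
```

In a real network the gradients would come from backpropagation over training batches, and the pruned model would normally be fine-tuned afterwards.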

  • A Spectral Clustering Based Filter-Level Pruning Method for Convolutional Neural Networks

    Lianqiang LI  Jie ZHU  Ming-Ting SUN  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2019/09/17
      Vol:
    E102-D No:12
      Page(s):
    2624-2627

    Convolutional Neural Networks (CNNs) usually have millions or even billions of parameters, which makes them hard to deploy on mobile devices. In this work, we present a novel filter-level pruning method to alleviate this issue. More concretely, we first construct an undirected fully connected graph to represent a pre-trained CNN model. Then, we employ the spectral clustering algorithm to divide the graph into subgraphs, which is equivalent to clustering the similar filters of the CNN into the same groups. After obtaining the grouping relationships among the filters, we finally keep one filter per group and retrain the pruned model. Compared with previous pruning methods that identify the redundant filters heuristically, the proposed method can select the pruning candidates more reasonably and precisely. Experimental results also show that our proposed pruning method achieves significant improvements over the state of the art.

  • Fast Hyperspectral Unmixing via Reweighted Sparse Regression Open Access

    Hongwei HAN  Ke GUO  Maozhi WANG  Tingbin ZHANG  Shuang ZHANG  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2019/05/28
      Vol:
    E102-D No:9
      Page(s):
    1819-1832

    The sparse unmixing of hyperspectral data has attracted much attention in recent years because it does not need to estimate the number of endmembers nor consider the lack of pure pixels in a given hyperspectral scene. However, the high mutual coherence of spectral libraries strongly affects the practicality of sparse unmixing. The collaborative sparse unmixing via variable splitting and augmented Lagrangian (CLSUnSAL) algorithm is a classic sparse unmixing algorithm that performs better than other sparse unmixing methods. In this paper, we propose a CLSUnSAL-based hyperspectral unmixing method based on dictionary pruning and reweighted sparse regression. First, the algorithm identifies a subset of the original library elements using a dictionary pruning strategy. Second, we present a weighted sparse regression algorithm based on CLSUnSAL to further enhance the sparsity of endmember spectra in a given library. Third, we apply the weighted sparse regression algorithm on the pruned spectral library. The effectiveness of the proposed algorithm is demonstrated on both simulated and real hyperspectral datasets. For simulated data cubes (DC1, DC2 and DC3), the number of the pruned spectral library elements is reduced by at least 94% and the runtime of the proposed algorithm is less than 10% of that of CLSUnSAL. For simulated DC4 and DC5, the runtime of the proposed algorithm is less than 15% of that of CLSUnSAL. For the real hyperspectral datasets, the pruned spectral library successfully reduces the original dictionary size by 76% and the runtime of the proposed algorithm is 11.21% of that of CLSUnSAL. These experimental results show that our proposed algorithm not only substantially improves the accuracy of unmixing solutions but is also much faster than some other state-of-the-art sparse unmixing algorithms.

  • GUINNESS: A GUI Based Binarized Deep Neural Network Framework for Software Programmers

    Hiroki NAKAHARA  Haruyoshi YONEKAWA  Tomoya FUJII  Masayuki SHIMODA  Shimpei SATO  

     
    PAPER-Design Tools

      Publicized:
    2019/02/27
      Vol:
    E102-D No:5
      Page(s):
    1003-1011

    The GUINNESS (GUI based binarized neural network synthesizer) is an open-source, GUI-based tool flow for implementing a binarized deep neural network on an FPGA, covering both training on the GPU and inference on the FPGA. Since all operations are done in the GUI, the software designer does not need to write any scripts to design the neural network structure or training behavior, and only specifies the values for hyperparameters. After finishing the training, it automatically generates C++ code to synthesize the bitstream using the Xilinx SDSoC system design tool flow. Thus, our tool flow is suitable for software programmers who are not familiar with FPGA design. In our tool flow, we modify the algorithms for both the training and the inference of a binarized CNN hardware. Since the hardware has limited bit precision, it lacks minimal bias in training. Also, for the inference on the hardware, the conventional batch normalization technique requires additional hardware. Our modifications solve these problems. We implemented the VGG-11 benchmark CNN on the Digilent Inc. Zedboard. Compared with the conventional binarized implementations on an FPGA, the classification accuracy was almost the same, the performance per power efficiency was 5.1 times better, the performance per area efficiency was 8.0 times better, and the performance per memory was 8.2 times better. We compare the proposed FPGA design with the CPU and the GPU designs. Compared with the ARM Cortex-A57, it was 1776.3 times faster, it dissipated 3.0 times lower power, and its performance per power efficiency was 5706.3 times better. Also, compared with the Maxwell GPU, it was 11.5 times faster, it dissipated 7.3 times lower power, and its performance per power efficiency was 83.0 times better. The disadvantage of our FPGA-based design is that it requires additional time to synthesize the FPGA executable code. In the experiment, it consumed more than three hours, and the total FPGA design took 75 hours. Since the training of the CNN is dominant, it is considerable.

  • Mining Approximate Primary Functional Dependency on Web Tables

    Siyu CHEN  Ning WANG  Mengmeng ZHANG  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2018/11/29
      Vol:
    E102-D No:3
      Page(s):
    650-654

    We propose to discover approximate primary functional dependencies (aPFDs) for web tables, which focus on the determination relationship between primary attributes and non-primary attributes and are more helpful for entity column detection and topic discovery on web tables. Based on association rules and information theory, we propose the metrics Conf and InfoGain to evaluate PFDs. By quantifying the strength of PFDs and designing pruning strategies to eliminate false positives, our method can select minimal non-trivial approximate PFDs effectively and is scalable to large tables. Comprehensive experimental results on real web datasets show that our method significantly outperforms previous work in both effectiveness and efficiency.

  • Filter Level Pruning Based on Similar Feature Extraction for Convolutional Neural Networks

    Lianqiang LI  Yuhui XU  Jie ZHU  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2018/01/18
      Vol:
    E101-D No:4
      Page(s):
    1203-1206

    This paper introduces a filter-level pruning method based on similar feature extraction, which uses the k-means++ algorithm to compress and accelerate convolutional neural networks. In contrast to other pruning methods, the proposed method analyzes the similarities in recognizing features among filters, rather than evaluating the importance of filters, to prune the redundant ones. This strategy is more reasonable and effective. Furthermore, our method does not result in an unstructured network. As a result, it needs no extra sparse representation and can be efficiently supported by any off-the-shelf deep learning library. Experimental results show that our filter pruning method can reduce the number of parameters and the computational cost of LeNet-5 by a factor of 17.9× with only 0.3% accuracy loss.

  • A Threshold Neuron Pruning for a Binarized Deep Neural Network on an FPGA

    Tomoya FUJII  Shimpei SATO  Hiroki NAKAHARA  

     
    PAPER-Emerging Applications

      Publicized:
    2017/11/17
      Vol:
    E101-D No:2
      Page(s):
    376-386

    A pre-trained deep convolutional neural network (CNN) for an embedded system requires high speed and low power consumption. The former part of the CNN consists of convolutional layers, while the latter part consists of fully connected layers. In the convolutional layers, the multiply-accumulate operation is a bottleneck, while in the fully connected layers, memory access is a bottleneck. The binarized CNN has been proposed to realize many multiply-accumulate circuits on the FPGA, so the convolutional layers can be computed at high speed. However, even if we apply binarization to the fully connected layers, the amount of memory is still a bottleneck. In this paper, we propose a neuron pruning technique which eliminates a large part of the weight memory, and we apply it to the fully connected layers of the binarized CNN. In that case, since the weight memory is realized by on-chip memory on the FPGA, it achieves high-speed memory access. To further reduce the memory size, we retrain the CNN after neuron pruning. We also propose a sequential-input parallel-output circuit for the binarized fully connected layer and a streaming circuit for the binarized 2D convolutional layer. The experimental results show that neuron pruning reduced the number of neurons in the fully connected layers of the VGG-11 CNN by 39.8% while keeping 99% of the baseline accuracy. We implemented the neuron-pruned CNN on the Xilinx Inc. Zynq Zedboard. Compared with the ARM Cortex-A57, it was 1773.0 times faster, it dissipated 3.1 times lower power, and its performance per power efficiency was 5781.3 times better. Also, compared with the Maxwell GPU, it was 11.1 times faster, it dissipated 7.7 times lower power, and its performance per power efficiency was 84.1 times better. Thus, the binarized CNN on the FPGA is suitable for embedded systems.

  • A Low Complexity Fixed Sphere Decoder with Statistical Threshold for MIMO Systems

    Jangyong PARK  Yunho JUNG  Jaeseok KIM  

     
    LETTER-Digital Signal Processing

      Vol:
    E98-A No:2
      Page(s):
    735-739

    In this letter, we propose a low complexity fixed sphere decoder (FSD) with statistical threshold for multiple-input and multiple-output (MIMO) systems. The proposed algorithm is developed by applying two threshold-based pruning algorithms using an initial detection and statistical noise constraint to the FSD. The proposed FSD algorithm is suitable for a fully pipelined hardware implementation and also has low complexity because the threshold of the proposed pruning algorithm is pre-calculated and independently applied to the path without sorting operation. Simulation results show that the proposed FSD has the performance of the original FSD as well as a low complexity compared to the original FSD and other low complexity FSD algorithms.

  • Efficient Top-k Document Retrieval for Long Queries Using Term-Document Binary Matrix – Pursuit of Enhanced Informational Search on the Web –

    Etsuro FUJITA  Keizo OYAMA  

     
    PAPER-Advanced Search

      Vol:
    E96-D No:5
      Page(s):
    1016-1028

    With the successful adoption of link analysis techniques such as PageRank and web spam filtering, current web search engines support “navigational search” well. However, due to the use of a simple conjunctive Boolean filter in addition to the inappropriateness of user queries, such an engine does not necessarily support “informational search” well. Informational search would be better handled by a web search engine using an information retrieval model combined with enhancement techniques such as query expansion and relevance feedback. Moreover, the realization of such an engine requires a method to process the model efficiently. In this paper we propose a novel extension of an existing top-k query processing technique to improve search efficiency. We add to it a technique utilizing a simple data structure called a “term-document binary matrix,” resulting in more efficient evaluation of top-k queries even when the queries have been expanded. We show, on the basis of experimental evaluation using the TREC GOV2 data set and expanded versions of the evaluation queries attached to this data set, that the proposed method can speed up evaluation considerably compared with existing techniques, especially as the number of query terms grows.

  • Homogeneous Superpixels from Markov Random Walks

    Frank PERBET  Bjorn STENGER  Atsuto MAKI  

     
    PAPER-Segmentation

      Vol:
    E95-D No:7
      Page(s):
    1740-1748

    This paper presents a novel algorithm to generate homogeneous superpixels from Markov random walks. We exploit Markov clustering (MCL) as the methodology, a generic graph clustering method based on stochastic flow circulation. In particular, we introduce a graph pruning strategy called compact pruning in order to capture intrinsic local image structure. The resulting superpixels are homogeneous, i.e. uniform in size and compact in shape. The original MCL algorithm does not scale well to a graph of an image due to the square computation of the Markov matrix which is necessary for circulating the flow. The proposed pruning scheme has the advantages of faster computation, smaller memory footprint, and straightforward parallel implementation. Through comparisons with other recent techniques, we show that the proposed algorithm achieves state-of-the-art performance.

  • Reduction of Computational Cost of POC-Based Methods for Displacement Estimation in Old Film Sequences

    Xiaoyong ZHANG  Masahide ABE  Masayuki KAWAMATA  

     
    PAPER-Digital Signal Processing

      Vol:
    E94-A No:7
      Page(s):
    1497-1504

    This paper proposes a new method that reduces the computational cost of the phase-only correlation (POC)-based methods for displacement estimation in old film sequences. Conventional POC-based methods calculate all the points of the POC and only use the highest peak of the POC and its neighboring points to estimate the displacement with subpixel accuracy. Our proposed method reduces the computational cost by calculating the POC in a small region, instead of all the points of the POC. The proposed method combines a displacement pre-estimation with a modified inverse discrete Fourier transform (IDFT). The displacement pre-estimation uses the 1-D POCs of frame projections to pre-estimate the displacement with pixel accuracy and chooses a small region in the POC including the desired points for displacement estimation. The modified IDFT is then used to calculate the points in this small region for displacement estimation. Experimental results show that use of the proposed method can effectively reduce the computational cost of the POC-based methods without compromising the accuracy.
