
Keyword Search Result

[Keyword] PA(8249hit)

501-520hit(8249hit)

  • SENTEI: Filter-Wise Pruning with Distillation towards Efficient Sparse Convolutional Neural Network Accelerators

    Masayuki SHIMODA  Youki SADA  Ryosuke KURAMOCHI  Shimpei SATO  Hiroki NAKAHARA  

     
    PAPER-Computer System

      Publicized:
    2020/08/03
      Vol:
    E103-D No:12
      Page(s):
    2463-2470

    In the realization of convolutional neural networks (CNNs) in resource-constrained embedded hardware, the memory footprint of weights is one of the primary problems. Pruning techniques are often used to reduce the number of weights. However, the distribution of nonzero weights is highly skewed, which makes it more difficult to utilize the underlying parallelism. To address this problem, we present SENTEI*, filter-wise pruning with distillation, to realize a hardware-aware network architecture with comparable accuracy. The filter-wise pruning eliminates weights such that each filter has the same number of nonzero weights, and retraining with distillation retains the accuracy. Further, we develop a zero-weight skipping inter-layer pipelined accelerator on an FPGA. The equalization enables inter-filter parallelism, where a processing block for a layer executes filters concurrently with a straightforward architecture. Our evaluation on semantic-segmentation tasks indicates that the resulting mIoU decreased by only 0.4 points. Additionally, the speedup and power efficiency of our FPGA implementation were 33.2× and 87.9× higher than those of the mobile GPU. Therefore, our technique realizes a hardware-aware network with comparable accuracy.
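    The equal-nonzero-count constraint described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the surviving weights are chosen, so the magnitude criterion below, the function name `prune_filterwise`, and the `keep` parameter are all assumptions made for illustration; distillation-based retraining is not shown.

```python
def prune_filterwise(filters, keep):
    """Zero all but the `keep` largest-magnitude weights in each filter.

    `filters` is a list of flat weight lists, one per filter. Selecting by
    magnitude is an assumption for illustration, not taken from the abstract.
    """
    pruned = []
    for weights in filters:
        # Indices ordered by descending |w|; ties are broken by lower index.
        order = sorted(range(len(weights)), key=lambda i: (-abs(weights[i]), i))
        kept = set(order[:keep])
        pruned.append([w if i in kept else 0.0 for i, w in enumerate(weights)])
    return pruned


filters = [
    [0.9, -0.1, 0.05, -0.7],
    [0.2, 0.3, -0.4, 0.01],
]
pruned = prune_filterwise(filters, keep=2)
# Every filter now carries the same number of nonzero weights.
nonzeros = [sum(1 for w in f if w != 0.0) for f in pruned]
```

    Because every filter ends up with the same nonzero count, each filter would need the same number of multiply-accumulate steps, which is the property the abstract credits with enabling inter-filter parallelism.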

  • Simultaneous Realization of Decision, Planning and Control for Lane-Changing Behavior Using Nonlinear Model Predictive Control

    Hiroyuki OKUDA  Nobuto SUGIE  Tatsuya SUZUKI  Kentaro HARAGUCHI  Zibo KANG  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2020/08/31
      Vol:
    E103-D No:12
      Page(s):
    2632-2642

    Path planning and motion control are fundamental components for realizing safe and reliable autonomous driving. The division of roles between these two components, however, is somewhat obscure because of the strong mathematical interaction between them. This often results in redundant computation in the implementation. One attractive idea for overcoming this redundancy is simultaneous path planning and motion control (SPPMC) based on a model predictive control framework. SPPMC finds the optimal control input considering not only the vehicle dynamics but also various constraints, such as physical limitations and safety constraints, to achieve the goal of a given behavior. When driving in a real traffic environment, decision making also interacts strongly with planning and control. This is even more pronounced when several tasks are switched according to context to realize higher-level tasks. This paper presents a basic idea for integrating decision making, path planning, and motion control that can be executed in real time. In particular, lane-changing behavior together with the decision of its initiation is selected as the target task. The proposed idea is based on nonlinear model predictive control and appropriate switching of its cost function and constraints. As a result, the decision of the initiation, planning, and control of the lane-changing behavior are achieved by solving a single optimization problem under several constraints such as safety. The validity of the proposed method is tested using a vehicle simulator.

  • An Efficient Method for Training Deep Learning Networks Distributed

    Chenxu WANG  Yutong LU  Zhiguang CHEN  Junnan LI  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2020/09/07
      Vol:
    E103-D No:12
      Page(s):
    2444-2456

    Training deep learning (DL) networks is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL. High performance computing clusters, especially supercomputers, are equipped with a large amount of computing resources, storage resources, and efficient interconnection ability, which can train DL networks better and faster. In this paper, we propose a method to train DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which can make full use of hardware resources and greatly increase computational efficiency. Second, we present a two-level parameter synchronization scheme which can reduce communication overhead by transmitting parameters of the first layer models in shared memory. Third, we optimize the parallel I/O by making each reader read data as contiguously as possible to avoid the high overhead of discontinuous data reading. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has tremendous performance advantages relative to unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) can increase computing efficiency by about 20 times.

  • A Social Collaborative Filtering Method to Alleviate Data Sparsity Based on Graph Convolutional Networks

    Haitao XIE  Qingtao FAN  Qian XIAO  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2020/08/28
      Vol:
    E103-D No:12
      Page(s):
    2611-2619

    Recommender systems (RS) continue to draw attention from academia, and collaborative filtering (CF) is the most successful technique for building RS. To overcome the inherent limitation of CF known as data sparsity, various solutions have been proposed that incorporate additional social information, such as trust networks, into recommendation processes. However, existing methods struggle with multi-source data integration (i.e., fusion of social information and ratings), which is the basis for similarity calculation of user preferences. To this end, we propose a social collaborative filtering method based on novel trust metrics. Firstly, we use Graph Convolutional Networks (GCNs) to learn the associations between social information and user ratings while considering the underlying social network structures. Secondly, we measure the direct-trust values between neighbors by representing multi-source data as user ratings on popular items, and then calculate the indirect-trust values based on trust propagation. Thirdly, we employ all trust values to create a social regularization in user-item rating matrix factorization in order to avoid overfitting. The experiments on real datasets show that our approach outperforms the other state-of-the-art methods in using multi-source data to alleviate data sparsity.

  • Multiple Subspace Model and Image-Inpainting Algorithm Based on Multiple Matrix Rank Minimization

    Tomohiro TAKAHASHI  Katsumi KONISHI  Kazunori URUMA  Toshihiro FURUKAWA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2020/08/31
      Vol:
    E103-D No:12
      Page(s):
    2682-2692

    This paper proposes an image inpainting algorithm based on multiple linear models and matrix rank minimization. Several inpainting algorithms have been previously proposed based on the assumption that an image can be modeled using autoregressive (AR) models. However, these algorithms perform poorly when applied to natural photographs because they assume that an image is modeled by a position-invariant linear model with a fixed model order. In order to improve inpainting quality, this work introduces a multiple AR model and proposes an image inpainting algorithm based on multiple matrix rank minimization with sparse regularization. In doing so, a practical algorithm is provided based on the iterative partial matrix shrinkage algorithm, with numerical examples showing the effectiveness of the proposed algorithm.

  • DVNR: A Distributed Method for Virtual Network Recovery

    Guangyuan LIU  Daokun CHEN  

     
    LETTER-Information Network

      Publicized:
    2020/08/26
      Vol:
    E103-D No:12
      Page(s):
    2713-2716

    How to restore a virtual network after a substrate network failure (e.g., a link cut) is one of the key challenges of network virtualization. Traditional virtual network recovery (VNR) methods are mostly based on the idea of centralized control. However, if multiple virtual networks fail at the same time, their recovery processes are usually queued according to a specific priority, which may increase the average waiting time of users. In this letter, we study a distributed virtual network recovery (DVNR) method to improve virtual network recovery efficiency. We establish an exclusive virtual machine (VM) for each virtual network and process the recovery requests of multiple virtual networks in parallel. Simulation results show that the proposed DVNR method achieves a recovery success rate close to that of the centralized VNR method while yielding ~70% less average recovery time.

  • Inpainting via Sparse Representation Based on a Phaseless Quality Metric

    Takahiro OGAWA  Keisuke MAEDA  Miki HASEYAMA  

     
    PAPER-Image

      Vol:
    E103-A No:12
      Page(s):
    1541-1551

    An inpainting method via sparse representation based on a new phaseless quality metric is presented in this paper. Since power spectra, which are phaseless features, of local regions within images represent their texture characteristics more successfully than their pixel values do, a new quality metric based on these phaseless features is derived for image representation. Specifically, the proposed method enables sparse representation of target signals, i.e., target patches, including missing intensities by monitoring errors converged by phase retrieval as the novel phaseless quality metric. This is the main contribution of our study. In this approach, the phase retrieval algorithm used in our method has the following two important roles: (1) derivation of the new quality metric that can be derived even for images including missing intensities and (2) conversion of phaseless features, i.e., power spectra, to pixel values, i.e., intensities. Therefore, the above novel approach solves the existing problem of not being able to use better features or better quality metrics for inpainting. Results of experiments showed that the proposed method using sparse representation based on the new phaseless quality metric outperforms previously reported methods that directly use pixel values for inpainting.

  • A Rabin-Karp Implementation for Handling Multiple Pattern-Matching on the GPU

    Lucas Saad Nogueira NUNES  Jacir Luiz BORDIM  Yasuaki ITO  Koji NAKANO  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2020/09/24
      Vol:
    E103-D No:12
      Page(s):
    2412-2420

    The volume of digital information is growing at an extremely fast pace which, in turn, exacerbates the need for efficient mechanisms to find the presence of a pattern in an input text or a set of input strings. Combining the processing power of a Graphics Processing Unit (GPU) with matching algorithms seems a natural way to speed up the string-matching process. This work proposes a Parallel Rabin-Karp implementation (PRK) that encompasses a fast parallel prefix-sums algorithm to maximize parallelization and accelerate the matching verification. Given an input text T of length n and p patterns of length m, the proposed implementation finds all occurrences of p in T in O(m+q+n/τ+nm/q) time, where q is a sufficiently large prime number and τ is the available number of threads. Sequential and parallel versions of the PRK have been implemented. Experiments have been executed on p≥1 patterns of length m=10, 20, 30 characters, which are compared against a text string of length n=2^27. The results show that the parallel implementation of the PRK algorithm on an NVIDIA V100 GPU provides a speedup surpassing 372 times when compared to the sequential implementation and a speedup of 12.59 times against an OpenMP implementation running on a multi-core server with 128 threads. Compared to another prominent GPU implementation, the PRK implementation attained a speedup surpassing 37 times.
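    For orientation, the following is a minimal sequential sketch of textbook Rabin-Karp matching for several patterns of equal length m. It is not the PRK implementation from the paper and contains no GPU or prefix-sums parallelism; the function name `rabin_karp_multi` and the default values of `base` and the prime `q` are assumptions chosen for illustration.

```python
def rabin_karp_multi(text, patterns, base=256, q=1_000_003):
    """Return sorted (position, pattern) matches for equal-length patterns."""
    if not patterns:
        return []
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("all patterns must have the same length m")
    n = len(text)
    if m == 0 or m > n:
        return []

    def poly_hash(s):
        # Polynomial hash of s modulo the prime q.
        h = 0
        for ch in s:
            h = (h * base + ord(ch)) % q
        return h

    # Map each pattern hash to the patterns that share it.
    table = {}
    for p in set(patterns):
        table.setdefault(poly_hash(p), []).append(p)

    high = pow(base, m - 1, q)  # weight of the window's leading character
    h = poly_hash(text[:m])
    matches = []
    for i in range(n - m + 1):
        for p in table.get(h, ()):
            # Verify explicitly, since distinct strings can share a hash.
            if text[i:i + m] == p:
                matches.append((i, p))
        if i + m < n:
            # Roll the window: drop text[i], append text[i + m].
            h = ((h - ord(text[i]) * high) * base + ord(text[i + m])) % q
    return sorted(matches)


hits = rabin_karp_multi("abracadabra", ["abr", "cad"])
```

    A larger prime q makes spurious hash matches rarer, which is consistent with the nm/q term in the running time quoted in the abstract; the explicit comparison keeps the result correct even when collisions occur.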

  • Circuit Modeling of Wireless Power Transfer System in the Vicinity of Perfectly Conducting Scatterer

    Nozomi HAGA  Jerdvisanop CHAKAROTHAI  Keisuke KONNO  

     
    PAPER-Antennas and Propagation

      Publicized:
    2020/06/22
      Vol:
    E103-B No:12
      Page(s):
    1411-1420

    The impedance expansion method (IEM) is a circuit-modeling technique for electrically small devices based on the method of moments. In a previous study, a circuit model of a wireless power transfer (WPT) system was developed by utilizing the IEM and eigenmode analysis. However, this technique assumes that there are no neighboring scatterers (e.g., bodies of vehicles) near the coupling elements (e.g., feeding loops and resonant coils). This study extends the theory of the IEM to obtain the circuit model of a WPT system in the vicinity of a perfectly conducting scatterer (PCS). The numerical results show that the proposed method can be applied to the frequencies at which the dimension of the PCS is less than approximately a quarter wavelength. In addition, the resulting circuit model is found to be valid in the operating frequency band.

  • Pilot Decontamination in Massive MIMO Uplink via Approximate Message-Passing

    Takumi FUJITSUKA  Keigo TAKEUCHI  

     
    PAPER-Communication Theory

      Publicized:
    2020/07/01
      Vol:
    E103-A No:12
      Page(s):
    1356-1366

    Pilot contamination is addressed in massive multiple-input multiple-output (MIMO) uplink. The main ideas of pilot decontamination are twofold. One is to design the transmission timing of pilot sequences such that the pilot transmission periods in different cells do not fully overlap with each other, as considered in previous works. The other is joint channel and data estimation via approximate message-passing (AMP) for bilinear inference. Conventional AMP converges poorly in bilinear inference problems, so adaptive damping has been required to help it converge. The main contribution of this paper is a modification of the update rules in conventional AMP to improve the convergence property of AMP. Numerical simulations show that the proposed AMP outperforms conventional AMP in terms of estimation performance when adaptive damping is not used. Furthermore, it achieves better performance than state-of-the-art methods based on subspace estimation when the power difference between cells is small.

  • A Multiobjective Optimization Dispatch Method of Wind-Thermal Power System

    Xiaoxuan GUO  Renxi GONG  Haibo BAO  Zhenkun LU  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2020/09/18
      Vol:
    E103-D No:12
      Page(s):
    2549-2558

    It is well known that the large-scale integration of wind power into the power system affects the economic and environmental objectives of power generation scheduling, and also brings new challenges to traditional deterministic power generation scheduling because of the intermittency and randomness of wind power. To deal with these problems, a multiobjective optimization dispatch method for wind-thermal power systems is proposed. The method can be described as follows. First, a multiobjective interval power generation scheduling model of the wind-thermal power system is established by describing the wind speed at the wind farm as an interval variable, with the minimization of the fuel cost and the pollution gas emission cost of the thermal power units chosen as the objective functions. Then, the optimistic and pessimistic Pareto frontiers of the multiobjective interval power generation scheduling are obtained by solving the model with an improved normal boundary intersection (NBI) method that combines NBI with a bilevel optimization method. Finally, the optimistic and pessimistic compromise solutions are determined by a distance evaluation method. The calculation results for the 16-unit 174-bus system show that the proposed method yields uniform optimistic and pessimistic Pareto frontiers and that the impact of wind speed interval uncertainty on the economic and environmental indicators can be quantified. In addition, it is verified that the Pareto front in the actual scenario lies between the optimistic and pessimistic Pareto fronts, and the influence of different wind power access levels on the optimistic and pessimistic Pareto fronts is analyzed.

  • Formulation of a Test Pattern Measure That Counts Distinguished Fault-Pairs for Circuit Fault Diagnosis

    Tsutomu INAMOTO  Yoshinobu HIGAMI  

     
    PAPER

      Vol:
    E103-A No:12
      Page(s):
    1456-1463

    In this paper, we aim to develop technologies for circuit fault diagnosis and propose a formulation of a test pattern measure for circuit fault diagnosis. Given a faulty circuit, fault diagnosis deduces the locations of the faults that have occurred in the circuit. Fault diagnosis is executed in software before the failure analysis, in which engineers inspect physical defects, and helps to improve the manufacturing process that yielded the faulty circuits. The heart of fault diagnosis is to distinguish between candidate faults by using test patterns, which are applied to the circuit-under-diagnosis (CUD), and thus test patterns that can distinguish as many faults as possible need to be generated. This fact motivates us to consider a test pattern measure based on the number of fault-pairs that become distinguished by a test pattern. To the best of the authors' knowledge, that measure requires computational time of order O(N_F^2), where N_F denotes the number of candidate faults. Since N_F is generally large for real industrial circuits, the computational time of the measure is long even when a high-performance computer is used. The formulation proposed in this paper makes it possible to calculate the measure with a computational complexity of O(N_F log N_F), and thus the measure becomes useful for test pattern selection in fault diagnosis. In computational experiments, the effectiveness of the formulation is demonstrated through samples of the computational times of the measure calculated by the traditional and the proposed formulae and through thorough comparisons between several greedy heuristics based on the measure.
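    The gap between pairwise and grouped counting can be illustrated with a minimal sketch. It assumes that two candidate faults are distinguished by a test pattern exactly when their simulated output responses differ, and it counts such pairs by grouping equal responses. This generic counting argument is an illustration only and is not claimed to be the formulation proposed in the paper; the function names and the string encoding of responses are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_distinguished_pairs(responses):
    """Count fault pairs whose responses differ, by grouping equal responses.

    `responses[i]` is the (hashable) output response of candidate fault i
    under one test pattern. This encoding is an assumption for illustration.
    """
    n = len(responses)
    total_pairs = n * (n - 1) // 2
    # Pairs inside one group share a response and stay undistinguished.
    same_pairs = sum(c * (c - 1) // 2 for c in Counter(responses).values())
    return total_pairs - same_pairs


def count_distinguished_pairs_naive(responses):
    """Quadratic reference: compare every pair of faults directly."""
    return sum(1 for a, b in combinations(responses, 2) if a != b)


responses = ["01", "01", "10", "11", "10"]
fast = count_distinguished_pairs(responses)
slow = count_distinguished_pairs_naive(responses)
```

    Both functions return the same count, but the grouped version touches each fault once instead of comparing all pairs. A sort-based grouping would give the O(N log N) flavor of bound mentioned in the abstract, while the hash-based `Counter` used here is a convenience of the sketch.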

  • Analysis of Decoding Error Probability of Spatially “Mt. Fuji” Coupled LDPC Codes in Waterfall Region of the BEC

    Yuta NAKAHARA  Toshiyasu MATSUSHIMA  

     
    PAPER-Coding Theory

      Vol:
    E103-A No:12
      Page(s):
    1337-1346

    A spatially “Mt. Fuji” coupled (SFC) low-density parity-check (LDPC) ensemble is a modified version of the spatially coupled (SC) LDPC ensemble. Its decoding error probability in the waterfall region has been studied only experimentally. In this paper, we theoretically analyze it over the binary erasure channel by modifying the expected graph evolution (EGE) and covariance evolution (CE) that have been used to analyze the original SC-LDPC ensemble. In particular, we derive the initial condition modified for the SFC-LDPC ensemble. Then, unlike the SC-LDPC ensemble, the SFC-LDPC ensemble has a local minimum on the solution of the EGE and CE. Based on this property, we theoretically expect that the waterfall curve of the SFC-LDPC ensemble is steeper than that of the SC-LDPC ensemble. In addition, we confirm this by numerical experiments.

  • On the Calculation of the G-MGF for Two-Ray Fading Model with Its Applications in Communications

    Jinu GONG  Hoojin LEE  Rumin YANG  Joonhyuk KANG  

     
    LETTER-Communication Theory and Signals

      Publicized:
    2020/05/15
      Vol:
    E103-A No:11
      Page(s):
    1308-1311

    The two-ray (TR) fading model is one of the fading models used to represent a worst-case fading scenario. We derive exact closed-form expressions of the generalized moment generating function (G-MGF) for the TR fading model, which enable us to analyze numerous types of wireless communication applications. Among them, we derive several analytical results for the TR fading model, including the exact ergodic capacity along with asymptotic expressions and the energy detection performance. Finally, we provide numerical results to validate our evaluations.

  • Characterization of Multi-Layer Ceramic Chip Capacitors up to mm-Wave Frequencies for High-Speed Digital Signal Coupling Open Access

    Tsugumichi SHIBATA  Yoshito KATO  

     
    PAPER

      Publicized:
    2020/04/09
      Vol:
    E103-C No:11
      Page(s):
    575-581

    Capacitive coupling of line-coded and DC-balanced digital signals is often used to eliminate steady bias current flow between systems or components in various communication systems. A multi-layer ceramic chip capacitor is a promising choice for very broadband signal coupling because of the high-frequency characteristics expected from the downsizing of the chip in recent years. The lower limit of the coupling bandwidth is determined by the capacitance, while the higher limit is affected by the parasitic inductance associated with the chip structure. In this paper, we investigate the coupling characteristics up to millimeter-wave frequencies by measurement and simulation. A phenomenon has been found in which the current distribution in the chip structure changes at high frequencies and the coupling characteristics are improved compared to the prediction based on the conventional equivalent circuit model. A new equivalent circuit model of the chip capacitor that can express the effect of this improvement is proposed.

  • All-Optical PAM4 to 16QAM Modulation Format Conversion Using Nonlinear Optical Loop Mirror and 1:2 Coupler Open Access

    Yuta MATSUMOTO  Ken MISHINA  Daisuke HISANO  Akihiro MARUTA  

     
    PAPER

      Publicized:
    2020/05/14
      Vol:
    E103-B No:11
      Page(s):
    1272-1281

    In inter-data center networks where high transmission capacity and spectral efficiency are required, a 16QAM format is deployed. On the other hand, in intra-data center networks, a PAM4 format is deployed to meet the demand for a simple and low-cost transceiver configuration. For a seamless and effective connection of such heterogeneous networks without using optical-electrical-optical conversion, an all-optical modulation format conversion technique is required. In this paper, we propose an all-optical PAM4 to 16QAM modulation format conversion using a nonlinear optical loop mirror. The successful conversion operation from 2 × 26.6-Gbaud PAM4 signals to a 100-Gbps class 16QAM signal is verified by numerical simulation. Compared with an ideal 16QAM signal, the power penalty of the converted 16QAM signal can be kept within 0.51 dB.

  • Concatenated LDPC/Trellis Codes: Surpassing the Symmetric Information Rate of Channels with Synchronization Errors

    Ryo SHIBATA  Gou HOSOYA  Hiroyuki YASHIMA  

     
    PAPER-Coding Theory

      Publicized:
    2020/09/03
      Vol:
    E103-A No:11
      Page(s):
    1283-1291

    We propose a coding/decoding strategy that surpasses the symmetric information rate of a binary insertion/deletion (ID) channel and approaches the Markov capacity of the channel. The proposed codes comprise inner trellis codes and outer irregular low-density parity-check (LDPC) codes. The trellis codes are designed to mimic the transition probabilities of a Markov input process that achieves a high information rate, whereas the LDPC codes are designed to maximize an iterative decoding threshold in the superchannel (concatenation of the ID channels and trellis codes).

  • Fast Converging ADMM Penalized Decoding Method Based on Improved Penalty Function for LDPC Codes

    Biao WANG  

     
    LETTER-Coding Theory

      Publicized:
    2020/05/08
      Vol:
    E103-A No:11
      Page(s):
    1304-1307

    For low-density parity-check (LDPC) codes, the penalized decoding method based on the alternating direction method of multipliers (ADMM) can improve the decoding performance at low signal-to-noise ratios and also has low decoding complexity. There are three effective methods that could increase the ADMM penalized decoding speed, which are reducing the number of Euclidean projections in ADMM penalized decoding, designing an effective penalty function and selecting an appropriate layered scheduling strategy for message transmission. In order to further increase the ADMM penalized decoding speed, through reducing the number of Euclidean projections and using the vertical layered scheduling strategy, this paper designs a fast converging ADMM penalized decoding method based on the improved penalty function. Simulation results show that the proposed method not only improves the decoding performance but also reduces the average number of iterations and the average decoding time.

  • OFR-Net: Optical Flow Refinement with a Pyramid Dense Residual Network

    Liping ZHANG  Zongqing LU  Qingmin LIAO  

     
    LETTER-Computer Graphics

      Publicized:
    2020/04/30
      Vol:
    E103-A No:11
      Page(s):
    1312-1318

    This paper proposes a new and effective convolutional neural network model termed OFR-Net for optical flow refinement. The OFR-Net exploits the spatial correlation between images and optical flow fields. It adopts a pyramidal codec structure with residual connections, dense connections and skip connections within and between the encoder and decoder, to comprehensively fuse features of different scales, locally and globally. We also introduce a warp loss to restrict large displacement refinement errors. A series of experiments on the FlyingChairs and MPI Sintel datasets show that the OFR-Net can effectively refine the optical flow predicted by various methods.

  • Electro-Optic Modulator for Compensation of Third-Order Intermodulation Distortion Using Frequency Chirp Modulation

    Daichi FURUBAYASHI  Yuta KASHIWAGI  Takanori SATO  Tadashi KAWAI  Akira ENOKIHARA  Naokatsu YAMAMOTO  Tetsuya KAWANISHI  

     
    PAPER

      Publicized:
    2020/06/05
      Vol:
    E103-C No:11
      Page(s):
    653-660

    A new electro-optic modulator structure that compensates for the third-order intermodulation distortion (IMD3) is introduced. The modulator includes two Mach-Zehnder modulators (MZMs) operating with frequency chirp, and the two modulated outputs are combined with an adequate phase difference. We revealed by theoretical analysis and numerical calculations that the IMD3 components in the receiver output could be selectively suppressed when the two MZMs operate with chirp parameters of opposite signs. The spectral power of the IMD3 components in the proposed modulator was more than 15 dB lower than that in a normal Mach-Zehnder modulator at modulation indices between 0.15π and 0.25π rad. The IMD3 compensation properties of the proposed modulator were experimentally confirmed by using a dual parallel Mach-Zehnder modulator (DPMZM) structure. We designed and fabricated the modulator with a single-chip structure and single-input operation by integrating a 180° hybrid coupler on the modulator substrate. Modulation signals were applied to each modulation electrode by the 180° hybrid coupler to set the chirp parameters of the two MZMs of the DPMZM. The properties of the fabricated modulator were measured by using 10 GHz two-tone signals. The performance of the IMD3 compensation agreed with that in the calculation. It was confirmed that the IMD3 compensation could be realized even by the fabricated modulator structure.
