Keyword Search Result

[Keyword] APPR(525hit)

41-60hit(525hit)

  • Exploiting Configurable Approximations for Tolerating Aging-induced Timing Violations

    Toshinori SATO  Tomoaki UKEZONO  

     
    PAPER

      Vol:
    E103-A No:9
      Page(s):
    1028-1036

    This paper proposes a technique that increases the lifetime of large scale integration (LSI) devices. As semiconductor technology improves at miniaturizing transistors, aging effects due to bias temperature instability (BTI) seriously affect their lifetime. BTI increases the threshold voltage of transistors, thereby also increasing the delay of an electronic device and resulting in failures due to timing violations. To compensate for aging-induced timing violations, we exploit configurable approximate computing. Assuming that target circuits have exact and approximate modes, they are configured for the approximate mode if an aging sensor predicts violations. Experiments using an example circuit revealed an increase in its lifetime to >10 years.

  • Approximate FPGA-Based Multipliers Using Carry-Inexact Elementary Modules

    Yi GUO  Heming SUN  Ping LEI  Shinji KIMURA  

     
    PAPER

      Vol:
    E103-A No:9
      Page(s):
    1054-1062

    Approximate multiplier design is an effective technique to improve hardware performance at the cost of accuracy loss. Current approximate multipliers are mostly ASIC-based and are dedicated to one particular application. In contrast, FPGA has been an attractive choice for many applications because of its high performance, reconfigurability, and fast development cycle. This paper presents a novel methodology for designing approximate multipliers by employing FPGA-based fabrics (primarily look-up tables and carry chains). The area and latency are significantly reduced by applying approximation to carry results and cutting the carry propagation path in the multiplier. Moreover, we explore higher-order multipliers in the architectural space by using our proposed small-size approximate multipliers as elementary modules. For different accuracy-hardware requirements, eight configurations for an approximate 8×8 multiplier are discussed. In terms of mean relative error distance (MRED), the error of the proposed 8×8 multiplier is as low as 1.06%. Compared with the exact multiplier, our proposed design can reduce area by 43.66% and power by 24.24%. The critical path latency reduction is up to 29.50%. The proposed multiplier design has a better accuracy-hardware tradeoff than other designs with comparable accuracy. Moreover, image sharpening is used to assess the efficiency of the approximate multipliers in an application.
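
    The MRED figure quoted above can be illustrated with a minimal sketch. It assumes the common definition of MRED (the mean of |approximate - exact| / exact over all operand pairs with a nonzero exact product). The function `truncated_multiply` is a hypothetical stand-in that simply zeroes low operand bits; it is not the carry-inexact design from the paper.

```python
from itertools import product


def truncated_multiply(a: int, b: int, drop_bits: int = 2) -> int:
    """Hypothetical approximate multiplier: zero the low bits of each operand.

    This is only a stand-in for illustration, not the paper's design.
    """
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)


def mred(approx_mul, bits: int = 8) -> float:
    """Mean relative error distance over all nonzero operand pairs."""
    total = 0.0
    count = 0
    # Start at 1 so that the exact product is never zero.
    for a, b in product(range(1, 1 << bits), repeat=2):
        exact = a * b
        total += abs(approx_mul(a, b) - exact) / exact
        count += 1
    return total / count


if __name__ == "__main__":
    print(f"MRED of the truncating stand-in: {mred(truncated_multiply):.4%}")
```

    Any candidate multiplier with the same two-integer signature can be passed to `mred` to compare designs on equal terms.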

  • Neural Networks Probability-Based PWL Sigmoid Function Approximation

    Vantruong NGUYEN  Jueping CAI  Linyu WEI  Jie CHU  

     
    LETTER-Biocybernetics, Neurocomputing

      Publicized:
    2020/06/11
      Vol:
    E103-D No:9
      Page(s):
    2023-2026

    In this letter, a piecewise linear (PWL) sigmoid function approximation based on the statistical distribution probability of the neurons' values in each layer is proposed to improve the network recognition accuracy using only addition circuits. The sigmoid function is first divided into three fixed regions, and then, according to the distribution probability of the neurons' values, the curve in each region is segmented into sub-regions to reduce the approximation error and improve the recognition accuracy. Experiments performed on Xilinx's FPGA-XC7A200T for MNIST and CIFAR-10 datasets show that the proposed method achieves 97.45% recognition accuracy in DNN, 98.42% in CNN on MNIST and 72.22% on CIFAR-10, up to 0.84%, 0.57% and 2.01% higher than other approximation methods that use only addition circuits.
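
    As a rough illustration of PWL sigmoid approximation in general, the sketch below interpolates linearly between breakpoints and saturates outside them. The breakpoint sets are assumptions chosen for the example; the letter instead places sub-regions according to the distribution of neuron values, which is not reproduced here.

```python
import bisect
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def make_pwl_sigmoid(breakpoints):
    """Build a piecewise linear approximation through the given breakpoints."""
    xs = sorted(breakpoints)
    ys = [sigmoid(x) for x in xs]

    def pwl(x: float) -> float:
        # Saturate outside the covered range.
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        # Locate the segment [xs[i], xs[i + 1]) containing x and interpolate.
        i = bisect.bisect_right(xs, x) - 1
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return pwl


def max_abs_error(approx, lo: float = -10.0, hi: float = 10.0, steps: int = 2001) -> float:
    """Largest absolute deviation from the true sigmoid on a sample grid."""
    grid = (lo + (hi - lo) * k / (steps - 1) for k in range(steps))
    return max(abs(approx(x) - sigmoid(x)) for x in grid)


if __name__ == "__main__":
    coarse = make_pwl_sigmoid([-8, -4, 0, 4, 8])   # assumed breakpoints
    fine = make_pwl_sigmoid(range(-8, 9))          # assumed breakpoints
    print("coarse max error:", max_abs_error(coarse))
    print("fine   max error:", max_abs_error(fine))
```

    Placing more breakpoints where inputs actually fall is the kind of refinement the letter's probability-based segmentation targets.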

  • Reduced Complexity Successive-Cancellation Decoding of Polar Codes Based on Linear Approximation

    Yongli YAN  Xuanxuan ZHANG  Bin WU  

     
    LETTER-Information Theory

      Vol:
    E103-A No:8
      Page(s):
    995-999

    In this letter, the principle of the LLR-based successive-cancellation (SC) polar decoding algorithm is explored. In order to simplify the logarithm and exponential operations in the updating rules for polar codes, we further utilize a piece-wise linear algorithm to approximate the transcendental functions, where the piece-wise linear algorithm only consists of multiplication and addition operations. It is demonstrated that, with a properly chosen allowable maximum error δ for the success-failure algorithm, performance approaching that of the standard SC algorithm can be achieved. Besides, the complexity reduction is realized by calculating a linear function instead of a nonlinear function. Simulation results show that our proposed piece-wise SC decoder greatly reduces the complexity of SC-based decoders with no loss in error-correcting performance.
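
    For context, the sketch below shows the standard check-node LLR update used in SC decoding, which involves transcendental functions, alongside the well-known min-sum simplification. The min-sum rule is shown only as a familiar point of comparison; it is not the piece-wise linear scheme of the letter, and the function names are illustrative.

```python
import math


def f_exact(a: float, b: float) -> float:
    """Exact LLR update: 2 * atanh(tanh(a / 2) * tanh(b / 2)).

    For very large |a| and |b|, tanh rounds to +/-1.0 in floating point and
    atanh then raises ValueError, so this sketch is for moderate LLRs only.
    """
    return 2.0 * math.atanh(math.tanh(a / 2.0) * math.tanh(b / 2.0))


def f_min_sum(a: float, b: float) -> float:
    """Min-sum simplification: sign(a) * sign(b) * min(|a|, |b|)."""
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return sign * min(abs(a), abs(b))


if __name__ == "__main__":
    for a, b in [(1.0, -2.0), (0.5, 0.5), (3.0, 4.0)]:
        print(a, b, f_exact(a, b), f_min_sum(a, b))
```

    The gap between the two outputs is the approximation error that a finer approximation, such as a piece-wise linear fit with a bounded maximum error, aims to shrink.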

  • A Flexible Overloaded MIMO Receiver with Adaptive Selection of Extended Rotation Matrices

    Satoshi DENNO  Akihiro KITAMOTO  Ryosuke SAWADA  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2020/01/17
      Vol:
    E103-B No:7
      Page(s):
    787-795

    This paper proposes a novel flexible receiver with virtual channels for overloaded multiple-input multiple-output (MIMO) channels. The receiver applies extended rotation matrices proposed in the paper for the flexibility. In addition, adaptive selection of the extended rotation matrices is proposed for further performance improvement. We propose two techniques to reduce the computational complexity of the adaptive selection. As a result, the proposed receiver gives us an option to reduce the complexity with a slight decrease in the transmission performance by changing receiver configuration parameters. A computer simulation reveals that the adaptive selection attains a gain of about 3 dB at a BER of $10^{-3}$.

  • Efficient Hybrid DOA Estimation for Massive Uniform Rectangular Array

    Wei JHANG  Shiaw-Wu CHEN  Ann-Chen CHANG  

     
    LETTER-Digital Signal Processing

      Vol:
    E103-A No:6
      Page(s):
    836-840

    In this letter, an efficient hybrid direction-of-arrival (DOA) estimation scheme is devised for massive uniform rectangular arrays. In this scheme, a DOA estimator based on a two-dimensional (2D) discrete Fourier transform is first applied to acquire coarse initial DOA estimates from a single data snapshot. Then, the fine DOA is accurately estimated by using an iterative search estimator within a very small region. Meanwhile, a Nyström-based method is utilized to correctly compute the required noise-subspace projection matrix, avoiding the direct computation of the full-dimensional sample correlation matrix and its eigenvalue decomposition. Therefore, the proposed scheme not only estimates the DOA but also saves computational cost, especially in massive antenna array scenarios. Simulation results are included to demonstrate the effectiveness of the proposed hybrid estimation scheme.

  • An Approximation Algorithm for the 2-Dispersion Problem

    Kazuyuki AMANO  Shin-ichi NAKANO  

     
    PAPER

      Publicized:
    2019/11/28
      Vol:
    E103-D No:3
      Page(s):
    506-508

    Let P be a set of points on the plane, and d(p, q) be the distance between a pair of points p, q in P. For a point p ∈ P and a subset S ⊂ P with |S| ≥ 3, the 2-dispersion cost of p with respect to S, denoted by $\mathrm{cost}_2(p, S)$, is the sum of (1) the distance from p to the nearest point in $S \setminus \{p\}$ and (2) the distance from p to the second nearest point in $S \setminus \{p\}$. The 2-dispersion cost $\mathrm{cost}_2(S)$ of S ⊂ P with |S| ≥ 3 is $\min_{p \in S} \mathrm{cost}_2(p, S)$. Given a set P of n points and an integer k, we wish to compute a k-point subset S of P with maximum $\mathrm{cost}_2(S)$. In this paper we give a simple $1/(4\sqrt{3})$-approximation algorithm for the problem.
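
    The definitions above translate directly into code. The sketch below computes the 2-dispersion cost of a point and of a set, assuming distinct points given as coordinate tuples and Euclidean distance. The exhaustive search `best_subset_bruteforce` is a hypothetical helper for tiny instances only; it is not the approximation algorithm of the paper.

```python
from itertools import combinations
from math import dist


def cost2_point(p, S):
    """Distance from p to its nearest plus second nearest point in S minus {p}."""
    others = sorted(dist(p, q) for q in S if q != p)
    return others[0] + others[1]


def cost2(S):
    """2-dispersion cost of S (requires |S| >= 3): minimum over p in S."""
    return min(cost2_point(p, S) for p in S)


def best_subset_bruteforce(P, k):
    """Exhaustive search over all k-point subsets; exponential, tiny inputs only."""
    return max(combinations(P, k), key=cost2)


if __name__ == "__main__":
    # Corners of a 2x2 square plus its center (example data, not from the paper).
    P = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)]
    S = best_subset_bruteforce(P, 3)
    print(S, cost2(S))
```

    On this example the center point is never chosen, since it sits close to every corner and would lower the minimum cost.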

  • An Accuracy-Configurable Adder for Low-Power Applications

    Tongxin YANG  Toshinori SATO  Tomoaki UKEZONO  

     
    PAPER

      Vol:
    E103-C No:3
      Page(s):
    68-76

    Addition is a key fundamental function for many error-tolerant applications. Approximate addition is considered to be an efficient technique for trading off energy against performance and accuracy. This paper proposes a carry-maskable adder whose accuracy can be configured at runtime. The proposed scheme can dynamically select the length of the carry propagation to satisfy the quality requirements flexibly. Compared with a conventional ripple carry adder and a conventional carry look-ahead adder, the proposed 16-bit adder reduced the power consumption by 54.1% and 57.5%, respectively, and the critical path delay by 72.5% and 54.2%, respectively. In addition, results from an image processing application indicate that the quality of processed images can be controlled by the proposed adder. Good scalability of the proposed adder is demonstrated from the evaluation results using a 32-bit length.

  • Software Process Capability Self-Assessment Support System Based on Task and Work Product Characteristics: A Case Study of ISO/IEC 29110 Standard

    Apinporn METHAWACHANANONT  Marut BURANARACH  Pakaimart AMSURIYA  Sompol CHAIMONGKHON  Kamthorn KRAIRAKSA  Thepchai SUPNITHI  

     
    PAPER-Software Engineering

      Publicized:
    2019/10/17
      Vol:
    E103-D No:2
      Page(s):
    339-347

    A key driver of software business growth in developing countries is the survival of software small and medium-sized enterprises (SMEs). Quality of products is a critical factor that can indicate the future of the business by building customer confidence. Software development agencies need to be aware of meeting international standards in the software development process. In practice, consultants and assessors are usually employed as the primary solution, which can strain the budget of small businesses. Self-assessment tools for the software development process can potentially reduce the time and cost of formal assessment for software SMEs. However, the existing support methods and tools are largely insufficient in terms of process coverage and semi-automated evaluation. This paper proposes to apply a knowledge-based approach in the development of a self-assessment and gap analysis support system for the ISO/IEC 29110 standard. The approach has the advantage that insights from domain experts and the standard are captured in the knowledge base in the form of decision tables that can be flexibly managed. Our knowledge base is unique in that task lists and work products defined in the standard are broken down into task and work product characteristics, respectively. Their relation provides the links between Task List and Work Product, which help users better understand and carry out self-assessment. A prototype support system was developed to assess the level of software development capability of the agencies based on the ISO/IEC 29110 standard. A preliminary evaluation study showed that the system can improve the performance of users who are inexperienced in applying the ISO/IEC 29110 standard in terms of task coverage and the user's time and effort compared to the traditional self-assessment method.

  • π/N Expansion to the LP01 Mode of a Step-Index N-Sided Regular-Polygonal-Core Fiber

    Naofumi KITSUNEZAKI  

     
    PAPER

      Vol:
    E103-C No:1
      Page(s):
    3-10

    Herein, we analytically derive the effective index and field distribution of the LP01 mode of a step-index N-sided regular-polygonal-core fiber. To do this, we utilize the lowest-order non-anomalous approximation of the π/N expansion. These properties are also calculated numerically and the results are compared with the approximations.

  • An Adaptive Fusion Successive Cancellation List Decoder for Polar Codes with Cyclic Redundancy Check

    Yuhuan WANG  Hang YIN  Zhanxin YANG  Yansong LV  Lu SI  Xinle YU  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2019/07/08
      Vol:
    E103-B No:1
      Page(s):
    43-51

    In this paper, we propose an adaptive fusion successive cancellation list decoder (ADF-SCL) for polar codes with single cyclic redundancy check. The proposed ADF-SCL decoder reasonably avoids unnecessary calculations by selecting the successive cancellation (SC) decoder or the adaptive successive cancellation list (AD-SCL) decoder depending on a log-likelihood ratio (LLR) threshold in the decoding process. Simulation results show that compared to the AD-SCL decoder, the proposed decoder can achieve significant reduction of the average complexity in the low signal-to-noise ratio (SNR) region without degradation of the performance. When Lmax=32 and Eb/N0=0.5dB, the average complexity of the proposed decoder is 14.23% lower than that of the AD-SCL decoder.

  • Design of Low-Cost Approximate Multipliers Based on Probability-Driven Inexact Compressors

    Yi GUO  Heming SUN  Ping LEI  Shinji KIMURA  

     
    PAPER

      Vol:
    E102-A No:12
      Page(s):
    1781-1791

    Approximate computing has emerged as a promising approach for error-tolerant applications to improve hardware performance at the cost of some loss of accuracy. Multiplication is a key arithmetic operation in these applications. In this paper, we propose a low-cost approximate multiplier design by employing new probability-driven inexact compressors. This compressor design is introduced to reduce the height of the partial product matrix to two rows, based on the probability distribution of the sum result of partial products. To compensate for the accuracy loss of the multiplier, a grouped error recovery scheme is proposed that achieves different levels of accuracy. In terms of mean relative error distance (MRED), the accuracy losses of the proposed multipliers are from 1.07% to 7.86%. Compared with the Wallace multiplier using a 40nm process, the most accurate variant of the proposed multipliers can reduce power by 59.75% and area by 42.47%. The critical path delay reduction is larger than 12.78%. The proposed multiplier design has a better accuracy-performance trade-off than other designs with comparable accuracy. In addition, the efficiency of the proposed multiplier design is assessed in an image processing application.

  • Hardware-Aware Sum-Product Decoding in the Decision Domain Open Access

    Mizuki YAMADA  Keigo TAKEUCHI  Kiyoyuki KOIKE  

     
    PAPER-Coding Theory

      Vol:
    E102-A No:12
      Page(s):
    1980-1987

    We propose hardware-aware sum-product (SP) decoding for low-density parity-check codes. To simplify an implementation using a fixed-point number representation, we transform SP decoding in the logarithm domain to that in the decision domain. A polynomial approximation is proposed to implement an update rule of the proposed SP decoding efficiently. Numerical simulations show that the approximate SP decoding achieves almost the same performance as the exact SP decoding when an appropriate degree in the polynomial approximation is used, that it improves the convergence properties of SP and normalized min-sum decoding in the high signal-to-noise ratio regime, and that it is robust against quantization errors.

  • A Lightweight Method to Evaluate Effect of Approximate Memory with Hardware Performance Monitors

    Soramichi AKIYAMA  

     
    PAPER-Computer System

      Publicized:
    2019/09/02
      Vol:
    E102-D No:12
      Page(s):
    2354-2365

    The latency and the energy consumption of DRAM are serious concerns because (1) the latency has not improved much for decades and (2) recent machines have a huge main memory capacity. Device-level studies reduce them by shortening the wait time of DRAM internal operations so that they finish fast and consume less energy. Applying these techniques aggressively to achieve approximate memory is a promising direction to further reduce the overhead, given that many data-center applications today are to some extent robust to bit-flips. To advance research on approximate memory, it is necessary to evaluate its effect on applications so that both researchers and potential users of approximate memory can investigate how it affects realistic applications. However, hardware simulators are too slow to run workloads repeatedly with different parameters. To this end, we propose a lightweight method to evaluate the effect of approximate memory. The idea is to count the number of DRAM internal operations that occur on the approximate data of applications and calculate the probability of bit-flips based on it, instead of using heavy-weight simulators. The evaluation shows that our system is 3 orders of magnitude faster than cycle-accurate simulators, and we also give case studies evaluating the effect of approximate memory on some realistic applications.

  • Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition

    Ryo MASUMURA  Taichi ASAMI  Takanobu OBA  Sumitaka SAKAUCHI  Akinori ITO  

     
    PAPER-Speech and Hearing

      Publicized:
    2019/09/25
      Vol:
    E102-D No:12
      Page(s):
    2557-2567

    This paper demonstrates latent word recurrent neural network language models (LW-RNN-LMs) for enhancing automatic speech recognition (ASR). LW-RNN-LMs are constructed so as to combine the advantages of both recurrent neural network language models (RNN-LMs) and latent word language models (LW-LMs). The RNN-LMs can capture long-range context information and offer strong performance, and the LW-LMs are robust for out-of-domain tasks based on the latent word space modeling. However, the RNN-LMs cannot explicitly capture hidden relationships behind observed words since a concept of a latent variable space is not present. In addition, the LW-LMs cannot take into account long-range relationships between latent words. Our idea is to combine RNN-LM and LW-LM so as to compensate for their individual disadvantages. The LW-RNN-LMs can support both latent variable space modeling, as LW-LMs do, and long-range relationship modeling, as RNN-LMs do, at the same time. From the viewpoint of RNN-LMs, LW-RNN-LM can be considered as a soft class RNN-LM with a vast latent variable space. In contrast, from the viewpoint of LW-LMs, LW-RNN-LM can be considered as an LW-LM that uses the RNN structure for latent variable modeling instead of an n-gram structure. This paper also details a parameter inference method and two kinds of implementation methods, an n-gram approximation and a Viterbi approximation, for introducing the LW-LM to ASR. Our experiments show the effectiveness of LW-RNN-LMs on a perplexity evaluation for the Penn Treebank corpus and an ASR evaluation for Japanese spontaneous speech tasks.

  • Output Feedback Consensus of Lower Triangular Nonlinear Systems under a Switching Topology

    Sungryul LEE  

     
    LETTER-Digital Signal Processing

      Vol:
    E102-A No:11
      Page(s):
    1550-1555

    The output feedback consensus problem of lower triangular nonlinear systems under a directed network with a switching topology is studied. It is assumed that every possible network topology contains a directed spanning tree. The proposed design method utilizes a high gain approach to compensate for triangular nonlinearity and to remove the restriction imposed on dwell time. Compared to the previous research, it is shown that the proposed control method can achieve the output feedback consensus of lower triangular nonlinear systems even in the presence of an arbitrarily small average dwell time. A numerical example is given to illustrate the effectiveness of the proposed design method.

  • An Approximation Algorithm for the Maximum Induced Matching Problem on C5-Free Regular Graphs

    Yuichi ASAHIRO  Guohui LIN  Zhilong LIU  Eiji MIYANO  

     
    PAPER-Optimization

      Vol:
    E102-A No:9
      Page(s):
    1142-1149

    In this paper, we investigate the maximum induced matching problem (MaxIM) on C5-free d-regular graphs. The previously known best approximation ratio for MaxIM on C5-free d-regular graphs is $\left(\frac{3d}{4}-\frac{1}{8}+\frac{3}{16d-8}\right)$. In this paper, we design a $\left(\frac{2d}{3}+\frac{1}{3}\right)$-approximation algorithm, whose approximation ratio is strictly smaller/better than the previous one when d≥6.

  • Programmable Analog Calculation Unit with Two-Stage Architecture: A Solution of Efficient Vector-Computation Open Access

    Renyuan ZHANG  Takashi NAKADA  Yasuhiko NAKASHIMA  

     
    PAPER

      Vol:
    E102-A No:7
      Page(s):
    878-885

    A programmable analog calculation unit (ACU) is designed for vector computations in continuous-time with compact circuit scale. From our early study, it is feasible to retrieve arbitrary two-variable functions through support vector regression (SVR) in silicon. In this work, the dimensions of regression are expanded for vector computations. However, the hardware cost and computing error greatly increase along with the expansion of dimensions. A two-stage architecture is proposed to organize multiple ACUs for high dimensional regression. The computation of high dimensional vectors is separated into several computations of lower dimensional vectors, which are implemented by the free combination of several ACUs with lower cost. In this manner, the circuit scale and regression error are reduced. The proof-of-concept ACU is designed and simulated in a 0.18μm technology. From the circuit simulation results, all the demonstrated calculations with nine operands are executed without iterative clock cycles by 4960 transistors. The calculation error of example functions is below 8.7%.

  • Analysis of Regular Sampling of Chaotic Waveform and Chaotic Sampling of Regular Waveform for Random Number Generation

    Kaya DEMiR  Salih ERGÜN  

     
    PAPER

      Vol:
    E102-A No:6
      Page(s):
    767-774

    This paper presents an analysis of random number generators based on continuous-time chaotic oscillators. Two different methods for random number generation have been studied: 1) regular sampling of a chaotic waveform, and 2) chaotic sampling of a regular waveform. Kernel density estimation is used to analytically describe the distribution of chaotic state variables and the probability density function corresponding to the output bit stream. Random bit sequences are generated using analytical equations and results from numerical simulations. Applying the concepts of autocorrelation and approximate entropy, the randomness quality of the generated bit sequences is assessed to analyze relationships between the frequencies of the regular and chaotic waveforms used in both random number generation methods. It is demonstrated that, in both methods, there exist certain ratios between the frequencies of the regular and chaotic signals at which the randomness of the output bit stream changes abruptly. Furthermore, both random number generation methods have been compared in terms of their immunity to interference from external signals. Analysis shows that chaotic sampling of a regular waveform is more robust against interference than regular sampling of a chaotic waveform.

  • Distributed Compressed Sensing via Generalized Approximate Message Passing for Jointly Sparse Signals

    Jingjing SI  Yinbo CHENG  Kai LIU  

     
    LETTER-Image

      Vol:
    E102-A No:4
      Page(s):
    702-707

    Generalized approximate message passing (GAMP) is introduced into distributed compressed sensing (DCS) to reconstruct jointly sparse signals under the mixed support-set model. A GAMP algorithm with known support-set is presented and the matching pursuit generalized approximate message passing (MPGAMP) algorithm is modified. Then, a new joint recovery algorithm, referred to as the joint MPGAMP algorithm, is proposed. It sets up the jointly shared support-set of the signal ensemble with the support exploration ability of matching pursuit and recovers the signals' amplitudes on the support-set with the good reconstruction performance of GAMP. Numerical investigation shows that the joint MPGAMP algorithm provides performance improvements in DCS reconstruction compared to joint orthogonal matching pursuit, joint look ahead orthogonal matching pursuit and regular MPGAMP.
