The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] (42807hit)

5741-5760hit(42807hit)

  • An Online Thermal-Pattern-Aware Task Scheduler in 3D Multi-Core Processors

    Chien-Hui LIAO  Charles H.-P. WEN  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2901-2910

    Hotspots occur frequently in 3D multi-core processors (3D-MCPs), and they may adversely impact both the reliability and lifetime of a system. We present a new thermally constrained task scheduler based on a thermal-pattern-aware voltage assignment (TPAVA) to reduce hotspots in and optimize the performance of 3D-MCPs. By analyzing temperature profiles of different voltage assignments, TPAVA pre-emptively assigns different initial operating-voltage levels to cores for reducing temperature increase in 3D-MCPs. The proposed task scheduler consists of an on-line allocation strategy and a new voltage-scaling strategy. In particular, the proposed on-line allocation strategy uses the temperature-variation rates of the cores and takes into two important thermal behaviors of 3D-MCPs that can effectively minimize occurrences of hotspots in both thermally homogeneous and heterogeneous 3D-MCPs. Furthermore, a new vertical-grouping voltage scaling (VGVS) strategy that considers thermal correlation in 3D-MCPs is used to handle thermal emergencies. Experimental results indicate that, when compared to a previous online thermally constrained task scheduler, the proposed task scheduler can reduce hotspot occurrences by approximately 66% (71%) and improve throughput by approximately 8% (2%) in thermally homogeneous (heterogeneous) 3D-MCPs. These results indicate that the proposed task scheduler is an effective technique for suppressing hotspot occurrences and optimizing throughput for 3D-MCPs subject to thermal constraints.

  • A New Method of Translational Compensation for Spatial Precession Targets with Rotational Symmetry

    Rong CHEN  Cunqian FENG  Sisan HE  Yi RAO  

     
    LETTER-Analog Signal Processing

      Vol:
    E100-A No:12
      Page(s):
    3061-3066

    The extraction of micro-motion parameters is deeply influenced by the precision of estimation on translational motion parameters. Based on the periodicity of micro-motion, the quadratic polynomial fitting is carried out among range delays to align envelope. The micro-motion component of phase information is eliminated by conjugate multiplication after which the translational motion parameters are estimated. Then the translational motion is precisely compensated through the third order polynomial fitting. Results of simulation demonstrate that the algorithm put forward here can realize the precise compensation for translational motion parameters even under an environment with low signal noise ratio (SNR).

  • Sponsored Search Auction Considering Combinational Bids with Externalities

    Ryusuke IMADA  Katsuhide FUJITA  

     
    PAPER-Information Network

      Pubricized:
    2017/09/15
      Vol:
    E100-D No:12
      Page(s):
    2906-2914

    Sponsored search is a mechanism that shows the appropriate advertisements (ads) according to search queries. The orders and payments of ads are determined by the auction. However, the externalities which give effects to CTR and haven't been considered in some existing works because the mechanism with externalities has high computational cost. In addition, some algorithms which can calculate the approximated solution considering the externalities within the polynomial-time are proposed, however, it assumed that one bidder can propose only a single ad. In this paper, we propose the approximation allocation algorithm that one bidder can offer many ads considering externalities. The proposed algorithm employs the concept of the combinatorial auction in order to consider the combinational bids. In addition, the proposed algorithm can find the approximated allocation by the dynamic programming. Moreover, we prove the computational complexity and the monotonicity of the proposed mechanism, and demonstrate computational costs and efficiency ratios by changing the number of ads, slots and maximum bids. The experimental results show that the proposed algorithm can calculate 0.7-approximation solution even though the full search can't find solutions in the limited times.

  • A Cheating-Detectable (k, L, n) Ramp Secret Sharing Scheme

    Wataru NAKAMURA  Hirosuke YAMAMOTO  Terence CHAN  

     
    PAPER-Cryptography and Information Security

      Vol:
    E100-A No:12
      Page(s):
    2709-2719

    In this paper, we treat (k, L, n) ramp secret sharing schemes (SSSs) that can detect impersonation attacks and/or substitution attacks. First, we derive lower bounds on the sizes of the shares and random number used in encoding for given correlation levels, which are measured by the mutual information of shares. We also derive lower bounds on the success probabilities of attacks for given correlation levels and given sizes of shares. Next we propose a strong (k, L, n) ramp SSS against substitution attacks. As far as we know, the proposed scheme is the first strong (k, L, n) ramp SSSs that can detect substitution attacks of at most k-1 shares. Our scheme can be applied to a secret SL uniformly distributed over GF(pm)L, where p is a prime number with p≥L+2. We show that for a certain type of correlation levels, the proposed scheme can achieve the lower bounds on the sizes of the shares and random number, and can reduce the success probability of substitution attacks within nearly L times the lower bound when the number of forged shares is less than k. We also evaluate the success probability of impersonation attack for our schemes. In addition, we give some examples of insecure ramp SSSs to clarify why each component of our scheme is essential to realize the required security.

  • Spatially “Mt. Fuji” Coupled LDPC Codes

    Yuta NAKAHARA  Shota SAITO  Toshiyasu MATSUSHIMA  

     
    PAPER-Coding Theory and Techniques

      Vol:
    E100-A No:12
      Page(s):
    2594-2606

    A new type of spatially coupled low density parity check (SCLDPC) code is proposed. This code has two benefits. (1) This code requires less number of iterations to correct the erasures occurring through the binary erasure channel in the waterfall region than that of the usual SCLDPC code. (2) This code has lower error floor than that of the usual SCLDPC code. Proposed code is constructed as a coupled chain of the underlying LDPC codes whose code lengths exponentially increase as the position where the codes exist is close to the middle of the chain. We call our code spatially “Mt. Fuji” coupled LDPC (SFCLDPC) code because the shape of the graph representing the code lengths of underlying LDPC codes at each position looks like Mt. Fuji. By this structure, when the proposed SFCLDPC code and the original SCLDPC code are constructed with the same code rate and the same code length, L (the number of the underlying LDPC codes) of the proposed SFCLDPC code becomes smaller and M (the code lengths of the underlying LDPC codes) of the proposed SFCLDPC code becomes larger than those of the SCLDPC code. These properties of L and M enables the above reduction of the number of iterations and the bit error rate in the error floor region, which are confirmed by the density evolution and computer simulations.

  • Search for High-Rate Punctured Convolutional Codes through Transformed Identical Codes

    Sen MORIYA  Kana KIKUCHI  Hiroshi SASANO  

     
    PAPER-Coding Theory and Techniques

      Vol:
    E100-A No:12
      Page(s):
    2607-2614

    In this study, we consider techniques to search for high-rate punctured convolutional code (PCC) encoders by rearranging row vectors of identical-encoder generator matrices. One well-known method to obtain a good PCC encoder is to perform an exhaustive search of all candidates. However, this approach is time-intensive. An exhaustive search with a rate RG=1/2 original encoder requires a relatively short time, whereas that with an RG=1/3 or lower original encoder takes significantly longer. The encoders with lower-rate original encoders are expected to create better PCC encoders. Thus, this paper proposes a method that uses exhaustive search results with rate RG=1/2 original encoders, and rearranges row vectors of identical-encoder generator matrices to create PCCs with a lower rate original code. Further, we provide PCC encoders obtained by searches that utilize our method.

  • Interleaved Sequences of Geometric Sequences Binarized with Legendre Symbol of Two Types

    Kazuyoshi TSUCHIYA  Yasuyuki NOGAMI  Satoshi UEHARA  

     
    PAPER-Sequences

      Vol:
    E100-A No:12
      Page(s):
    2720-2727

    A pseudorandom number generator is widely used in cryptography. A cryptographic pseudorandom number generator is required to generate pseudorandom numbers which have good statistical properties as well as unpredictability. An m-sequence is a linear feedback shift register sequence with maximal period over a finite field. M-sequences have good statistical properties, however we must nonlinearize m-sequences for cryptographic purposes. A geometric sequence is a binary sequence given by applying a nonlinear feedforward function to an m-sequence. Nogami, Tada and Uehara proposed a geometric sequence whose nonlinear feedforward function is given by the Legendre symbol. They showed the geometric sequences have good properties for the period, periodic autocorrelation and linear complexity. However, the geometric sequences do not have the balance property. In this paper, we introduce geometric sequences of two types and show some properties of interleaved sequences of the geometric sequences of two types. These interleaved sequences have the balance property and double the period of the geometric sequences by the interleaved structure. Moreover, we show correlation properties and linear complexity of the interleaved sequences. A key of our observation is that the second type geometric sequence is the complement of the left shift of the first type geometric sequence by half-period positions.

  • A Necessary and Sufficient Condition of Supply and Threshold Voltages in CMOS Circuits for Minimum Energy Point Operation

    Jun SHIOMI  Tohru ISHIHARA  Hidetoshi ONODERA  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2764-2775

    Scaling supply voltage (VDD) and threshold voltage (Vth) dynamically has a strong impact on energy efficiency of CMOS LSI circuits. Techniques for optimizing VDD and Vth simultaneously under dynamic workloads are thus widely investigated over the past 15 years. In this paper, we refer to the optimum pair of VDD and Vth, which minimizes the energy consumption of a circuit under a specific performance constraint, as a minimum energy point (MEP). Based on the simple transregional models of a CMOS circuit, this paper derives a simple necessary and sufficient condition for the MEP operation. The simple condition helps find the MEP of CMOS circuits. Measurement results using standard-cell based memories (SCMs) fabricated in a 65-nm process technology also validate the condition derived in this paper.

  • A Study on the Error Performance of Soft-Decision Decodings for Binary Linear Codes on a 4-Level Quantization over an AWGN Channel

    Takuya KUSAKA  

     
    PAPER-Coding Theory

      Vol:
    E100-A No:12
      Page(s):
    3016-3022

    In this paper, a study on the design and implementation of uniform 4-level quantizers for soft-decision decodings for binary linear codes is shown. Simulation results on quantized Viterbi decoding with a 4-level quantizer for the (64,42,8) Reed-Muller code show that the optimum stepsize, which is derived from the cutoff rate, gives an almost optimum error performance. In addition, the simulation results show that the case where the number of optimum codewords is larger than the one for a received sequence causes non-negligible degradation on error performance at high SN ratios of Eb/N0.

  • Framework and VLSI Architecture of Measurement-Domain Intra Prediction for Compressively Sensed Visual Contents

    Jianbin ZHOU  Dajiang ZHOU  Li GUO  Takeshi YOSHIMURA  Satoshi GOTO  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2869-2877

    This paper presents a measurement-domain intra prediction coding framework that is compatible with compressive sensing (CS)-based image sensors. In this framework, we propose a low-complexity intra prediction algorithm that can be directly applied to measurements captured by the image sensor. We proposed a structural random 0/1 measurement matrix, embedding the block boundary information that can be extracted from the measurements for intra prediction. Furthermore, a low-cost Very Large Scale Integration (VLSI) architecture is implemented for the proposed framework, by substituting the matrix multiplication with shared adders and shifters. The experimental results show that our proposed framework can compress the measurements and increase coding efficiency, with 34.9% BD-rate reduction compared to the direct output of CS-based sensors. The VLSI architecture of the proposed framework is 9.1 Kin area, and achieves the 83% reduction in size of memory bandwidth and storage for the line buffer. This could significantly reduce both the energy consumption and bandwidth in communication of wireless camera systems, which are expected to be massively deployed in the Internet of Things (IoT) era.

  • A Bitwidth-Aware High-Level Synthesis Algorithm Using Operation Chainings for Tiled-DR Architectures

    Kotaro TERADA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2911-2924

    As application hardware designs and implementations in a short term are required, high-level synthesis is more and more essential EDA technique nowadays. In deep-submicron era, interconnection delays are not negligible even in high-level synthesis thus distributed-register and -controller architectures (DR architectures) have been proposed in order to cope with this problem. It is also profitable to take data-bitwidth into account in high-level synthesis. In this paper, we propose a bitwidth-aware high-level synthesis algorithm using operation chainings targeting Tiled-DR architectures. Our proposed algorithm optimizes bitwidths of functional units and utilizes the vacant tiles by adding some extra functional units to realize effective operation chainings to generate high performance circuits without increasing the total area. Experimental results show that our proposed algorithm reduces the overall latency by up to 47% compared to the conventional approach without area overheads by eliminating unnecessary bitwidths and adding efficient extra FUs for Tiled-DR architectures.

  • New Constructions of Multiple Binary ZCZ Sequence Sets with Inter-Set Zero Cross-Correlation Zone

    Tao LIU  Chengqian XU  Yubo LI  Xiaoyu CHEN  

     
    PAPER-Information Theory

      Vol:
    E100-A No:12
      Page(s):
    3007-3015

    In this correspondence, two types of multiple binary zero correlation zone (ZCZ) sequence sets with inter-set zero cross-correlation zone (ZCCZ) are constructed. Based on orthogonal matrices with order N×N, multiple binary ZCZ sequence sets with inter-set even and odd ZCCZ lengthes are constructed, each set is an optimal ZCZ sequence set with parameters (2N2, N, N+1)-ZCZ, among these ZCZ sequence sets, sequences possess ideal cross-correlation property within a zone of length 2Z or 2Z+1. These resultant multiple ZCZ sequence sets can be used in quasi-synchronous CDMA systems to remove the inter-cell interference (ICI).

  • An Efficient Weighted Bit-Flipping Algorithm for Decoding LDPC Codes Based on Log-Likelihood Ratio of Bit Error Probability

    Tso-Cho CHEN  Erl-Huei LU  Chia-Jung LI  Kuo-Tsang HUANG  

     
    PAPER-Fundamental Theories for Communications

      Pubricized:
    2017/05/29
      Vol:
    E100-B No:12
      Page(s):
    2095-2103

    In this paper, a weighted multiple bit flipping (WMBF) algorithman for decoding low-density parity-check (LDPC) codes is proposed first. Then the improved WMBF algorithm which we call the efficient weighted bit-flipping (EWBF) algorithm is developed. The EWBF algorithm can dynamically choose either multiple bit-flipping or single bit-flipping in each iteration according to the log-likelihood ratio of the error probability of the received bits. Thus, it can efficiently increase the convergence speed of decoding and prevent the decoding process from falling into loop traps. Compared with the parallel weighted bit-flipping (PWBF) algorithm, the EWBF algorithm can achieve significantly lower computational complexity without performance degradation when the Euclidean geometry (EG)-LDPC codes are decoded. Furthermore, the flipping criterion does not require any parameter adjustment.

  • Relay Assignment for Energy Harvesting Cooperative Communication Systems with Long-Term CSI and Energy Side Information

    Feng KE  Yue ZHANG  Yuanyi DENG  Yuehua DING  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2017/06/19
      Vol:
    E100-B No:12
      Page(s):
    2139-2146

    A relay assignment scheme is proposed in this paper that minimizes the mean delay of transmission for energy harvesting (EH) cooperative communication systems, whose source node and relay nodes are all equipped with energy harvesters. We jointly consider the long-term channel side information (CSI) and energy side information (ESI) of all nodes, and formulate the delay minimization problem as an integer programming problem. To solve this problem, a refined cyclic coordinate method (RCCM) is proposed that considers the cases of fixed-packet-length (FPL) and variable-packet-length (VPL) transmission. Simulation results show that the proposed scheme achieves performance close to that of the real-time relay selection (RRS) scheme with instantaneous CSI and ESI, which gives upper bound of the performance. Moreover, compared with the simple relay rotation (SRR) scheme where each relay has equal service time, the performance of the proposed scheme is significantly improved.

  • Robustness Evaluation of Restricted Boltzmann Machine against Memory and Logic Error

    Yasushi FUKUDA  Zule XU  Takayuki KAWAHARA  

     
    BRIEF PAPER-Integrated Electronics

      Vol:
    E100-C No:12
      Page(s):
    1118-1121

    In an IoT system, neural networks have the potential to perform advanced information processing in various environments. To clarify this, the robustness of a restricted Boltzmann machine (RBM) used for deep neural networks, such as a deep belief network (DBN), was studied in this paper. Even if memory or logic errors occurred in the circuit operating in the RBM while pre-training the DBN, they did not affect the identification rate of the DBN, showing the robustness of the RBM. In addition, robustness against soft errors was evaluated. The soft errors had almost no influence on the RBM unless they were as large as 1012 times or more in the 50-nm CMOS process.

  • Energy-Performance Modeling of Speculative Checkpointing for Exascale Systems

    Muhammad ALFIAN AMRIZAL  Atsuya UNO  Yukinori SATO  Hiroyuki TAKIZAWA  Hiroaki KOBAYASHI  

     
    PAPER-High performance computing

      Pubricized:
    2017/07/14
      Vol:
    E100-D No:12
      Page(s):
    2749-2760

    Coordinated checkpointing is a widely-used checkpoint/restart protocol for fault-tolerance in large-scale HPC systems. However, this protocol will involve massive amounts of I/O concentration, resulting in considerably high checkpoint overhead and high energy consumption. This paper focuses on speculative checkpointing, a CPR mechanism that allows for temporal distribution of checkpointings to avoid I/O concentration. We propose execution time and energy models for speculative checkpointing, and investigate energy-performance characteristics when speculative checkpointing is adopted in exascale systems. Using these models, we study the benefit of speculative checkpointing over coordinated checkpointing under various realistic scenarios for exascale HPC systems. We show that, compared to coordinated checkpointing, speculative checkpointing can achieve up to a 11% energy reduction at the cost of a relatively-small increase in the execution time. In addition, a significant energy-performance trade-off is expected when the system scale exceeds 1.2 million nodes.

  • Power-Effective File Layout Based on Large Scale Data-Intensive Application in Virtualized Environment

    Shunsuke YAGAI  Masato OGUCHI  Miyuki NAKANO  Saneyasu YAMAGUCHI  

     
    PAPER-Database system

      Pubricized:
    2017/07/14
      Vol:
    E100-D No:12
      Page(s):
    2761-2770

    In data centers, large numbers of computers are run simultaneously. These computers consume an enormous amount of energy. Several challenges related to this issue have been published. An energy-efficient storage management method that cooperates with applications was one effective approach. In this method, data and storage devices are managed using application support and the power consumption of storage devices is significantly decreased. However, existing studies do not take the virtualized environment into account. Recently, many data-intensive applications have been run in a virtualized environment, such as the cloud computing environment. In this paper, we focus on a virtualized environment wherein multiple virtual machines run on a physical computer and a data intensive application runs on each virtual machine. We discuss a method for reducing storage device power consumption using application support. First, we propose two storage management methods using application information. One method optimizes the inter-HDD file layout. This method removes frequently-accessed files from a certain HDD and switches the HDD to power-off mode. To balance loads and reduce seek distances, this method separates a heavily accessed file and consolidates files in a virtual machine with low access frequency. The other method optimizes the intra-HDD file layout, in addition to performing inter-HDD optimization. This method places frequently accessed files near each other. Second, we present our experimental results and demonstrate that the proposed methods can create sufficiently long HDD access intervals that power-off mode can be used, and thereby, reduce the power consumption of storage devices.

  • Protocol-Aware Packet Scheduling Algorithm for Multi-Protocol Processing in Multi-Core MPL Architecture

    Runzi ZHANG  Jinlin WANG  Yiqiang SHENG  Xiao CHEN  Xiaozhou YE  

     
    PAPER-Architecture

      Pubricized:
    2017/07/14
      Vol:
    E100-D No:12
      Page(s):
    2837-2846

    Cache affinity has been proved to have great impact on the performance of packet processing applications on multi-core platforms. Flow-based packet scheduling can make the best of data cache affinity with flow associated data and context structures. However, little work on packet scheduling algorithms has been conducted when it comes to instruction cache (I-Cache) affinity in modified pipelining (MPL) architecture for multi-core systems. In this paper, we propose a protocol-aware packet scheduling (PAPS) algorithm aiming at maximizing I-Cache affinity at protocol dependent stages in MPL architecture for multi-protocol processing (MPP) scenario. The characteristics of applications in MPL are analyzed and a mapping model is introduced to illustrate the procedure of MPP. Besides, a stage processing time model for MPL is presented based on the analysis of multi-core cache hierarchy. PAPS is a kind of flow-based packet scheduling algorithm and it schedules flows in consideration of both application-level protocol of flows and load balancing. Experiments demonstrate that PAPS outperforms the Round Robin algorithm and the HRW-based (HRW) algorithm for MPP applications. In particular, PAPS can eliminate all I-Cache misses at protocol dependent stage and reduce the average CPU cycle consumption per packet by more than 10% in comparison with HRW.

  • Neuromorphic Hardware Accelerated Lane Detection System

    Shinwook KIM  Tae-Gyu CHANG  

     
    LETTER-Architecture

      Pubricized:
    2017/07/14
      Vol:
    E100-D No:12
      Page(s):
    2871-2875

    This letter describes the development and implementation of the lane detection system accelerated by the neuromorphic hardware. Because the neuromorphic hardware has inherently parallel nature and has constant output latency regardless the size of the knowledge, the proposed lane detection system can recognize various types of lanes quickly and efficiently. Experimental results using the road images obtained in the actual driving environments showed that white and yellow lanes could be detected with an accuracy of more than 94 percent.

  • A Layout-Oriented Routing Method for Low-Latency HPC Networks

    Ryuta KAWANO  Hiroshi NAKAHARA  Ikki FUJIWARA  Hiroki MATSUTANI  Michihiro KOIBUCHI  Hideharu AMANO  

     
    PAPER-Interconnection networks

      Pubricized:
    2017/07/14
      Vol:
    E100-D No:12
      Page(s):
    2796-2807

    End-to-end network latency has become an important issue for parallel application on large-scale high performance computing (HPC) systems. It has been reported that randomly-connected inter-switch networks can lower the end-to-end network latency. This latency reduction is established in exchange for a large amount of routing information. That is, minimal routing on irregular networks is achieved by using routing tables for all destinations in the networks. In this work, a novel distributed routing method called LOREN (Layout-Oriented Routing with Entries for Neighbors) to achieve low-latency with a small routing table is proposed for irregular networks whose link length is limited. The routing tables contain both physically and topologically nearby neighbor nodes to ensure livelock-freedom and a small number of hops between nodes. Experimental results show that LOREN reduces the average latencies by 5.8% and improves the network throughput by up to 62% compared with a conventional compact routing method. Moreover, the number of required routing table entries is reduced by up to 91%, which improves scalability and flexibility for implementation.

5741-5760hit(42807hit)