The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] coprocessor(5hit)

1-5hit
  • Design and Implement of High Performance Crypto Coprocessor

    Shice NI  Yong DOU  Kai CHEN  Jie ZHOU  

     
    LETTER-Algorithms and Data Structures

      Vol:
    E97-A No:4
      Page(s):
    989-990

    This letter proposes a novel high performance crypto coprocessor that relies on Reconfigurable Cryptographic Blocks. We implement the prototype of the coprocessor on Xilinx FPGA chip. And the pipelining technique is adopted to realize data paralleling. The results show that the coprocessor, running at 189MHz, outperforms the software-based SSL protocol.

  • A K-Means-Based Multi-Prototype High-Speed Learning System with FPGA-Implemented Coprocessor for 1-NN Searching

    Fengwei AN  Tetsushi KOIDE  Hans Jürgen MATTAUSCH  

     
    PAPER-Biocybernetics, Neurocomputing

      Vol:
    E95-D No:9
      Page(s):
    2327-2338

    In this paper, we propose a hardware solution for overcoming the problem of high computational demands in a nearest neighbor (NN) based multi-prototype learning system. The multiple prototypes are obtained by a high-speed K-means clustering algorithm utilizing a concept of software-hardware cooperation that takes advantage of the flexibility of the software and the efficiency of the hardware. The one nearest neighbor (1-NN) classifier is used to recognize an object by searching for the nearest Euclidean distance among the prototypes. The major deficiency in conventional implementations for both K-means and 1-NN is the high computational demand of the nearest neighbor searching. This deficiency is resolved by an FPGA-implemented coprocessor that is a VLSI circuit for searching the nearest Euclidean distance. The coprocessor requires 12.9% logic elements and 58% block memory bits of an Altera Stratix III E110 FPGA device. The hardware communicates with the software by a PCI Express (4) local-bus-compatible interface. We benchmark our learning system against the popular case of handwritten digit recognition in which abundant previous works for comparison are available. In the case of the MNIST database, we could attain the most efficient accuracy rate of 97.91% with 930 prototypes, the learning speed of 1.310-4 s/sample and the classification speed of 3.9410-8 s/character.

  • Montgomery Multiplication with Twice the Bit-Length of Multipliers

    Masayuki YOSHINO  Katsuyuki OKEYA  Camille VUILLAUME  

     
    PAPER-Implementation

      Vol:
    E91-A No:1
      Page(s):
    203-210

    We present a novel approach for computing 2n-bit Montgomery multiplications with n-bit hardware Montgomery multipliers. Smartcards are usually equipped with such hardware Montgomery multipliers; however, due to progresses in factoring algorithms, the recommended bit length of public-key schemes such as RSA is steadily increasing, making the hardware quickly obsolete. Thanks to our double-size technique, one can re-use the existing hardware while keeping pace with the latest security requirements. Unlike the other double-size techniques which rely on classical n-bit modular multipliers, our idea is tailored to take advantage of n-bit Montgomery multipliers. Thus, our technique increases the perenniality of existing products without compromises in terms of security.

  • Synthesis of Application-Specific Coprocessor for Core-Based ASIC Design

    Dae-Hyun LEE  In-Cheol PARK  Chong-Min KYUNG  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E84-A No:2
      Page(s):
    604-613

    This paper presents an efficient approach for a hardware/software partitioning problem: synthesis of an application-specific coprocessor which accelerates an embedded software running on a main processor. Given a set of data flow graphs (DFGs), most of previous hardware/software partitioning approaches have focused on mapping DFGs to hardware or software. Their common weaknesses are that 1) they ignore various implementation alternatives in realizing DFGs as hardware based on the assumption that only a single hardware implementation exists for a DFG, and that 2) they don't consider the effect of merging on hardware area when synthesizing a coprocessor by merging DFGs. To deal with the first issue, we formulate both the mapping of DFGs to hardware or software and the selection of the appropriate hardware implementation for each DFG as a single integer programming problem, and then apply an iterative algorithm based on the Kernighan and Lin's heuristic to solve the problem. To reduce the CPU time, we have devised data structures that quickly calculate costs of hardware implementations. To deal with the second issue, our method links DFGs with dummy nodes to produce a single large DFG, and then synthesizes a target coprocessor by globally scheduling the DFG and allocating its datapath. Experimental results demonstrate that our approach outperforms the previous approach based on genetic algorithm (GA) in both the coprocessor area and the CPU time.

  • An Algorithm for the K-Selection Problem Using Special-Purpose Sorters

    Heung-Shik KIM  Jong-Soo PARK  Myunghwan KIM  

     
    PAPER-Algorithm and Computational Complexity

      Vol:
    E75-D No:5
      Page(s):
    704-708

    An algorithm is presented for selecting the k-th smallest element of a totally ordered (but not sorted) set of n elements, 1kn, in the case that a special-purpose sorter is used as a coprocessor. When the pipeline merge sorter is used as the special-purpose sorter, we analyze the comparison complexity of the algorithm for the given capacity of the sorter. The comparison complexity of the algorithm is 1.4167no(n), provided that the capacity of the sorter is 256 elements. The comparison complexity of the algorithm decreases as the capacity of the sorter increases.