IEICE global.ieice.org Site

Keyword Search Result

[Keyword] fpga(330hit)

261-280hit(330hit)

Design of a Field-Programmable Digital Filter Chip Using Multiple-Valued Current-Mode Logic
Katsuhiko DEGAWA Takafumi AOKI Tatsuo HIGUCHI

PAPER

Vol: E86-A No:8, Page(s): 2001-2010
This paper presents a Field-Programmable Digital Filter (FPDF) IC that employs carry-propagation-free redundant arithmetic algorithms for faster computation and multiple-valued current-mode circuit technology for high-density low-power implementation. The original contribution of this paper is to evaluate, through actual chip fabrication, the potential impact of multiple-valued current-mode circuit technology on the reduction of hardware complexity required for DSP-oriented programmable ICs. The prototype FPDF fabrication with 0.6 µm CMOS technology demonstrates that the chip area and power consumption can be reduced to 41% and 71%, respectively, compared with the standard binary logic implementation.
Trade-Offs in Custom Circuit Designs for Subgraph Isomorphism Problems
Shuichi ICHIKAWA Hidemitsu SAITO Lerdtanaseangtham UDORN Kouji KONISHI

PAPER-VLSI Systems

Vol: E86-D No:7, Page(s): 1250-1257
Many application programs can be modeled as a subgraph isomorphism problem. However, this problem is generally NP-complete and difficult to compute. A custom computing circuit is a prospective solution for such problems. This paper examines various accelerator designs for subgraph isomorphism problems based on Ullmann's algorithm and Konishi's algorithm. These designs are quantitatively evaluated from two points of view: logic scale and execution time. Our study revealed that Ullmann's design is faster but larger in logic scale. Partially sequential versions of Ullmann's algorithm can be more cost-effective than Ullmann's original design. The hardware of Konishi's algorithm is smaller in logic scale, operates at a higher frequency, and is more cost-effective.
An Efficient Exact Router for Hyper-Universal Switching Box
Jiping LIU Hongbing FAN Dinah de PORTO Yu-Liang WU

PAPER

Vol: E86-A No:6, Page(s): 1430-1436
A Hyper-Universal Switch Box (HUSB) [1]-[3] can yield a feasible (detailed) routing solution for any given routing requirement of multi-pin nets or multi-point connections of surrounding terminals. This flexible routing structure has many potential applications in reconfigurable systems such as FPGAs and communication switching networks [4],[5]. Based on the same decomposition theory developed in the design scheme of such a powerful switching structure, a simple routing algorithm can also be developed. The router is exact in that it is guaranteed to find a routing solution, and it is efficient owing to its divide-and-conquer nature and a simple mapping scheme for pre-analyzed routing patterns saved in a database.
Accelerating the CKY Parsing Using FPGAs
Jacir L. BORDIM Yasuaki ITO Koji NAKANO

PAPER

Vol: E86-D No:5, Page(s): 803-810
The main contribution of this paper is to present an FPGA-based implementation of an instance-specific hardware which accelerates the CKY (Cocke-Kasami-Younger) parsing for context-free grammars. Given a context-free grammar G and a string x, the CKY parsing determines whether G derives x. We have developed a hardware generator that creates a Verilog HDL source to perform the CKY parsing for any given context-free grammar G. The generated source is embedded in an FPGA using the design software provided by the FPGA vendor. We evaluated the instance-specific hardware, generated by our hardware generator, using a timing analyzer and tested it using the Altera FPGAs. The generated hardware attains a speed-up factor of approximately 750 over the software CKY parsing algorithm.
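For reference, the membership test that CKY parsing performs can be sketched in software as follows. This is a minimal Python illustration, assuming the grammar is already in Chomsky normal form; the function name `cky_accepts`, the rule dictionaries, and the toy grammar for a^n b^n are hypothetical and are not taken from the paper or its hardware generator.

```python
from itertools import product


def cky_accepts(symbols, unary_rules, binary_rules, start="S"):
    """Return True if the CNF grammar derives the sequence `symbols`.

    unary_rules:  dict mapping a terminal t to the set of A with A -> t
    binary_rules: dict mapping a pair (B, C) to the set of A with A -> B C
    (Both dictionaries are an assumed encoding chosen for this sketch.)
    """
    n = len(symbols)
    if n == 0:
        return False
    # table[i][j] holds every nonterminal that derives symbols[i..j].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, sym in enumerate(symbols):
        table[i][i] = set(unary_rules.get(sym, ()))
    # Fill in substrings of increasing length.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):  # split point between [i..k] and [k+1..j]
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]


# Toy grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A T,  T -> S B,  A -> a,  B -> b
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

if __name__ == "__main__":
    for text in ("ab", "aabb", "abab", "aab"):
        print(text, cky_accepts(text, UNARY, BINARY))
```

The triple loop over substring length, start position, and split point is the part with abundant parallelism, since all cells of a given length are independent of one another.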
Data Dependent Circuit for Subgraph Isomorphism Problem
Shuichi ICHIKAWA Shoji YAMAMOTO

PAPER

Vol: E86-D No:5, Page(s): 796-802
Although the subgraph isomorphism problem has various important applications, it is generally NP-complete and difficult to solve. Though a custom computing circuit can reduce the execution time substantially, it requires considerable hardware resources and is inapplicable to large problems. This paper examines the feasibility of data dependent designs, which are particularly suitable to a Field Programmable Gate Array (FPGA). The data dependent approach drastically reduces hardware requirements. For graphs of 32 vertices, the average logic scale of data dependent circuits is only 5% of the corresponding data independent circuit. The data dependent circuit is estimated to be maximally 460 times faster than the software. Even if the circuit generation time is included, a data dependent circuit is estimated to be 2.04 times faster than software for graphs of 32 vertices. The performance gain would increase for larger graphs.
An Image Retrieval System Using FPGAs
Koji NAKANO Etsuko TAKAMICHI

PAPER

Vol: E86-D No:5, Page(s): 811-818
The main contribution of this paper is to present an image retrieval system using FPGAs. Given a template image T and a database of a number of images I1, I2, ..., our system lists all images that contain a subimage similar to T. More specifically, a hardware generator in our system creates the Verilog HDL source of a hardware that determines whether Ii has a similar subimage to T for any image Ii and a particular template T. The created Verilog HDL source is compiled and embedded in an FPGA using the design tool provided by the FPGA vendor. Since the hardware embedded in the FPGA is designed for a particular template T, it is an instance-specific hardware that allows us to achieve extreme acceleration. We evaluate the performance of our image matching hardware using a PCI-connected Xilinx FPGA and a timing analyzer. Since the generated hardware attains a speed-up factor of up to 3000 over the software solution, our approach is promising.
Time-Memory Trade-off Cryptanalysis for Limited Key on FPGA-Based Parallel Machine RASH
Katsumi TAKAHASHI Hiroai ASAMI Katsuto NAKAJIMA Masahiro IIDA

PAPER

Vol: E86-D No:5, Page(s): 781-788
We designed an FPGA-based parallel machine called "RASH" (Reconfigurable Architecture based on Scalable Hardware) for high-speed and flexible signal/data processing. Cryptanalysis is one of the killer applications for FPGA-based machines because it requires huge numbers of logical and/or simple arithmetic operations, for which FPGAs are well suited. One of the well-known activities in cryptanalysis is the DES (Data Encryption Standard) cracking contest conducted by RSA Data Security. TMTO (Time-Memory Trade-Off) cryptanalysis is a practical method to dramatically shorten the time for key search when the plaintext is given in advance. A string of ASCII characters is used as the key, much like a password. Each ASCII character is a 7-bit character and takes one of 96 values. The 56-bit DES key is given as a string of 8 ASCII characters. Although the DES key has 64 trillion (=2^56) possibilities, a key that is given as a string has only 6.4 trillion (=96^8) possibilities. Therefore, we improve TMTO cryptanalysis so that we search only the keys limited to ASCII characters and reduce the quantity of computation. In this paper, we demonstrate how TMTO cryptanalysis for limited keys is well suited to our FPGA-based RASH machine. By limiting the key to a string, the DES key will be found with 80% probability within 45 minutes after the ciphertext is given on 10 units of RASH. The precomputation before starting the key search takes 3 weeks on the same RASH configuration.
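The size of the reduction from limiting keys to character strings can be checked with a few lines of arithmetic. The sketch below is only an illustration, taking the two key-space sizes to be 2^56 (all 56-bit keys) and 96^8 (eight characters with 96 values each); the constant names are hypothetical. The printed values are exact counts and need not match the rounded figures in the abstract.

```python
# Key-space arithmetic for a 56-bit DES key versus an 8-character string key.
DES_KEY_BITS = 56        # 8 characters x 7 bits each
CHAR_VALUES = 96         # number of values each character may take
KEY_LENGTH_CHARS = 8     # characters per key string

full_key_space = 2 ** DES_KEY_BITS
limited_key_space = CHAR_VALUES ** KEY_LENGTH_CHARS
reduction = full_key_space / limited_key_space

if __name__ == "__main__":
    print(f"full key space:    {full_key_space:,}")
    print(f"limited key space: {limited_key_space:,}")
    print(f"reduction factor:  {reduction:.2f}")
```

Under these assumptions the limited key space is roughly one tenth of the full one, which is the factor by which an exhaustive search would shrink.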
An Efficient Algorithm Finding Simple Disjoint Decompositions Using BDDs
Yusuke MATSUNAGA

PAPER-Logic Synthesis

Vol: E85-A No:12, Page(s): 2715-2724
Functional decomposition is an essential technique of logic synthesis and is important especially for FPGA design. Bertacco and Damiani proposed an efficient algorithm finding simple disjoint decomposition using Binary Decision Diagrams (BDDs). However, their algorithm is not complete and does not find all the decompositions. This paper presents a complete theory of simple disjoint decomposition and describes an efficient algorithm using BDDs.
Look Up Table Compaction Based on Folding of Logic Functions
Shinji KIMURA Atsushi ISHII Takashi HORIYAMA Masaki NAKANISHI Hirotsugu KAJIHARA Katsumasa WATANABE

PAPER-Logic Synthesis

Vol: E85-A No:12, Page(s): 2701-2707
The paper describes a folding method for logic functions that reduces the size of the memories needed to store the functions. The folding is based on relations between fractions (parts) of logic functions. If a logic function includes two or three identical parts, then only one part needs to be kept and the other parts can be omitted. We show that the logic function of 1-bit addition can be reduced to half size using the bit-wise NOT relation and the bit-wise OR relation. The paper also introduces 3-1 LUT's with the folding mechanism. A full adder can be implemented using only one 3-1 LUT with the folding. Multi-bit AND and OR operations can be mapped to our LUT's using not an extra cascading circuit but the carry circuit for addition. We have also tested the mapping capability of 4-input functions to our 3-1 LUT's with folding and carry propagation mechanisms. We have shown a reduction in area consumption when using our LUT's compared with using 4-1 LUT's on several benchmark circuits.
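The idea that one half of a truth table can be derived from the other half can be illustrated on the sum output of a full adder. The following Python sketch is an assumption-laden illustration rather than the paper's folding mechanism: the input ordering (carry-in as the most significant index bit) and all names are hypothetical. The abstract does not define the OR relation, so the carry table is only split into halves without claiming which relation the paper applies to it.

```python
from itertools import product


def truth_table(fn, n_inputs):
    """List fn's outputs for all input combinations, first input most significant."""
    return [fn(*bits) for bits in product((0, 1), repeat=n_inputs)]


# Full adder outputs as functions of (cin, a, b).
SUM_TABLE = truth_table(lambda cin, a, b: cin ^ a ^ b, 3)
CARRY_TABLE = truth_table(lambda cin, a, b: (a & b) | (cin & (a | b)), 3)

# Split each 8-entry table into the cin = 0 half and the cin = 1 half.
sum_lo, sum_hi = SUM_TABLE[:4], SUM_TABLE[4:]
carry_lo, carry_hi = CARRY_TABLE[:4], CARRY_TABLE[4:]

# The cin = 1 half of the sum is the bit-wise NOT of the cin = 0 half,
# so storing one half plus a "NOT" flag is enough to recover the whole table.
sum_hi_rebuilt = [1 - bit for bit in sum_lo]

if __name__ == "__main__":
    print("sum   halves:", sum_lo, sum_hi, "rebuilt:", sum_hi_rebuilt)
    print("carry halves:", carry_lo, carry_hi)
```

For the carry output, the cin = 0 half is the table of a AND b and the cin = 1 half is the table of a OR b.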
Design of Jacobi EVD Processor Based on CORDIC for DOA Estimation with MUSIC Algorithm
Minseok KIM Koichi ICHIGE Hiroyuki ARAI

PAPER

Vol: E85-B No:12, Page(s): 2648-2655
Computing the Eigen Value Decomposition (EVD) of a symmetric matrix is a frequently encountered problem in adaptive (or smart or software) antenna signal processing, for example, in super-resolution DOA (Direction Of Arrival) estimation algorithms such as MUSIC (MUltiple SIgnal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Technique). In this paper, the hardware architecture of a fast EVD processor for symmetric correlation matrices, intended for adaptive antenna applications such as DOA estimation, is proposed and the basic idea is presented. The cyclic Jacobi method is well known as the simplest algorithm and is easily implemented, but it converges more slowly than other factorization algorithms such as the QR method. However, when considering fast parallel computation of the EVD with a hardware architecture such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array), the Jacobi method can be an appropriate solution, since it offers a much higher degree of parallelism and easier implementation than other factorization algorithms. This paper computes the EVD using a Jacobi-type method, where the vector rotations and the angles of the rotations are obtained by CORDIC (COordinate Rotation DIgital Computer). A hardware architecture suitable for ASIC or FPGA with fixed-point arithmetic is presented. Because CORDIC consists of only shift and add operations, this hardware-friendly feature provides easy and efficient implementation. In this paper, the computational load, the estimated circuit scale, and the expected performance are discussed, and the validity of fixed-point arithmetic for practical application to MUSIC DOA estimation is examined.
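The shift-and-add character of CORDIC can be seen in a small fixed-point model of its rotation mode. The Python sketch below is a generic textbook-style illustration, not the processor described in the paper: the word format (16 fractional bits), the iteration count, and all names are assumptions. The final gain correction is done here with one multiplication for brevity, which is a simplification made for this sketch.

```python
import math

FRAC_BITS = 16            # assumed fixed-point format: 16 fractional bits
ITERATIONS = 16           # assumed number of CORDIC micro-rotations
ONE = 1 << FRAC_BITS

# Precomputed elementary angles atan(2^-i), stored as fixed-point radians.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i); undo it at the end.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 4.0 ** -_i)
INV_GAIN = round(ONE / _gain)


def to_fixed(value):
    return round(value * ONE)


def to_float(value):
    return value / ONE


def cordic_rotate(x, y, angle):
    """Rotate the fixed-point vector (x, y) by `angle` (fixed-point radians).

    Converges for |angle| up to about 1.74 rad, the sum of the table entries.
    """
    z = angle
    for i in range(ITERATIONS):
        if z >= 0:   # rotate counterclockwise by atan(2^-i)
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:        # rotate clockwise by atan(2^-i)
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    # Gain compensation (a single multiply per output in this sketch).
    return (x * INV_GAIN) >> FRAC_BITS, (y * INV_GAIN) >> FRAC_BITS


if __name__ == "__main__":
    xr, yr = cordic_rotate(to_fixed(1.0), to_fixed(0.0), to_fixed(math.pi / 4))
    print(to_float(xr), to_float(yr))
```

Inside the loop every update uses only an arithmetic shift, an addition or subtraction, and a table lookup, which is why the method maps well onto fixed-point hardware.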
Random Number Generators Implemented with Neighborhood-of-Four, Non-locally Connected Cellular Automata
Barry SHACKLEFORD Motoo TANAKA Richard J. CARTER Greg SNIDER

PAPER-VLSI Design

Vol: E85-A No:12, Page(s): 2612-2623
Studies of cellular automata (CA) based random number generators (RNGs) have focused mainly upon symmetrically connected networks with neighborhood sizes of three or five. Popular field programmable gate array configurations feature a four-input (i.e., 16-row) lookup table. Full utilization of the four-input lookup table leads to the potential for asymmetrically connected cellular automata networks with a neighborhood size of four. From each of various 1-d, 2-d, and 3-d networks with periodic boundary conditions, the 1000 highest entropy CA RNGs were selected from the set of 65,536 possible uniform (all CA truth tables the same) implementations. Each set of 1000 high-entropy CA was then submitted to Marsaglia's DIEHARD suite of random number tests. A number of 64-bit, neighbor-of-four CA-based RNGs have been discovered that pass all tests in DIEHARD without resorting to either site spacing or time spacing to improve the RNG quality.
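A uniform cellular automaton with a neighborhood of four can be modeled by treating the 16-bit truth table as a rule number between 0 and 65,535, one bit per lookup-table row. The Python sketch below is a hypothetical 1-d example with periodic boundaries: the neighbor offsets `(-1, 0, 1, 3)`, the network size, and the rule `0x6996` (four-input XOR) are arbitrary choices for illustration and are not among the generators reported in the paper.

```python
def ca_step(state, rule, offsets):
    """Advance a 1-d periodic cellular automaton by one time step.

    state:   list of 0/1 cell values
    rule:    16-bit truth table; bit k is the output for neighbor pattern k
    offsets: four relative neighbor positions (may be asymmetric or non-local)
    """
    n = len(state)
    next_state = []
    for i in range(n):
        index = 0
        for off in offsets:
            # Build the 4-bit lookup-table address from the four neighbors.
            index = (index << 1) | state[(i + off) % n]
        next_state.append((rule >> index) & 1)
    return next_state


def ca_bits(state, rule, offsets, steps, tap=0):
    """Collect one output bit per step from cell `tap` (assumed output scheme)."""
    bits = []
    for _ in range(steps):
        state = ca_step(state, rule, offsets)
        bits.append(state[tap])
    return bits


XOR4_RULE = 0x6996          # truth table of a four-input XOR (illustrative only)
OFFSETS = (-1, 0, 1, 3)     # hypothetical asymmetric neighborhood

if __name__ == "__main__":
    seed = [1] + [0] * 15
    print(ca_bits(seed, XOR4_RULE, OFFSETS, steps=32))
```

Because every cell uses the same truth table, sweeping `rule` over all 65,536 values enumerates every uniform implementation for a fixed set of offsets.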
Secure Download System Based on Software Defined Radio Composed of FPGAs
Hironori UCHIKAWA Kenta UMEBAYASHI Ryuji KOHNO

PAPER

Vol: E85-B No:12, Page(s): 2601-2609
In this paper, we focus attention on the development of security techniques using software defined radio (SDR) technologies. We propose a new secure download system which uses the characteristics of the field programmable gate arrays (FPGAs) composing the SDR. The novelty of the proposed system is that it makes high-security encipherment possible. This is achieved by using the characteristic of FPGAs that allows systems to be arranged in a variety of different layouts, as well as by using the configuration information as the key. This unifies the renewal of the key and the encipherment. In addition, the proposed system has the merit that it offers high security against illegal acquisition such as wiretapping, and it can also be used in conjunction with any other current cipher algorithm. As an evaluation of the security, we show that the proposed system has high immunity to illegal acquisition of software using a replay attack, by verification of the protocol as well as by numerical computation. The proposed system can therefore realize high-security software downloads based on SDR.
Adaptive Burst M-QAM Modem Architecture for Broadband Wireless Applications
Daniel T. ASPEL David M. KLYMYSHYN

LETTER

Vol: E85-B No:12, Page(s): 2760-2763
This paper presents an adaptive burst-mode M-QAM modem architecture suitable for variable rate broadband wireless packet data networks. The core signal processing functions for the modem are common to all constellations resulting in an efficient hardware architecture for field programmable gate array (FPGA) implementation.
Configurable and Reconfigurable Computing for Digital Signal Processing
Toshinori SUEYOSHI Masahiro IIDA

INVITED PAPER-LSI/Signal Processors

Vol: E85-A No:3, Page(s): 591-599
Recent DSP applications face many significant demands such as higher system performance, lower power consumption, higher design flexibility, and faster time-to-market. Neither a conventional ASIC nor a conventional DSP can necessarily satisfy all of these requirements at once. Therefore, an alternative for DSP applications will be needed to compensate for the drawbacks of ASICs and DSPs. This paper introduces a new computing paradigm called configurable computing or reconfigurable computing, which has more potential in terms of performance and flexibility. Conventional silicon platforms will not satisfy the conflicting demands of standard products and customization. However, silicon platforms such as FPGAs for configurable or reconfigurable computing are standardized in manufacturing but customized in application. This paper also presents a brief survey of the existing silicon platforms that support configuration or reconfiguration in the application domain of digital signal processing, such as image processing, communication processing, and audio and speech processing. Finally, we show some promising reconfigurable architectures for digital signal processing and discuss the future of reconfigurable computing.
Implementation of a High-Performance Genetic Algorithm Processor for Hardware Optimization
Jinjung KIM Yunho CHOI Chongho LEE Duckjin CHUNG

PAPER-Electronic Circuits

Vol: E85-C No:1, Page(s): 195-203
In this paper, a hardware-oriented Genetic Algorithm (GA) is proposed in order to save hardware resources and to reduce the execution time of the GA processor (GAP). Based on the steady-state model among continuous generation models, the proposed GA uses modified tournament selection as well as a special survival condition, under which the worse-fit parent is replaced whenever the offspring's fitness is better than that parent's. The proposed algorithm shows an improvement of more than 30% in convergence speed over the conventional algorithm. Finally, by employing efficient pipeline parallelization and a handshaking protocol in the proposed GAP, a computation speed-up of more than 30% can be achieved over a survival-based GA that runs one million crossovers per second (1 MHz), when the device speed and the size of the application are taken into account on the prototype. It could be used for high-speed processing such as the central processor of evolvable hardware, robot control, and many optimization problems.
A General Framework to Use Various Decomposition Methods for LUT Network Synthesis
Shigeru YAMASHITA Hiroshi SAWADA Akira NAGOYA

PAPER-VLSI Design Technology and CAD

Vol: E84-A No:11, Page(s): 2915-2922
This paper presents a new framework for synthesizing look-up table (LUT) networks. Some of the existing LUT network synthesis methods are based on one or two functional (Boolean) decompositions. Our method also uses functional decompositions, but we try to use various decomposition methods, which include algebraic decompositions. Therefore, this method can be thought of as a general framework for synthesizing LUT networks by integrating various decomposition methods. We use a cost database file which is a unique characteristic in our method. We also present comparisons between our method and some well-known LUT network synthesis methods, and evaluate the final results after placement and routing. Although our method is rather heuristic in nature, the experimental results are encouraging.
A Routability Driven Technology Mapping Algorithm for LUT Based FPGA Designs
Chi-Chou KAO Yen-Tai LAI

PAPER-FPGA Synthesis

Vol: E84-A No:11, Page(s): 2690-2696
This paper presents a CAD technology mapping algorithm for LUT-based FPGAs. Since interconnections in an FPGA must be accomplished with limited routing resources, routability is the most important objective in a technology mapping algorithm. To optimize routability, the goal of the algorithm is the production of a design with a minimum interconnection. The Min-cut algorithm is first used to partition a graph representing a Boolean network into clusters so that the total number of interconnections between clusters is minimum. To decrease further the number of interconnections needed, clusters are then merged into larger clusters by a pairing technique. This algorithm has been tested on the MCNC benchmark circuits. Compared with other LUT-based FPGA mapping algorithms, the algorithm produces better routability characteristics.
The Evolutionary Algorithm-Based Reasoning System
Moritoshi YASUNAGA Ikuo YOSHIHARA Jung Hwan KIM

PAPER

Vol: E84-D No:11, Page(s): 1508-1520
In this paper, we propose an evolutionary algorithm-based reasoning system and its design methodology. In the proposed design methodology, reasoning rules behind the past cases in each task (in each case database) are extracted through genetic algorithms and are expressed as truth tables (we call them 'evolved truth tables'). Circuits for the reasoning systems are synthesized from the evolved truth tables. Parallelism in each task can be embedded directly in the circuits by the hardware implementation of the evolved truth tables, so that a high-speed reasoning system with small or acceptable hardware size is achieved. We developed a prototype system using Xilinx Virtex FPGA chips and applied it to gene boundary reasoning (GBR) and English pronunciation reasoning (EPR), which are very important practical tasks in the genome science and language processing fields, respectively. The GBR and EPR prototype systems are evaluated in terms of reasoning accuracy, circuit size, and processing speed, and compared with conventional approaches in parallel AI and artificial neural networks. Fault injection experiments are also carried out using the prototype system, and its high fault tolerance, or graceful degradation against defective circuits, which suits hardware implementation using wafer-scale LSIs, is demonstrated.
The Kernel-Based Pattern Recognition System Designed by Genetic Algorithms
Moritoshi YASUNAGA Taro NAKAMURA Ikuo YOSHIHARA Jung Hwan KIM

PAPER

Vol: E84-D No:11, Page(s): 1528-1539
We propose kernel-based pattern recognition hardware and its design methodology using the genetic algorithm. In the proposed design methodology, pattern data are transformed into truth tables, and the truth tables are evolved to represent kernels in the discrimination functions for pattern recognition. The evolved truth tables are then synthesized into logic circuits. Because of this data-direct implementation approach, no floating-point numerical circuits are required and the intrinsic parallelism in the pattern data set is embedded into the circuits. Consequently, high-speed recognition systems can be realized with an acceptably small circuit size. We have applied this methodology to image recognition and sonar spectrum recognition tasks, and implemented them on a newly developed FPGA-based reconfigurable pattern recognition board. The developed system demonstrates higher recognition accuracy and much faster processing speed than the conventional approaches.
Functional Decomposition with Application to LUT-Based FPGA Synthesis
Jian QIAO Kunihiro ASADA

PAPER-VLSI Design Technology and CAD

Vol: E84-A No:8, Page(s): 2004-2013
In this paper, we deal with the problem of compatibility class encoding, and propose a novel algorithm for finding a good functional decomposition with application to LUT-based FPGA synthesis. Based on exploration of the design space, we concentrate on extracting a set of components that can be merged into the minimum number of multiple-output CLBs or LUTs, such that the decomposition constructed from these components is also minimal. In particular, to explore more degrees of freedom, we introduce pliable encoding, which takes over when the conventional rigid encoding fails to find a satisfactory decomposition. Experimental results on a large set of MCNC91 logic synthesis benchmarks show that our method is quite promising.
