IEICE global.ieice.org Site

Keyword Search Result

[Keyword] FPGA(329hit)

241-260hit(329hit)

Gram-Schmidt M-Wave Canceller for the EMG Controlled FES
Hojoon YEOM Youngcheol PARK Hyoungro YOON

LETTER-Rehabilitation Engineering and Assistive Technology

Vol:
E88-D No:9
Page(s):
2213-2217
To use the voluntary electromyogram (EMG) as a control signal of the EMG controlled functional electrical stimulator (FES), the stimulation artifact and the non-voluntary contribution (M-wave) must be reduced. In this study, a Gram-Schmidt (GS) prediction error filter (PEF) that can effectively eliminate the M-wave from voluntary EMG is presented. The presented GS PEF is also implemented on a field-programmable gate array (FPGA) for real-time processing, and its performance is tested with simulated and real signals. Experimental results showed that the GS PEF was effective in reducing the M-wave and preserving the voluntary EMG.
A Novel FPGA Architecture and an Integrated Framework of CAD Tools for Implementing Applications
Konstantinos SIOZIOS George KOUTROUMPEZIS Konstantinos TATAS Nikolaos VASSILIADIS Vasilios KALENTERIDIS Haroula POURNARA Ilias PAPPAS Dimitrios SOUDRIS Antonios THANAILAKIS Spiridon NIKOLAIDIS Stilianos SISKOS

PAPER-Programmable Logic, VLSI, CAD and Layout

Vol:
E88-D No:7
Page(s):
1369-1380
A complete system for the implementation of digital logic in a Field-Programmable Gate Array (FPGA) platform is introduced. The novel power-efficient FPGA architecture was designed and simulated in STM 0.18 µm CMOS technology. The detailed design and circuit characteristics of the Configurable Logic Block, the interconnection network, the switch box and the connection box were determined and evaluated in terms of energy, delay and area. A number of circuit-level low-power techniques were employed because power consumption was the primary concern. Additionally, a complete tool framework for the implementation of digital logic circuits in FPGA platforms is introduced. Having as input VHDL description of an application, the framework derives the reconfiguration bitstream of FPGA. The framework consists of: i) non-modified academic tools, ii) modified academic tools and iii) new tools. Furthermore, the framework can support a variety of FPGA architectures. Qualitative and quantitative comparisons with existing academic and commercial architectures and tools are provided, yielding promising results.
Hardware n Choose k Counters with Applications to the Partial Exhaustive Search
Koji NAKANO Youhei YAMAGISHI

PAPER-Programmable Logic, VLSI, CAD and Layout

Vol:
E88-D No:7
Page(s):
1350-1359
The main contribution of this work is to present several hardware implementations of an "n choose k" counter (C(n,k) counter for short), which lists all n-bit numbers with (n-k) 0's and k 1's, and to show their applications. We first present concepts of C(n,k) counters and their efficient implementations on an FPGA. We then go on to evaluate their performance in terms of the number of used slices and the clock frequency for the Xilinx VirtexII family FPGA XC2V3000-4. As one of the real life applications, we use a C(n,k) counter to accelerate a digital halftoning method that generates a binary image reproducing an original gray-scale image. This method repeatedly replaces an image pattern in small square regions of a binary image by the best one. By the partial exhaustive search using a C(n,k) counter we succeeded in accelerating the task of finding the best image pattern and achieved a speedup factor of more than 2.5 over the simple exhaustive search.
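As a rough software analogue of the C(n,k) counter described above, the following Python sketch enumerates all n-bit numbers with exactly k 1's in increasing order. It uses the well-known bit trick often called Gosper's hack, which is an assumption made here for illustration and is not taken from the paper's hardware designs. The function name `n_choose_k_patterns` is hypothetical.

```python
from math import comb


def n_choose_k_patterns(n, k):
    """Yield every n-bit integer with exactly k bits set, in increasing order.

    Software illustration only (Gosper's hack); the paper's FPGA
    implementations may work quite differently.
    """
    if k < 0 or k > n:
        return
    if k == 0:
        yield 0
        return
    x = (1 << k) - 1          # smallest pattern: k ones in the low bits
    limit = 1 << n            # first value that no longer fits in n bits
    while x < limit:
        yield x
        c = x & -x            # lowest set bit of x
        r = x + c             # ripple the lowest run of ones upward
        # Move the remaining ones of that run back to the low end.
        x = (((r ^ x) >> 2) // c) | r


patterns = list(n_choose_k_patterns(4, 2))
print([format(p, "04b") for p in patterns])
```

Each step produces the next pattern from the current one with a fixed number of arithmetic operations, which is the property that makes such a counter attractive for a partial exhaustive search over candidate bit patterns.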
Header Extraction and Control for an Asynchronous Optical Packet Switch Based on DPSK Decoding
Dimitrios KLONIDIS Christina T. POLITI Reza NEJABATI Mike J. O'MAHONY Dimitra SIMEONIDOU

PAPER-Optical Network Architecture

Vol:
E88-B No:5
Page(s):
1876-1883
A novel optical header extraction scheme based on optical differential phase shift keying (DPSK) decoding is examined analytically and experimentally. The header is applied in front of the payload, on the phase of a pulsed optical level introduced for the duration of the header. The proposed scheme offers maximized header extraction efficiency, required by the electronics to identify the header bits and control the switch. At the same time, the payload is transmitted at maximum extinction ratio. Analytical results prove the enhanced performance of the decoding scheme with respect to the extinction ratio and in comparison to other DPSK based schemes. Moreover, the utilised scheme is cost efficient, easily upgradeable to any bit rate, and adds minimum complexity at the transmitter and detector parts of the system. Finally, the implementation of the developed technique in a real optical packet switch is demonstrated, where header extraction, reading, processing and switch control are realised using field programmable gate array (FPGA) technology.
Hybrid Pattern BIST for Low-Cost Core Testing Using Embedded FPGA Core
Gang ZENG Hideo ITO

PAPER-Dependable Computing

Vol:
E88-D No:5
Page(s):
984-992
In the Reconfigurable System-On-a-Chip (RSOC), an FPGA core is embedded to improve the design flexibility of SOC. In this paper, we demonstrate that the embedded FPGA core is also feasible for use in implementing the proposed hybrid pattern Built-In Self-Test (BIST) in order to reduce the test cost of SOC. The hybrid pattern BIST, which combines Linear Feedback Shift Register (LFSR) with the proposed on-chip Deterministic Test Pattern Generator (DTPG), can achieve not only complete Fault Coverage (FC) but also minimum test sequence by applying a selective number of pseudorandom patterns. Furthermore, the hybrid pattern BIST is designed under the resource constraint of target FPGA core so that it can be implemented on any size of FPGA core and take full advantage of the target FPGA resource to reduce test cost. Moreover, the reconfigurable core-based approach has minimum hardware overhead since the FPGA core can be reconfigured as normal mission logic after testing such that it eliminates the hardware overhead of BIST logic. Experimental results for ISCAS 89 benchmarks and a platform FPGA chip have proven the efficiency of the proposed approach.
SPFD-Based Flexible Transformation of LUT-Based FPGA Circuits
Katsunori TANAKA Shigeru YAMASHITA Yahiko KAMBAYASHI

PAPER-VLSI Design Technology and CAD

Vol:
E88-A No:4
Page(s):
1038-1046
In this paper, we present the condition for the effective wire addition in Look-Up-Table-based (LUT-based) field programmable gate array (FPGA) circuits, and an optimization procedure utilizing the effective wire addition. Each wire has different characteristics, such as delay and power dissipation. Therefore, the replacement of one critical wire for the circuit performance with many non-critical ones, i.e., many-addition-for-one-removal (m-for-1) is sufficiently useful. However, the conventional logic optimization methods based on sets of pairs of functions to be distinguished (SPFDs) for LUT-based FPGA circuits do not make use of the m-for-1 manipulation, and perform only simple replacement and removal, i.e., the one-addition-for-one-removal (1-for-1) manipulation and the no-addition-for-one-removal (0-for-1) manipulation, respectively. Since each LUT can realize an arbitrary internal function with respect to a specified number of input variables, there is no sufficient condition at the logic design level for simple wire addition. Moreover, in general, simple addition of a wire has no effects for removal of another wire, and it is important to derive the condition for non-simple and effective wire addition. We found the SPFD-based condition that wire addition is likely to make another wire redundant or replaceable, and developed an optimization procedure utilizing this effective wire addition. According to the experimental results, when we focused on the delay reduction of LUT-based FPGA circuits, our method reduced the delay by 24.2% from the initial circuits, while the conventional SPFD-based logic optimization and the enhanced global rewiring reduced it by 14.2% and 18.0%, respectively. Thus, our method presented in this paper is sufficiently practical, and is expected to improve the circuit performance.
FPGAs with Multidimensional Switch Topology
Yohei MATSUMOTO Akira MASAKI

LETTER-VLSI Systems

Vol:
E88-D No:4
Page(s):
775-778
This manuscript proposes an FPGA by embedding a multidimensional switch topology onto a two-dimensional chip. We show, using Rent's Rule, that this procedure reduces the number of switches. Then we propose the actual procedure and demonstrate that this does not increase metal wire density critically.
FPGA-Based Reconfigurable Adaptive FEC
Kazunori SHIMIZU Jumpei UCHIDA Yuichiro MIYAOKA Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI

PAPER-System Level Design

Vol:
E87-A No:12
Page(s):
3036-3046
In this paper, we propose a reconfigurable adaptive FEC system. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. If a particular error correction capability t is given, we can implement an FEC decoder which is optimal for t by taking the number of operations into consideration. Thus, reconfiguring the optimal FEC decoder dynamically for each error correction capability allows us to maximize the throughput of each decoder within a limited hardware resource. Based on this concept, our reconfigurable adaptive FEC system can reduce the packet dropping rate more efficiently than conventional fixed hardware systems. We can improve data transmission throughput for a reliable transport protocol. Practical simulation results are also shown.
Field-Programmable VLSI Based on a Bit-Serial Fine-Grain Architecture
Masanori HARIYAMA Weisheng CHONG Michitaka KAMEYAMA

PAPER

Vol:
E87-C No:11
Page(s):
1897-1902
This paper presents a novel architecture to solve two problems of existing FPGAs: the large delay and area due to complex programmable switch blocks, and the large area due to coarse-grain logic blocks that are underutilized to a great degree. A mesh-connected cellular array based on a bit-serial pipeline architecture is introduced to minimize the complexity of switch blocks. A fine-grain logic block architecture with the functionality of a bit-serial adder is presented to minimize the number of inputs and outputs of the logic block, since an increase in the number of inputs and outputs directly increases the complexity of a switch block. For an area-efficient design, the logic block is implemented based on a hybrid of a programmable logic gate and a dedicated carry logic. The hybrid architecture allows us to use a small lookup table to implement the logic gate. Moreover, the carry logic uses a functional pass-gate that merges both logic and storage functions compactly. The performance of the fine-grain field-programmable VLSI (FPVLSI) is evaluated to be more than 2 times higher than that of a coarse-grain FPVLSI.
Implementation of FPGA Based Fast Unitary MUSIC DOA Estimator
Minseok KIM Koichi ICHIGE Hiroyuki ARAI

PAPER-Wireless Network System Performances

Vol:
E87-C No:9
Page(s):
1485-1494
DOA (Direction Of Arrival) estimation is a useful technique in various positioning applications including the DOA-based adaptive array antenna system. This paper presents a practical implementation of an FPGA (Field Programmable Gate Array) based fast DOA estimator for wireless cellular base stations. This system incorporates the spectral unitary MUSIC (MUltiple SIgnal Classification) algorithm, which is one of the representative super resolution DOA estimation techniques. This paper proposes a digital signal processor design suitable for FPGAs and its real hardware implementation. In this system, all digital signal processing procedures are computed using only fixed-point operations with finite word-length for fast processing and low power consumption. The performance is assessed by hardware level simulations and experiments in a radio anechoic chamber.
The Design and Evaluation of Data-Dependent Hardware for Subgraph Isomorphism Problem
Shoji YAMAMOTO Shuichi ICHIKAWA Hiroshi YAMAMOTO

PAPER-Reconfigurable Systems

Vol:
E87-D No:8
Page(s):
2038-2047
Subgraph isomorphism problems have various important applications, while generally being NP-complete. Though Ullmann and Konishi proposed custom circuit designs to accelerate the subgraph isomorphism problem, these designs require many hardware resources for large problems. This study describes the design of data-dependent circuits for the subgraph isomorphism problem, with evaluation results on an actual FPGA platform. Data-dependent circuits are logic circuits specialized for specific input data. Such circuits are smaller and faster than the original circuit, although they are not reusable and involve circuit generation for each input. In the present study, the circuits were implemented on a Xilinx XC2V3000 FPGA, and they successfully operated at a clock frequency of 25 MHz. In the case of graphs with 16 vertices, the average execution time is about 7.0% of that of the software executed on an up-to-date microprocessor (Athlon XP 2600+ of 2.1 GHz clock). Even if the circuit generation time is included, data-dependent circuits are about 14.4 times faster than the software (for random graphs with 16 vertices). This performance advantage becomes larger for larger graphs. Two algorithms (Ullmann's and Konishi's) were examined, and the data-dependent approach was found to be equally effective for both algorithms. We also examined two types of input graph sets, and found that the data-dependent approach shows an advantage in both cases.
A Real-Time Image Compressor Using 2-Dimensional DWT and Its FPGA Implementation
Young-Ho SEO Wang-Hyun KIM Ji-Sang YOO Dai-Gyoung KIM Dong-Wook KIM

PAPER-VLSI Design Technology and CAD

Vol:
E87-A No:8
Page(s):
2110-2119
This paper proposes the design and implementation of a real-time image compressor using 2-Dimensional Discrete Wavelet Transform (2DDWT), which targets an FPGA as its platform. The image compressor uses Daubechies' bi-orthogonal DWT filters (9, 7) and 16-bit fixed-point data formats for wavelet coefficients in the internal calculation. The target image is NTSC 640×240 pixels per field whose color format is Y:Cb:Cr = 4:2:2. We developed for the 2DDWT a new structure with four Multipliers and Accumulators (MACs) for real-time operations. We designed and used a linear fixed scalar quantizer, which includes the exceptional treatment of the coefficients whose absolute values are larger than the quantization region. Only a Huffman entropy encoder was included due to the hardware overhead. The quantizer and Huffman encoder were merged into a single functional module. Due to the insufficient memory space of an FPGA, we utilized external memory (SDRAM) as the working and memory storage space. The proposed image compressor maps into an APEX20KC EP20K600CB652-7 from Altera and uses 45% of the Logic Array Block (LAB) and 9% of the Embedded System Block (ESB). With a 33 MHz clock frequency, the proposed image compressor shows a speed of 67 fields per second (33 frames per second), which is more than real-time operation. The resulting image quality from reconstruction is approximately 28 dB in PSNR and its compression ratio is 29:1. Consequently, the proposed image compressor is expected to be used in a dedicated system requiring an image-processing unit.
Preliminary Evaluation of Flex Power FPGA: A Power Reconfigurable Architecture with Fine Granularity
Takashi KAWANAMI Masakazu HIOKI Hiroshi NAGASE Toshiyuki TSUTSUMI Tadashi NAKAGAWA Toshihiro SEKIGAWA Hanpei KOIKE

PAPER-Reconfigurable Systems

Vol:
E87-D No:8
Page(s):
2004-2010
The Flex Power FPGA is presented as a novel FPGA model offering the ability to configure the trade-off between power consumption and speed for each logic element by adjusting the threshold voltage. This FPGA model targets the reduction of static power consumption, which has become one of the most important issues in the development of future-generation devices. The present paper describes a preliminary simulation study of the Flex Power FPGA. A method to effectively assign threshold voltages to transistors at a prescribed granularity based on a timing analysis of the mapped circuit is implemented using the VPR simulator, and the static power reduction for 70 nm technologies is estimated using MCNC benchmark circuits. Simulation results show that the average static power can be reduced to as little as 1/30 of that in the corresponding conventional FPGA. This FPGA model is also demonstrated to be effective with future technologies, where the proportion of static power will be greater.
Self-Reconfigurable Multi-Layer Neural Networks with Genetic Algorithms
Eiko SUGAWARA Masaru FUKUSHI Susumu HORIGUCHI

PAPER-Reconfigurable Systems

Vol:
E87-D No:8
Page(s):
2021-2028
This paper addresses the issue of reconfiguring multi-layer neural networks implemented in single or multiple VLSI chips. The ability to adaptively reconfigure network configuration for a given application, considering the presence of faulty neurons, is a very valuable feature in a large scale neural network. In addition, it has become necessary to achieve systems that can automatically reconfigure a network and acquire optimal weights without any help from host computers. However, self-reconfigurable architectures for neural networks have not been studied sufficiently. In this paper, we propose an architecture for a self-reconfigurable multi-layer neural network employing both reconfiguration with spare neurons and weight training by GAs. This proposal offers the combined advantages of low hardware overhead for adding spare neurons and fast weight training time. To show the possibility of self-reconfigurable neural networks, the prototype system has been implemented on a field programmable gate array.
An FPGA-Based Acceleration Method for Metabolic Simulation
Yasunori OSANA Tomonori FUKUSHIMA Masato YOSHIMI Hideharu AMANO

PAPER-Reconfigurable Systems

Vol:
E87-D No:8
Page(s):
2029-2037
Computer simulation of cellular processes is one of the most important applications in bioinformatics. Since such simulators need huge computational resources, many biologists must use expensive PC/WS clusters. ReCSiP is an FPGA-based, reconfigurable accelerator which aims to realize an economical high-performance simulation environment on desktop computers. It can exploit fine-grain parallelism in the target applications by means of small hardware modules in the FPGA that work in parallel. As the first step to implement a simulator of cellular processes on ReCSiP, a solver to perform a basic simulation of metabolism was implemented. The throughput of the solver was about 29 times higher than that of the software on Intel's PentiumIII operating at 1.13 GHz.
An Acceleration Processor for Data Intensive Scientific Computing
Cheong Ghil KIM Hong-Sik KIM Sungho KANG Shin Dug KIM Gunhee HAN

PAPER-Scientific and Engineering Computing with Applications

Vol:
E87-D No:7
Page(s):
1766-1773
Scientific computations for diffusion equations and ANNs (Artificial Neural Networks) are data intensive tasks accompanied by heavy memory access; on the other hand, their computational complexities are relatively low. Thus, this type of task naturally maps onto SIMD (Single Instruction Multiple Data stream) parallel processing with distributed memory. This paper proposes a high performance acceleration processor whose architecture is optimized for scientific computing using diffusion equations and ANNs. The proposed architecture includes a customized instruction set and specific hardware resources which consist of a control unit (CU), 16 processing units (PUs), and a non-linear function unit (NFU) on chip. They are effectively connected with a dedicated ring and global bus structure. Each PU is equipped with an address modifier (AM) and 16-bit 1.5 k-word local memory (LM). The proposed processor can be easily expanded by a multi-chip expansion mode to accommodate large scale parallel computation. The prototype chip is implemented with an FPGA. The total gate count is about 1 million with 530,432-bit embedded memory cells and it operates at 15 MHz. The functionality and performance of the proposed processor are verified with a simulation of an oil reservoir problem using diffusion equations and a character recognition application using ANNs. The execution times of the two applications are compared with software realizations on a 1.7 GHz Pentium IV personal computer. Though the proposed processor architecture and the instruction set are optimized for diffusion equations and ANNs, it provides flexibility to program many other scientific computation algorithms.
FPGA Design of Real-Time Watermarking Processor for 2DDWT-Based Video Compression
Young-Ho SEO Dong-Wook KIM

PAPER

Vol:
E87-A No:6
Page(s):
1297-1304
This paper proposes a new watermarking algorithm and its implementation in hardware, by which the watermarking process and an image compression process can operate in conjunction and in parallel without degrading the performance of the compression process. The goal of the proposed watermarking scheme is to provide a basis for asserting ownership and for authenticating the integrity of the watermark-embedded image by detecting the errors and their positions without the original image (blind watermarking). Our watermarking scheme replaces one or several bit-plane(s) of the DC subband with the watermark after 2DDWT (2-Dimensional Discrete Wavelet Transform) decomposition, which is the basic transformation in DWT-based image compression such as JPEG2000. If more than one bit-plane is involved, the position to embed each watermark bit is randomly selected among the bit-planes by a random number generated with an LFSR (Linear Feedback Shift Register). Experimental results showed that for all the considered attacks except high compression by JPEG, the error ratios in the watermarks extracted by our algorithm were below 3% and the extracted watermarks were unambiguously recognizable in all cases. The hardware (FPGA) implementation could operate stably at a clock frequency of 82 MHz. This hardware was merged into a DWT-based image compression codec which runs in real time at a clock frequency of 66 MHz. This resulted in real-time operation for the codec and watermarking together at a clock frequency of 66 MHz. The watermarking scheme used 4,037 LABs (24%) of the hardware resource of an APEX20KC EP20K400CF672-7 from Altera.
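To illustrate how an LFSR can drive the random choice of a bit-plane for each watermark bit, here is a minimal Python sketch. The 16-bit register width, the tap positions (16, 14, 13, 11), the seed 0xACE1, and the modulo mapping to a plane index are all assumptions chosen for illustration; the abstract does not specify the LFSR used in the paper. The names `lfsr_step` and `pick_bit_planes` are hypothetical.

```python
def lfsr_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step.

    Taps at bit positions 16, 14, 13 and 11 (an assumed, commonly used
    configuration, not the one from the paper).
    """
    bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def pick_bit_planes(num_bits, num_planes, seed=0xACE1):
    """Return one pseudo-random bit-plane index per watermark bit."""
    if seed == 0:
        raise ValueError("an all-zero LFSR state never changes")
    state = seed
    planes = []
    for _ in range(num_bits):
        state = lfsr_step(state)
        # Map the register contents onto a plane index (assumed mapping).
        planes.append(state % num_planes)
    return planes


print(pick_bit_planes(num_bits=8, num_planes=3))
```

Because the sequence is fully determined by the seed, a detector that knows the seed can regenerate the same plane indices and locate each embedded bit without access to the original image.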
An Equalization Technique for 54 Mbps OFDM Systems
Naihua YUAN Anh DINH Ha H. NGUYEN

PAPER-Communication Theory and Systems

Vol:
E87-A No:3
Page(s):
610-618
A time-domain equalization (TEQ) algorithm is presented to shorten the effective channel impulse response to increase the transmission efficiency of the 54 Mbps IEEE 802.11a orthogonal frequency division multiplexing (OFDM) system. In solving the linear equation Aw = B for the optimum TEQ coefficients, A is shown to be Hermitian and positive definite. The LDL^T and LU decompositions are used to factorize A to reduce the computational complexity. Simulation results show high performance gains at a data rate of 54 Mbps with moderate orders of the TEQ finite impulse response (FIR) filter. The design and implementation of the algorithm in a field programmable gate array (FPGA) are also presented. The regularities among the elements of A are exploited to reduce hardware complexity. The LDL^T and LU decompositions are combined in the hardware design to find the TEQ coefficients in less than 4 µs. To compensate for the effective channel impulse response, a radix-4 pipeline fast Fourier transform (FFT) is implemented to perform zero forcing equalization. The hardware implementation information is provided and simulation results are compared to mathematical values to verify the functionalities of the chips running at 54 Mbps.
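The factorization step mentioned above can be sketched in plain Python for a small Hermitian positive definite system Aw = B. This is a generic textbook LDL-type solve (written with the conjugate transpose, since A is complex Hermitian), not the paper's combined LDL^T/LU hardware scheme. The function name `solve_ldl` and the 2×2 example values are hypothetical.

```python
def solve_ldl(A, b):
    """Solve A w = b for Hermitian positive definite A via A = L D L^H.

    L is unit lower triangular and D is a real diagonal.
    Generic sketch; no pivoting, which is safe for positive definite A.
    """
    n = len(A)
    L = [[0j] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        L[j][j] = 1 + 0j
        # Diagonal entry of D (real for a Hermitian matrix).
        d[j] = (A[j][j] - sum(abs(L[j][k]) ** 2 * d[k] for k in range(j))).real
        for i in range(j + 1, n):
            acc = sum(L[i][k] * L[j][k].conjugate() * d[k] for k in range(j))
            L[i][j] = (A[i][j] - acc) / d[j]
    # Forward substitution: L y = b.
    y = [0j] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    # Diagonal scaling: D z = y.
    z = [y[i] / d[i] for i in range(n)]
    # Back substitution: L^H w = z.
    w = [0j] * n
    for i in reversed(range(n)):
        w[i] = z[i] - sum(L[k][i].conjugate() * w[k] for k in range(i + 1, n))
    return w


A = [[4, 1 + 1j], [1 - 1j, 3]]   # Hermitian, positive definite
b = [2 + 2j, 1 + 5j]             # chosen so that w = [1, 2j]
print(solve_ldl(A, b))
```

Compared with a Cholesky factorization, this form avoids square roots, which is one common reason to prefer it in fixed-function hardware.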
Feasibility Study on Over-the-Air Software Download for Software-Radio-Based Intelligent Transport Systems
Hiroshi HARADA Masayuki FUJISE

PAPER

Vol:
E86-B No:12
Page(s):
3425-3432
We have proposed two types of software download methods for software radio (SR) based intelligent transport systems (ITS): (1) a broadcasting-type software download method and (2) a communication-type software download method. In this paper, we study the feasibility of employing them in a newly developed prototype. We give tangible examples of method (1) using the vehicle information and communication system (VICS) and method (2) using the dedicated short range communication (DSRC) system. We describe the download formats and procedures for both methods and use the experimental prototype to evaluate the basic software download time and configuration time. Moreover, we propose an architecture for an SR-based multimode terminal that can reduce download time and utilize over-the-air software download services over VICS and DSRC links.
Design of a Field-Programmable Digital Filter Chip Using Multiple-Valued Current-Mode Logic
Katsuhiko DEGAWA Takafumi AOKI Tatsuo HIGUCHI

PAPER

Vol:
E86-A No:8
Page(s):
2001-2010
This paper presents a Field-Programmable Digital Filter (FPDF) IC that employs carry-propagation-free redundant arithmetic algorithms for faster computation and multiple-valued current-mode circuit technology for high-density low-power implementation. The original contribution of this paper is to evaluate, through actual chip fabrication, the potential impact of multiple-valued current-mode circuit technology on the reduction of hardware complexity required for DSP-oriented programmable ICs. The prototype FPDF fabrication with 0.6 µm CMOS technology demonstrates that the chip area and power consumption can be reduced to 41% and 71%, respectively, compared with the standard binary logic implementation.
