IEICE global.ieice.org Site

Keyword Search Result

[Keyword] fpga(330hit)

201-220hit(330hit)

A Secure Content Delivery System Based on a Partially Reconfigurable FPGA
Yohei HORI Hiroyuki YOKOYAMA Hirofumi SAKANE Kenji TODA

PAPER-Contents Protection

Vol:
E91-D No:5
Page(s):
1398-1407
We developed a content delivery system using a partially reconfigurable FPGA to securely distribute digital content on the Internet. With partial reconfigurability of a Xilinx Virtex-II Pro FPGA, the system provides an innovative single-chip solution for protecting digital content. In the system, a partial circuit must be downloaded from a server to the client terminal to play content. Content will be played only when the downloaded circuit is correctly combined (=interlocked) with the circuit built in the terminal. Since each circuit has a unique I/O configuration, the downloaded circuit interlocks with the corresponding built-in circuit designed for a particular terminal. Thus, the interface of the circuit itself provides a novel authentication mechanism. This paper describes the detailed architecture of the system and clarifies the feasibility and effectiveness of the system. In addition, we discuss a fail-safe mechanism and future work necessary for the practical application of the system.
Multi-Context FPGA Using Fine-Grained Interconnection Blocks and Its CAD Environment
Hasitha Muthumala WAIDYASOORIYA Weisheng CHONG Masanori HARIYAMA Michitaka KAMEYAMA

PAPER

Vol:
E91-C No:4
Page(s):
517-525
Dynamically-programmable gate arrays (DPGAs) promise lower-cost implementations than conventional field-programmable gate arrays (FPGAs) since they efficiently reuse limited hardware resources in time. One of the typical DPGA architectures is a multi-context FPGA (MC-FPGA) that requires multiple memory bits per configuration bit to realize fast context switching. However, these additional memory bits cause significant overhead in area and power consumption. This paper presents a novel architecture of a switch element to reduce the required capacity of configuration memory. Our main idea is to exploit redundancy between different contexts by using a fine-grained switch element. The proposed MC-FPGA is designed in a 0.18 µm CMOS technology. Its maximum clock frequency and the context switching frequency are measured to be 310 MHz and 272 MHz, respectively. Moreover, a novel CAD process that exploits the redundancy in configuration data is proposed to support the MC-FPGA architecture.
Resource and Performance Evaluations of Fixed Point QRD-RLS Systolic Array through FPGA Implementation
Yoshiaki YOKOYAMA Minseok KIM Hiroyuki ARAI

PAPER-Wireless Communication Technologies

Vol:
E91-B No:4
Page(s):
1068-1075
At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
Hardware Neural Network for a Visual Inspection System
Seungwoo CHUN Yoshihiro HAYAKAWA Koji NAKAJIMA

PAPER

Vol:
E91-A No:4
Page(s):
935-942
The visual inspection of defects in products is heavily dependent on human experience and instinct. In this situation, it is difficult to reduce the production costs and to shorten the inspection time and hence the total process time. Consequently people involved in this area desire an automatic inspection system. In this paper, we propose a hardware neural network, which is expected to provide high-speed operation for automatic inspection of products. Since neural networks can learn, this is a suitable method for self-adjustment of criteria for classification. To achieve high-speed operation, we use parallel and pipelining techniques. Furthermore, we use a piecewise linear function instead of a conventional activation function in order to save hardware resources. Consequently, our proposed hardware neural network achieved 6GCPS and 2GCUPS, which in our test sample proved to be sufficiently fast.
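As a minimal sketch of the idea of replacing a conventional activation function with a piecewise linear one, the Python below compares a logistic sigmoid with a three-segment approximation. The breakpoints at ±2 and the slope of 0.25 are assumptions chosen for illustration (a power-of-two slope maps to a bit shift in hardware); the abstract does not state which segments the authors used.

```python
import math


def sigmoid(x: float) -> float:
    """Conventional logistic activation, used here only as a reference."""
    return 1.0 / (1.0 + math.exp(-x))


def piecewise_linear(x: float) -> float:
    """Three-segment approximation (assumed breakpoints, not from the paper).

    Saturates at 0 below -2 and at 1 above +2; in between it is the line
    0.25 * x + 0.5, whose slope is a power of two.
    """
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return 0.25 * x + 0.5


def max_abs_error(lo: float = -8.0, hi: float = 8.0, steps: int = 1601) -> float:
    """Largest gap between the two functions on a uniform grid."""
    step = (hi - lo) / (steps - 1)
    return max(
        abs(sigmoid(lo + i * step) - piecewise_linear(lo + i * step))
        for i in range(steps)
    )
```

Calling `max_abs_error()` gives a quick feel for how much accuracy such an approximation trades for the removal of the exponential.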
A Design of the Signal Processing Hardware Platform for Communication Systems
Byung Wook LEE Sung Ho CHO

LETTER-Wireless Communication Technologies

Vol:
E91-B No:3
Page(s):
939-942
In this letter, an efficient hardware platform for the digital signal processing for OFDM communication systems is presented. The hardware platform consists of a single FPGA having 900 K gates, two DSPs with maximum 8,000 MIPS at 1 GHz clock, 2-channel ADC and DAC supporting maximum 125 MHz sampling rate, and flexible data bus architecture, so that a wide variety of baseband signal processing algorithms for practical OFDM communication systems may be implemented and tested. The IEEE 802.16d software modem is also presented in order to verify the effectiveness and usefulness of the designed platform.
Multi-Channel Multi-Stage Transmultiplexing Digital Down Converter and Its Application to RFID (ISO18000-3 mode 2) Reader/Writer
Yuichi NAKAGAWA Kei SAKAGUCHI Hideki KAWAMURA Kyoji OHASHI Masahiro MURAGUCHI Kiyomichi ARAKI

PAPER-Enabling Technology

Vol:
E91-B No:1
Page(s):
139-146
Implementation of an RFID reader/writer on software defined radio is studied in this paper. The target RFID is ISO18000-3 mode 2, which has 8 reply channels for simultaneous communication with 8 different RFID tags. In the software defined radio architecture, the 8 reply channels are sampled at a single A/D converter and separated by digital down converters, whereas the conventional RFID architecture has 8 redundant parallel analog down converters. A novel multi-stage transmultiplexing digital down converter is proposed for efficient implementation of a multi-channel digital down converter. Moreover, the proposed architecture is implemented on an FPGA evaluation board, and the validity of the system is confirmed on real hardware. The proposed architecture can be applied to a multi-channel receiver for a dynamic spectrum system in cognitive radio.
Implementation of Joint Pre-FFT Adaptive Array Antenna and Post-FFT Space Diversity Combining for Mobile ISDB-T Receiver
Dang Hai PHAM Jing GAO Takanobu TABATA Hirokazu ASATO Satoshi HORI Tomohisha WADA

PAPER-Enabling Technology

Vol:
E91-B No:1
Page(s):
127-138
In our application targeted here, four on-glass antenna elements are set in an automobile to improve the reception quality of mobile ISDB-T receiver. With regard to the directional characteristics of each antenna, we propose and implement a joint Pre-FFT adaptive array antenna and Post-FFT space diversity combining (AAA-SDC) scheme for mobile ISDB-T receiver. By applying a joint hardware and software approach, a flexible platform is realized in which several system configuration schemes can be supported; the receiver can be reconfigured on the fly. Simulation results show that the AAA-SDC scheme drastically improves the performance of mobile ISDB-T receiver, especially in the region of large Doppler shift. The experimental results from a field test also confirm that the proposed AAA-SDC scheme successfully achieves an outstanding reception rate up to 100% while moving at the speed of 80 km/h.
Diversification of Processors Based on Redundancy in Instruction Set
Shuichi ICHIKAWA Takashi SAWADA Hisashi HATA

PAPER-Implementation

Vol:
E91-A No:1
Page(s):
211-220
By diversifying processor architecture, computer software is expected to be more resistant to plagiarism, analysis, and attacks. This study presents a new method to diversify instruction set architecture (ISA) by utilizing the redundancy in the instruction set. Our method is particularly suited for embedded systems implemented with FPGA technology, and realizes a genuine instruction set randomization, which has not been provided by the preceding studies. The evaluation results on four typical ISAs indicate that our scheme can provide a far larger degree of freedom than the preceding studies. Diversified processors based on MIPS architecture were actually implemented and evaluated with Xilinx Spartan-3 FPGA. The increase of logic scale was modest: 5.1% in Specialized design and 3.6% in RAM-mapped design. The performance overhead was also modest: 3.4% in Specialized design and 11.6% in RAM-mapped design. From these results, our scheme is regarded as a practical and promising way to secure FPGA-based embedded systems.
A Self-Reconfigurable Adaptive FIR Filter System on Partial Reconfiguration Platform
Chang-Seok CHOI Hanho LEE

PAPER-Reconfigurable System and Applications

Vol:
E90-D No:12
Page(s):
1932-1938
This paper presents a self-reconfigurable adaptive FIR filter system design using dynamic partial reconfiguration, which offers flexibility, power efficiency, and configuration-time advantages by allowing adaptive FIR filter modules to be inserted or removed dynamically. This self-reconfigurable adaptive FIR filter is responsible for providing the best solution for realization and autonomous adaptation of FIR filters, and processes the optimal digital signal processing algorithms, which are the low-pass, band-pass and high-pass filter algorithms with various frequencies, for noise removal operations. The proposed stand-alone self-reconfigurable system using a Xilinx Virtex4 FPGA and Compact-Flash memory shows improved configuration time and flexibility through the dynamic partial reconfiguration techniques.
Efficient Memory Utilization for High-Speed FPGA-Based Hardware Emulators with SDRAMs
Kohei HOSOKAWA Katsunori TANAKA Yuichi NAKAMURA

PAPER-System Level Design

Vol:
E90-A No:12
Page(s):
2810-2817
FPGA-based hardware emulators are often used for the verification of LSI functions. They generally have dedicated external memories, such as SDRAMs, to compensate for the lack of memory capacity in FPGAs. In such a case, access between the FPGAs and the dedicated external memory may represent a major bottleneck with respect to emulation speed since the dedicated external memory may have to emulate a large number of memory blocks. In this paper, we propose three methods, "Dynamic Clock Control (DCC)," "Memory Mapping Optimization (MMO)," and "Efficient Access Scheduling (EAS)," to avoid this bottleneck. DCC controls an emulation clock dynamically in accord with the number of memory accesses within one emulation clock cycle. EAS optimizes the ordering of memory access to the dedicated external memory, and MMO optimizes the arrangement of the dedicated external memory addresses to which respective memories will be emulated. With them, emulation speed can be made 29.0 times faster, as evaluated in actual LSI emulations.
A Port Combination Methodology for Application-Specific Networks-on-Chip on FPGAs
Daihan WANG Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO

PAPER-Reconfigurable System and Applications

Vol:
E90-D No:12
Page(s):
1914-1922
A temporal correlation based port combination algorithm that customizes the router design in Network-on-Chip (NoC) is proposed for reconfigurable systems in order to minimize the required hardware amount. Given the traffic characteristics of the target application and the expected hardware amount reduction rate, the algorithm automatically makes the port combination plan for the networks. Since the port combination technique has the advantage of almost keeping the topology, including the two-surface layout, it does not affect the design of the other layers, such as task mapping and scheduling. The algorithm shows much better efficiency than the algorithm without temporal correlation. For the multimedia stream processing application, the algorithm can save 55% of the hardware amount without performance degradation, while the algorithm without temporal correlation suffers from a 30% performance loss.
Optimization of the Body Bias Voltage Set (BBVS) for Flex Power FPGA
Takashi KAWANAMI Masakazu HIOKI Yohei MATSUMOTO Toshiyuki TSUTSUMI Tadashi NAKAGAWA Toshihiro SEKIGAWA Hanpei KOIKE

PAPER-Reconfigurable Device and Design Tools

Vol:
E90-D No:12
Page(s):
1947-1955
This paper describes a new design concept, the Body Bias Voltage Set (BBVS), and presents the effect of the BBVS on static power, operating speed, and area overhead in an FPGA with field-programmable Vth components. A Flex Power FPGA is an FPGA architecture that solves the static power problem by the fine-grain field-programmable Vth control method. Since the Vth of transistors for specific circuit blocks in the Flex Power FPGA is chosen from a set of Vth values defined by a BBVS, selection of a particular BBVS is an important design decision. A particular BBVS is chosen by selecting body biases from among several supplied body bias candidates. To select the optimal BBVS, we provide 136 BBVSs and perform a thorough search. In a BBVS with fewer Vth steps, the deepest reverse body bias for high-Vth transistors does not necessarily result in optimal conditions. A BBVS of 0.0 V and -0.8 V, which requires 1.65 times the original area, utilizes as little as 1/30 of the static power of a conventional FPGA without performance degradation. With an aggressive forward body bias voltage such as +0.6 V for the lowest Vth, performance is increased by up to 10%. Another BBVS of +0.6 V, 0.0 V, and -0.8 V reduces static power to 14.06% while maintaining a 10% performance increase, but it requires 2.75-fold area.
FPGA-Based Intrusion Detection System for 10 Gigabit Ethernet
Toshihiro KATASHITA Yoshinori YAMAGUCHI Atusi MAEDA Kenji TODA

PAPER-Reconfigurable System and Applications

Vol:
E90-D No:12
Page(s):
1923-1931
The present paper describes an implementation of an intrusion detection system (IDS) on an FPGA for 10 Gigabit Ethernet. The system includes an exact string matching circuit for 1,225 Snort rules on a single device. A number of studies have examined string matching circuits for IDS. However, implementing a circuit that processes a large rule set at high throughput is difficult. In a previous study, we proposed a method for generating an NFA-based string matching circuit that has expandability of processing data width and drastically reduced resource requirements. In the present paper, we implement an IDS circuit that processes 1,225 Snort rules at 10 Gbps with a single Xilinx Virtex-II Pro xc2vp-100 using the NFA-based method. The proposed circuit also provides packet filtering for an intrusion protection system (IPS). In addition, we developed a tool for automatically generating the Verilog HDL source code of the IDS circuit from a Snort rule set. Using the FPGA and the IDS circuit generator, the proposed system is able to update the matching rules corresponding to new intrusions and attacks. We implemented the IDS circuit on an FPGA board and evaluated its accuracy and throughput. As a result, we confirmed in a test that the circuit detects attacks perfectly at the wire speed of 10 Gigabit Ethernet.
Multiple Sequence Alignment Based on Dynamic Programming Using FPGA
Shingo MASUNO Tsutomu MARUYAMA Yoshiki YAMAGUCHI Akihiko KONAGAYA

PAPER-Reconfigurable System and Applications

Vol:
E90-D No:12
Page(s):
1939-1946
Multiple sequence alignment problems in computational biology have attracted attention recently because of the rapid growth of sequence databases. By computing alignment, we can understand similarity among the sequences. Many hardware systems for alignment have been proposed to date, but most of them are designed for two-dimensional alignment (alignment between two sequences) because of the complexity of calculating alignment among more than two sequences under limited hardware resources. In this paper, we describe a compact system with an off-the-shelf FPGA board and a host computer for more than three-dimensional alignment based on dynamic programming. In our approach, high performance is achieved (1) by configuring an optimal circuit for each dimensional alignment, and (2) by a two-phase search in each dimension by reconfiguration. In order to realize multidimensional search with a common architecture, two-dimensional dynamic programming is repeated along other dimensions. With this approach, we can minimize the size of units for alignment and achieve high parallelism. Our system with one XC2V6000 enables about 300-fold speedup as compared with a single Intel Pentium4 2 GHz processor for four-dimensional alignment, and 100-fold speedup for five-dimensional alignment.
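For readers unfamiliar with the two-dimensional dynamic programming that the abstract uses as its building block, here is a minimal software sketch of a global pairwise alignment score. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions for illustration; the abstract does not give the scoring scheme, and this sketch does not model the multidimensional search or the FPGA circuit.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score of two sequences by dynamic programming.

    Only the previous row of the score table is kept, so memory use is
    proportional to len(b) rather than len(a) * len(b).
    """
    # Row 0: aligning an empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # column 0: prefix of `a` against an empty prefix
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = cur[j - 1] + gap   # gap inserted in `a`
            cur.append(max(diag, up, left))
        prev = cur
    return prev[-1]
```

Each cell depends only on its upper, left, and upper-left neighbours, which is why cells along an anti-diagonal can be computed in parallel in hardware.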
Basic Characteristics and Learning Potential of a Digital Spiking Neuron
Hiroyuki TORIKAI

PAPER-Neuron and Neural Networks

Vol:
E90-A No:10
Page(s):
2093-2100
The digital spiking neuron (DSN) consists of digital state cells and behaves like a simplified neuron model. By adjusting wirings among the cells, the DSN can generate spike-trains with various characteristics. In this paper we present a theorem that clarifies basic relations between change of wirings and change of characteristics of the spike-train. Also, in order to explore learning potential of the DSN, we propose a learning algorithm for generating spike-trains that are suited to an application example. We then show significances and basic roles of the presented theorem in the learning dynamics.
A 90 nm 48×48 LUT-Based FPGA Enhancing Speed and Yield Utilizing Within-Die Delay Variations
Kazutoshi KOBAYASHI Kazuya KATSUKI Manabu KOTANI Yuuri SUGIHARA Yohei KUME Hidetoshi ONODERA

PAPER-Low-Power and High-Performance VLSI Circuit Technology

Vol:
E90-C No:10
Page(s):
1919-1926
We have fabricated a LUT-based FPGA device with functionalities for measuring within-die variations in a 90 nm process. Variations are measured using ring oscillators implemented as a configuration of the FPGA. Random variations are dominant in a 48×48 configurable array laid out in a 3 mm × 3 mm square region. The device can measure delays on actual signal paths between flip-flops by providing two clock pulses. Measured variations are used to maximize the operating frequency of each device by choosing the optimal paths. Optimization of routing paths using a simple model circuit reveals that the performance of the circuit is enhanced by 2.88% on average and by a maximum of 9.34%.
Wide View Imaging System Using Eight Random Access Image Sensors
Kenji IDE Ryusuke KAWAHARA Satoshi SHIMIZU Takayuki HAMAMOTO

PAPER-Image Sensor/Vision Chip

Vol:
E90-C No:10
Page(s):
1884-1891
We have investigated real-time object tracking using a wide view imaging system. For the system, we have designed and fabricated a new smart image sensor with four functions effective in wide view imaging, such as a random access function. In this system, eight smart sensors and an octagonal mirror are used and each image obtained by the sensors is equivalent to a partial image of the wide view. In addition, by using an FPGA for processing, the circuits in this system can be scaled down and a panoramic image can be obtained in real time. For object tracking using this system, the object-detection method based on background subtraction is used. When moving objects are detected in the panoramic image, the objects are constantly displayed on the monitor at higher resolution in real time. In this paper, we describe the random access image sensor and show some results obtained using this sensor. In addition, we describe the wide view imaging system using eight sensors. Furthermore, we explain the method of object tracking in this system and show the results of real-time multiple-object tracking.
Efficient Motion Estimation for H.264 Codec by Using Effective Scan Ordering
Jeongae PARK Misun YOON Hyunchul SHIN

LETTER-Devices/Circuits for Communications

Vol:
E90-B No:7
Page(s):
1839-1843
Motion estimation (ME) is a computation intensive procedure in H.264. In ME for variable block sizes, an effective scan ordering method has been devised for early termination of absolute difference computation when the termination does not affect the performance. The new ME circuit with effective scan ordering can reduce the amount of computation by 70% compared to JM8.2 and by 30% compared to the disable approximation unit (DAU) approach.
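The early-termination idea can be sketched in software as follows: a sum of absolute differences (SAD) only grows as more pixels are accumulated, so a candidate block can be abandoned as soon as its partial SAD reaches the best SAD found so far, without changing the result. The plain row-by-row scan below is an assumption for illustration; the paper's contribution is a more effective scan ordering, which the abstract does not describe, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Block = Sequence[Sequence[int]]


def sad_early_exit(cur: Block, ref: Block, best_so_far: float) -> Optional[int]:
    """Return the SAD of two equally sized blocks, or None if abandoned.

    The partial sum is checked after each row (assumed scan order); once it
    reaches `best_so_far` the candidate cannot win, so the rest is skipped.
    """
    total = 0
    for cur_row, ref_row in zip(cur, ref):
        for a, b in zip(cur_row, ref_row):
            total += abs(a - b)
        if total >= best_so_far:
            return None
    return total


def best_match(cur: Block, candidates: List[Block]) -> Tuple[int, float]:
    """Index and SAD of the best candidate; ties keep the earliest one."""
    best_idx, best_sad = -1, float("inf")
    for idx, ref in enumerate(candidates):
        sad = sad_early_exit(cur, ref, best_sad)
        if sad is not None:
            best_idx, best_sad = idx, sad
    return best_idx, best_sad
```

How early a poor candidate is rejected depends on which pixels are visited first, which is where a scan ordering can reduce the amount of computation.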
Design Methods of Radix Converters Using Arithmetic Decompositions
Yukihiro IGUCHI Tsutomu SASAO Munehiro MATSUURA

PAPER-Computer Components

Vol:
E90-D No:6
Page(s):
905-914
In arithmetic circuits for digital signal processing, radixes other than two are often used to make circuits faster. In such cases, radix converters are necessary. However, in general, radix converters tend to be complex. This paper considers design methods for p-nary to binary converters. First, it considers Look-Up Table (LUT) cascade realizations. Then, it introduces a new design technique called arithmetic decomposition by using LUTs and adders. Finally, it compares the amount of hardware and performance of radix converters implemented by FPGAs. 12-digit ternary to binary converters on Cyclone II FPGAs designed by the proposed method are faster than ones by conventional methods.
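One way to picture a converter built from LUTs and adders is to split a 12-digit ternary number into two 6-digit halves, look up each half's binary value in a table, and add the results. The sketch below illustrates that general idea only; the 6/6 split, the table layout, and all names are assumptions, and it is not claimed to be the arithmetic decomposition proposed in the paper.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def ternary_to_int(digits: Sequence[int]) -> int:
    """Reference conversion by Horner's rule, most significant digit first."""
    value = 0
    for d in digits:
        if d not in (0, 1, 2):
            raise ValueError(f"not a ternary digit: {d}")
        value = value * 3 + d
    return value


HALF = 6  # assumed split point: two 6-digit halves of a 12-digit number

# Each table has 3**6 = 729 entries. LUT_HIGH stores values already scaled
# by 3**6, so no multiplier is needed when the two halves are combined.
LUT_LOW: Dict[Tuple[int, ...], int] = {
    digits: ternary_to_int(digits) for digits in product(range(3), repeat=HALF)
}
LUT_HIGH: Dict[Tuple[int, ...], int] = {
    digits: value * 3 ** HALF for digits, value in LUT_LOW.items()
}


def convert_12_digit(digits: Sequence[int]) -> int:
    """Convert 12 ternary digits with two table lookups and one addition."""
    if len(digits) != 2 * HALF:
        raise ValueError("expected exactly 12 ternary digits")
    high, low = tuple(digits[:HALF]), tuple(digits[HALF:])
    return LUT_HIGH[high] + LUT_LOW[low]
```

A single table indexed by all 12 digits would need 3**12 = 531441 entries, while the two half-size tables need 729 entries each at the cost of one adder.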
A Performance-Driven Circuit Bipartitioning Method Considering Time-Multiplexed I/Os
Masato INAGI Yasuhiro TAKASHIMA Yuichi NAKAMURA Yoji KAJITANI

PAPER

Vol:
E90-A No:5
Page(s):
924-931
Lately, time-multiplexed I/Os for multi-device implementations (e.g., multi-FPGA systems) have come into practical use. They realize multiple I/O signal transmissions between two devices in one system clock cycle using one I/O wire between the devices and multiple I/O clock cycles. Though they ease the limitation on the number of I/O pins of each device, the system clock period becomes much longer, approximately in proportion to the maximum number of multiplexed I/Os on a signal path. There is no conventional partitioning algorithm that directly considers the effect of time-multiplexed I/Os. We introduce a new cost function for evaluating the suitability of a bipartition for multi-device implementations with time-multiplexed I/Os. We propose a performance-driven bipartitioning method, VIOP, which minimizes the value of the cost function. VIOP combines three algorithms: i) min-cut partitioning, ii) coarse performance-driven partitioning, and iii) fine performance-driven partitioning. For min-cut partitioning and coarse performance-driven partitioning, we employ the well-known conventional bipartitioning algorithms CLIP-FM and DUBA, respectively. For fine performance-driven partitioning, which makes the final improvement of a partition, we propose a partitioning algorithm, CAVP. With VIOP, the average cost was improved by 10.4% compared with the well-known algorithms.
