IEICE global.ieice.org Site

Keyword Search Result

[Keyword] FPGA-based(7hit)

1-7hit

Approximate FPGA-Based Multipliers Using Carry-Inexact Elementary Modules
Yi GUO Heming SUN Ping LEI Shinji KIMURA

PAPER

Vol:
E103-A No:9
Page(s):
1054-1062
Approximate multiplier design is an effective technique to improve hardware performance at the cost of accuracy loss. The current approximate multipliers are mostly ASIC-based and are dedicated for one particular application. In contrast, FPGA has been an attractive choice for many applications because of its high performance, reconfigurability, and fast development round. This paper presents a novel methodology for designing approximate multipliers by employing the FPGA-based fabrics (primarily look-up tables and carry chains). The area and latency are significantly reduced by applying approximation on carry results and cutting the carry propagation path in the multiplier. Moreover, we explore higher-order multipliers on architectural space by using our proposed small-size approximate multipliers as elementary modules. For different accuracy-hardware requirements, eight configurations for approximate 8×8 multiplier are discussed. In terms of mean relative error distance (MRED), the error of the proposed 8×8 multiplier is as low as 1.06%. Compared with the exact multiplier, our proposed design can reduce area by 43.66% and power by 24.24%. The critical path latency reduction is up to 29.50%. The proposed multiplier design has a better accuracy-hardware tradeoff than other designs with comparable accuracy. Moreover, image sharpening processing is used to assess the efficiency of approximate multipliers on application.
A 3Gbps/Lane MIPI D-PHY Transmission Buffer Chip
Pil-Ho LEE Young-Chan JANG

LETTER

Vol:
E102-A No:6
Page(s):
783-787
A 3Gbps/lane transmission buffer chip including a high-speed mode detector is proposed for a field-programmable gate array (FPGA)-based frame generator supporting the mobile industry processor interface (MIPI) D-PHY version 1.2. It performs 1-to-3 repeat while buffering low voltage differential signaling (LVDS) or scalable low voltage signaling (SLVS) to SLVS.
A 10 Gbps D-PHY Transmitter Bridge Chip for FPGA-Based Frame Generator Supporting MIPI DSI of Mobile Display
Ho-Seong KIM Pil-Ho LEE Jin-Wook HAN Seung-Hun SHIN Seung-Wuk BAEK Doo-Ill PARK Yongkyu SEO Young-Chan JANG

BRIEF PAPER

Vol:
E100-C No:11
Page(s):
1035-1038
A 10 Gbps transmitter bridge chip including four data lanes, which increases the bandwidth using an 8-to-1 serialization, is proposed for a field-programmable gate array (FPGA)-based frame generator to support the protocol of the D-PHY version 1.2 for the mobile industry processor interface (MIPI) display serial interface (DSI).
Efficient Memory Utilization for High-Speed FPGA-Based Hardware Emulators with SDRAMs
Kohei HOSOKAWA Katsunori TANAKA Yuichi NAKAMURA

PAPER-System Level Design

Vol:
E90-A No:12
Page(s):
2810-2817
FPGA-based hardware emulators are often used for the verification of LSI functions. They generally have dedicated external memories, such as SDRAMs, to compensate for the lack of memory capacity in FPGAs. In such a case, access between the FPGAs and the dedicated external memory may represent a major bottleneck with respect to emulation speed since the dedicated external memory may have to emulate a large number of memory blocks. In this paper, we propose three methods, "Dynamic Clock Control (DCC)," "Memory Mapping Optimization (MMO)," and "Efficient Access Scheduling (EAS)," to avoid this bottleneck. DCC controls an emulation clock dynamically in accord with the number of memory accesses within one emulation clock cycle. EAS optimizes the ordering of memory access to the dedicated external memory, and MMO optimizes the arrangement of the dedicated external memory addresses to which respective memories will be emulated. With them, emulation speed can be made 29.0 times faster, as evaluated in actual LSI emulations.
Hardware n Choose k Counters with Applications to the Partial Exhaustive Search
Koji NAKANO Youhei YAMAGISHI

PAPER-Programmable Logic, VLSI, CAD and Layout

Vol:
E88-D No:7
Page(s):
1350-1359
The main contribution of this work is to present several hardware implementations of an "n choose k" counter (C(n,k) counter for short), which lists all n-bit numbers with (n-k) 0's and k 1's, and to show their applications. We first present concepts of C(n,k) counters and their efficient implementations on an FPGA. We then go on to evaluate their performance in terms of the number of used slices and the clock frequency for the Xilinx VirtexII family FPGA XC2V3000-4. As one of the real life applications, we use a C(n,k) counter to accelerate a digital halftoning method that generates a binary image reproducing an original gray-scale image. This method repeatedly replaces an image pattern in small square regions of a binary image by the best one. By the partial exhaustive search using a C(n,k) counter we succeeded in accelerating the task of finding the best image pattern and achieved a speedup factor of more than 2.5 over the simple exhaustive search.
An Image Retrieval System Using FPGAs
Koji NAKANO Etsuko TAKAMICHI

PAPER

Vol:
E86-D No:5
Page(s):
811-818
The main contribution of this paper is to present an image retrieval system using FPGAs. Given a template image T and a database of a number of images I1, I2,, our system lists all images that contain a subimage similar to T. More specifically, a hardware generator in our system creates the Verilog HDL source of a hardware that determines whether Ii has a similar subimage to T for any image Ii and a particular template T. The created Verilog HDL source is compiled and embedded in an FPGA using the design tool provided by the FPGA vendor. Since the hardware embedded in the FPGA is designed for a particular template T, it is an instance-specific hardware that allows us to achieve extreme acceleration. We evaluate the performance of our image matching hardware using a PCI-connected Xilinx FPGA and a timing analyzer. Since the generated hardware attains up to 3000 speed-up factor over the software solution, our approach is promising.
An FPGA-Oriented Motion-Stereo Processor with a Simple Interconnection Network for Parallel Memory Access
Seunghwan LEE Masanori HARIYAMA Michitaka KAMEYAMA

PAPER-Image Processing, Image Pattern Recognition

Vol:
E83-D No:12
Page(s):
2122-2130
In designing a field-programmable gate array (FPGA)-based processor for motion stereo, a parallel memory system and a simple interconnection network for parallel data transfer are essential for parallel image processing. This paper, firstly, presents an FPGA-oriented hierarchical memory system. To reduce the bandwidth requirement between an on-chip memory in an FPGA and external memories, we propose an efficient scheduling: Once pixels are transferred to the on-chip memory, operations associated with the data are consecutively performed. Secondly, a rectangular memory allocation is proposed which allocates pixels to be accessed in parallel onto different memory modules of the on-chip memory. Consequently, completely parallel access can be achieved. The memory allocation also minimizes the required capacity of the on-chip memory and thus is suitable for FPGA-based implementation. Finally, a functional unit allocation is proposed to minimize the complexity between memory modules and functional units. An experimental result shows that the performance of the processor becomes 96 times higher than that of a 400 MHz Pentium II.