IEICE global.ieice.org Site

Keyword Search Result

[Keyword] VLSI architectures (5 hits)

Design and Implementation of a Low-Complexity Reed-Solomon Decoder for Optical Communication Systems
Ming-Der SHIEH Yung-Kuei LU

PAPER-Computer System

Vol: E94-D No: 8 Page(s): 1557-1564
A low-complexity Reed-Solomon (RS) decoder design based on the modified Euclidean (ME) algorithm proposed by Truong is presented in this paper. Low complexity is achieved by reformulating Truong's ME algorithm using the proposed polynomial manipulation scheme so that a more compact polynomial representation can be derived. Together with the developed folding scheme and simplified boundary cell, the resulting design effectively reduces the hardware complexity while meeting the throughput requirements of optical communication systems. Experimental results demonstrate that the developed RS(255, 239) decoder, implemented in the TSMC 0.18 µm process, can operate at up to 425 MHz and achieve a throughput rate of 3.4 Gbps with a total gate count of 11,759. Compared to related works, the proposed decoder has the lowest area requirement and the smallest area-time complexity.
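The reported figures can be cross-checked with simple arithmetic. The sketch below assumes 8-bit symbols (the usual choice for RS(255, 239) over GF(2^8)) and one symbol processed per clock cycle; the abstract does not state either assumption, but both are consistent with 425 MHz yielding 3.4 Gbps. The function names are illustrative only.

```python
def rs_parameters(n, k):
    """Return (parity symbols, correctable symbol errors t) for an RS(n, k) code."""
    parity = n - k
    return parity, parity // 2


def throughput_gbps(clock_mhz, symbol_bits, symbols_per_cycle=1):
    """Throughput in Gbps, assuming a fixed number of symbols per clock cycle."""
    # Assumption: the decoder consumes `symbols_per_cycle` symbols every cycle.
    return clock_mhz * symbol_bits * symbols_per_cycle / 1000.0


parity, t = rs_parameters(255, 239)   # 16 parity symbols, corrects up to 8 symbol errors
rate = throughput_gbps(425, 8)        # 425 MHz x 8 bits per symbol
print(parity, t, rate)
```

Under the same assumptions, 375 MHz corresponds to 3 Gbps, which matches the figure quoted for the decoder in the next entry.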
High-Speed Low-Complexity Architecture for Reed-Solomon Decoders
Yung-Kuei LU Ming-Der SHIEH

PAPER-Computer System

Vol: E93-D No: 7 Page(s): 1824-1831
This paper presents a high-speed, low-complexity VLSI architecture based on the modified Euclidean (ME) algorithm for Reed-Solomon decoders. The low-complexity feature of the proposed architecture is obtained by reformulating the error locator and error evaluator polynomials to remove redundant information in the ME algorithm proposed by Truong. This increases the hardware utilization of the processing elements used to solve the key equation and reduces hardware by 30.4%. The proposed architecture retains the high-speed feature of Truong's ME algorithm with a reduced latency, achieved by changing the initial settings of the design. Analytical results show that the proposed architecture has the smallest critical path delay, latency, and area-time complexity in comparison with similar studies. An example RS(255,239) decoder design, implemented using the TSMC 0.18 µm process, can reach a throughput rate of 3 Gbps at an operating frequency of 375 MHz and with a total gate count of 27,271.
VLSI Design of a Fully-Parallel High-Throughput Decoder for Turbo Gallager Codes
Luca FANUCCI Pasquale CIAO Giulio COLAVOLPE

PAPER-Digital Signal Processing

Vol: E89-A No: 7 Page(s): 1976-1986
The most powerful channel coding schemes, namely those based on turbo codes and low-density parity-check (LDPC) Gallager codes, have in common the principle of iterative decoding. However, their respective coding structures and decoding algorithms are substantially different. This paper presents a 2048-bit, rate-1/2 soft-decision decoder for a new class of codes known as Turbo Gallager Codes. These codes are turbo codes with properly chosen component convolutional codes such that they can be successfully decoded by means of the decoding algorithm used for LDPC codes, i.e., the belief propagation algorithm working on the code Tanner graph. These coding schemes are important in practical terms for two reasons: (i) they can be encoded as classical turbo codes, giving a solution to the encoding problem of LDPC codes; (ii) they can also be decoded in a fully parallel manner, partially overcoming the routing congestion bottleneck of parallel decoder VLSI implementations thanks to the locality of the interconnections. The implemented decoder supports data rates of up to 1 Gbit/s and performs up to 48 decoding iterations, ensuring both high throughput and good coding gain. To evaluate the performance and gate complexity of the decoder VLSI architecture, it has been synthesized in a 0.18 µm standard-cell CMOS technology.
Self-Adaptive Algorithmic/Architectural Design for Real-Time, Low-Power Video Systems
Luca FANUCCI Sergio SAPONARA Massimiliano MELANI Pierangelo TERRENI

PAPER-Adaptive Signal Processing

Vol: E88-D No: 7 Page(s): 1538-1545
With reference to video motion estimation in the framework of the H.264/AVC video coding standard, this paper presents algorithmic and architectural solutions for the implementation of context-aware coprocessors in real-time, low-power embedded systems. A low-complexity context-aware controller is added to a conventional Full Search (FS) motion estimation engine. While the FS coprocessor is working, the context-aware controller extracts information about the input signal statistics from the intermediate processing results and uses it to automatically configure the search area size and the number of reference frames of the coprocessor, so that unnecessary computations and memory accesses are avoided. The achieved complexity saving factor ranges from 2.2 to 25 depending on the input signal, while motion estimation accuracy remains unaltered. The increased efficiency is exploited both for (i) processing time reduction in the case of software implementation on a programmable platform and (ii) power consumption reduction in the case of dedicated hardware implementation in CMOS technology.
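To illustrate why the search area size drives the cost of Full Search motion estimation, here is a minimal sketch of block matching with the sum of absolute differences (SAD) as the matching cost. It is not the paper's controller or coprocessor: the SAD criterion, the function names, and the tiny synthetic frames are assumptions made for illustration. The point is only that the number of evaluated candidates grows as (2R + 1)^2 with the search range R.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustively test every displacement within +/- search_range.
    Returns (best motion vector, best SAD, number of candidates evaluated)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, evaluated = (0, 0), None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + n > height or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            evaluated += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, evaluated


# Synthetic 8 x 8 frames: the current frame is the reference shifted by (1, 1).
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 1, 7)] for x in range(8)] for y in range(8)]

wide = full_search(cur, ref, 2, 2, 2, search_range=2)
narrow = full_search(cur, ref, 2, 2, 2, search_range=1)
print(wide, narrow)
```

In this toy case both search ranges find the same motion vector, but the narrower one evaluates 9 candidates instead of 25. A controller that can predict when a small search area suffices saves that work without affecting the result.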
Design of Array Processors for 2-D Discrete Fourier Transform
Shietung PENG Igor SEDUKHIN Stanislav SEDUKHIN

PAPER

Vol: E80-D No: 4 Page(s): 455-465
This paper considers the design of systolic array processors for computing the 2-dimensional Discrete Fourier Transform (2-D DFT). We investigate three different computational schemes for designing systolic array processors using a systematic approach. The systematic approach is guaranteed to find optimal systolic array processors in a large solution space, where optimality is measured in terms of the number of processing elements and I/O channels, the processing time, topology, pipeline period, etc. The optimal systolic array processors are scalable, modular, and suitable for VLSI implementation. An application of the designed systolic array processors to the prime-factor DFT is also presented.
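For reference, the transform itself can be written as a short software sketch. The version below uses the well-known row-column decomposition (1-D DFTs along the rows, then along the columns). The abstract does not say which three computational schemes the paper investigates, so this is only an assumed, generic formulation of the 2-D DFT and not the authors' systolic design.

```python
import cmath


def dft_1d(x):
    """Direct O(n^2) 1-D DFT of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_2d(a):
    """2-D DFT of an M x N matrix via row-column decomposition."""
    m, n = len(a), len(a[0])
    # Step 1: 1-D DFT of every row.
    rows = [dft_1d(row) for row in a]
    # Step 2: 1-D DFT of every column of the intermediate result.
    cols = [dft_1d([rows[i][j] for i in range(m)]) for j in range(n)]
    # cols[j][k] holds X[k][j]; transpose back to row-major order.
    return [[cols[j][k] for j in range(n)] for k in range(m)]


result = dft_2d([[1, 2], [3, 4]])
print(result)
```

Each row transform is independent of the others, as is each column transform. That regularity is what makes the computation a natural fit for arrays of identical processing elements.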