IEICE global.ieice.org Site

Author Search Result

[Author] Heungseop AHN(1hit)

1-1hit

A Novel Procedure for Implementing a Turbo Decoder on a GPU with Coalesced Memory Access
Heungseop AHN Seungwon CHOI

PAPER-Communication Theory and Signals

Vol:
E100-A No:5
Page(s):
1188-1196
The sub-blocking algorithm has been known as a core component in implementing a turbo decoder using a Graphic Processing Unit (GPU) to use as many cores in the GPU as possible for parallel processing. However, even though the sub-blocking algorithm allows a large number of threads in a given GPU to be adopted for processing a large number of sub-blocks in parallel, each thread must access the global memory with strided addresses, which results in uncoalesced memory access. Because uncoalesced memory access causes a lot of unnecessary memory transactions, the memory bandwidth efficiency drops significantly, possibly as low as 1/8 in the case of an Long Term Evolution (LTE) turbo decoder, depending upon the compute capability of a GPU. In this paper, we present a novel method for converting uncoalesced memory access into coalesced access in a way that completely recovers the memory bandwidth efficiency to 100% without additional overhead. Our experimental tests, performed with NVIDIA's Geforce GTX 780 Ti GPU, show that the proposed method can enhance the throughput by nearly 30% compared with a conventional turbo decoder that suffers from uncoalesced memory access. Throughput provided by the proposed method has been observed to be 51.4Mbps when the number of iterations and that of sub-blocks are set to 6 and 32, respectively, in our experimental tests, which far exceeds the performance of previous works implemented the Max-Log-MAP algorithm.

Author Search Result

[Author] Heungseop AHN(1hit)

A Novel Procedure for Implementing a Turbo Decoder on a GPU with Coalesced Memory Access

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles