The parallel interference cancellation (PIC) multiple input multiple output (MIMO) detection algorithm has bit error ratio (BER) performance comparable to the maximum likelihood (ML) algorithm but with complexity close to the simple linear detection algorithm such as zero forcing (ZF), minimum mean squared error (MMSE), and successive interference cancellation (SIC), etc. However, the throughput of PIC MIMO detector on central processing unit (CPU) cannot meet the requirement of wireless protocols. In order to reach the throughput required by the standards, the graphics processing unit (GPU) is exploited in this paper as the modem processor to accelerate the processing procedure of PIC MIMO detector. The parallelism of PIC algorithm is analyzed and the two-stage PIC detection is carefully developed to efficiently match the multi-core architecture. Several optimization methods are employed to enhance the throughput, such as the memory optimization and asynchronous data transfer. The experiment shows that our MIMO detector has excellent BER performance and the peak throughput is 337.84 Mega bits per second (Mbps), about 7x to 16x faster than that of CPU implementation with SSE2 optimization methods. The implemented MIMO detector has better computing throughput than recent GPU-based implementations.
Rongchun LI
National University of Defense Technology
Yong DOU
National University of Defense Technology
Jie ZHOU
National University of Defense Technology
Chen CHEN
National University of Defense Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Rongchun LI, Yong DOU, Jie ZHOU, Chen CHEN, "Efficient Parallel Interference Cancellation MIMO Detector for Software Defined Radio on GPUs" in IEICE TRANSACTIONS on Fundamentals,
vol. E97-A, no. 6, pp. 1388-1395, June 2014, doi: 10.1587/transfun.E97.A.1388.
Abstract: The parallel interference cancellation (PIC) multiple input multiple output (MIMO) detection algorithm has bit error ratio (BER) performance comparable to the maximum likelihood (ML) algorithm but with complexity close to the simple linear detection algorithm such as zero forcing (ZF), minimum mean squared error (MMSE), and successive interference cancellation (SIC), etc. However, the throughput of PIC MIMO detector on central processing unit (CPU) cannot meet the requirement of wireless protocols. In order to reach the throughput required by the standards, the graphics processing unit (GPU) is exploited in this paper as the modem processor to accelerate the processing procedure of PIC MIMO detector. The parallelism of PIC algorithm is analyzed and the two-stage PIC detection is carefully developed to efficiently match the multi-core architecture. Several optimization methods are employed to enhance the throughput, such as the memory optimization and asynchronous data transfer. The experiment shows that our MIMO detector has excellent BER performance and the peak throughput is 337.84 Mega bits per second (Mbps), about 7x to 16x faster than that of CPU implementation with SSE2 optimization methods. The implemented MIMO detector has better computing throughput than recent GPU-based implementations.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E97.A.1388/_p
Copy
@ARTICLE{e97-a_6_1388,
author={Rongchun LI, Yong DOU, Jie ZHOU, Chen CHEN, },
journal={IEICE TRANSACTIONS on Fundamentals},
title={Efficient Parallel Interference Cancellation MIMO Detector for Software Defined Radio on GPUs},
year={2014},
volume={E97-A},
number={6},
pages={1388-1395},
abstract={The parallel interference cancellation (PIC) multiple input multiple output (MIMO) detection algorithm has bit error ratio (BER) performance comparable to the maximum likelihood (ML) algorithm but with complexity close to the simple linear detection algorithm such as zero forcing (ZF), minimum mean squared error (MMSE), and successive interference cancellation (SIC), etc. However, the throughput of PIC MIMO detector on central processing unit (CPU) cannot meet the requirement of wireless protocols. In order to reach the throughput required by the standards, the graphics processing unit (GPU) is exploited in this paper as the modem processor to accelerate the processing procedure of PIC MIMO detector. The parallelism of PIC algorithm is analyzed and the two-stage PIC detection is carefully developed to efficiently match the multi-core architecture. Several optimization methods are employed to enhance the throughput, such as the memory optimization and asynchronous data transfer. The experiment shows that our MIMO detector has excellent BER performance and the peak throughput is 337.84 Mega bits per second (Mbps), about 7x to 16x faster than that of CPU implementation with SSE2 optimization methods. The implemented MIMO detector has better computing throughput than recent GPU-based implementations.},
keywords={},
doi={10.1587/transfun.E97.A.1388},
ISSN={1745-1337},
month={June},}
Copy
TY - JOUR
TI - Efficient Parallel Interference Cancellation MIMO Detector for Software Defined Radio on GPUs
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1388
EP - 1395
AU - Rongchun LI
AU - Yong DOU
AU - Jie ZHOU
AU - Chen CHEN
PY - 2014
DO - 10.1587/transfun.E97.A.1388
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E97-A
IS - 6
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - June 2014
AB - The parallel interference cancellation (PIC) multiple input multiple output (MIMO) detection algorithm has bit error ratio (BER) performance comparable to the maximum likelihood (ML) algorithm but with complexity close to the simple linear detection algorithm such as zero forcing (ZF), minimum mean squared error (MMSE), and successive interference cancellation (SIC), etc. However, the throughput of PIC MIMO detector on central processing unit (CPU) cannot meet the requirement of wireless protocols. In order to reach the throughput required by the standards, the graphics processing unit (GPU) is exploited in this paper as the modem processor to accelerate the processing procedure of PIC MIMO detector. The parallelism of PIC algorithm is analyzed and the two-stage PIC detection is carefully developed to efficiently match the multi-core architecture. Several optimization methods are employed to enhance the throughput, such as the memory optimization and asynchronous data transfer. The experiment shows that our MIMO detector has excellent BER performance and the peak throughput is 337.84 Mega bits per second (Mbps), about 7x to 16x faster than that of CPU implementation with SSE2 optimization methods. The implemented MIMO detector has better computing throughput than recent GPU-based implementations.
ER -