In this letter, a low-latency, high-throughput, and hardware-efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to a real-valued model and then applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on the coordinate rotation digital computer (CORDIC) algorithm, which performs the QRD directly in the complex domain and then converts the complex result to its real counterpart. The proposed scheme greatly improves processing parallelism and curtails the nullification and sorting procedures. We also design the corresponding pipelined hardware architecture of the MMSE-SQRD, based on a highly parallel Givens rotation structure with the CORDIC algorithm, for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55 nm CMOS technology, achieving a throughput of up to 50M QRD/s and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared with previous works, the proposed design achieves the highest normalized throughput efficiency and the lowest processing latency.
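As a rough illustration of the CORDIC-based Givens rotation named in the abstract, the sketch below shows the textbook floating-point form of the two CORDIC modes: vectoring (which nulls one component and yields the rotation angle) and rotation (which applies that angle to other element pairs). It is not the authors' pipelined fixed-point architecture; the function names, the 16-iteration default, and the floating-point arithmetic are assumptions made for readability.

```python
import math

def _gain(iterations):
    # Accumulated scale factor K of the shift-and-add micro-rotations.
    k = 1.0
    for i in range(iterations):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

def cordic_vectoring(x, y, iterations=16):
    """Drive y toward 0; return (magnitude, angle) of the input vector."""
    angle = 0.0
    if x < 0:  # pre-rotate by pi so the vector lies in the right half-plane
        angle = math.pi if y >= 0 else -math.pi
        x, y = -x, -y
    for i in range(iterations):
        t = 2.0 ** (-i)
        if y >= 0:   # rotate clockwise
            x, y, angle = x + y * t, y - x * t, angle + math.atan(t)
        else:        # rotate counter-clockwise
            x, y, angle = x - y * t, y + x * t, angle - math.atan(t)
    return x / _gain(iterations), angle

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) counter-clockwise by angle (radians, within [-pi, pi])."""
    if angle > math.pi / 2:       # bring the angle into the convergence range
        x, y, angle = -x, -y, angle - math.pi
    elif angle < -math.pi / 2:
        x, y, angle = -x, -y, angle + math.pi
    for i in range(iterations):
        t = 2.0 ** (-i)
        if angle >= 0:
            x, y, angle = x - y * t, y + x * t, angle - math.atan(t)
        else:
            x, y, angle = x + y * t, y - x * t, angle + math.atan(t)
    k = _gain(iterations)
    return x / k, y / k

# Givens-style nulling of the lower entry of the column (3, 4):
r, theta = cordic_vectoring(3.0, 4.0)        # r approximates 5
top, bottom = cordic_rotate(3.0, 4.0, -theta)  # bottom approximates 0
```

In a QRD, the angle found by vectoring on the pivot column would be applied in rotation mode to the remaining columns of the same two rows; in hardware the multiplications by powers of two become shifts.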
Lu SUN
Institute of Microelectronics of the Chinese Academy of Sciences (IMECAS), University of Chinese Academy of Sciences (UCAS)
Bin WU
Institute of Microelectronics of the Chinese Academy of Sciences (IMECAS)
Tianchun YE
Institute of Microelectronics of the Chinese Academy of Sciences (IMECAS)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Lu SUN, Bin WU, Tianchun YE, "Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors" in IEICE TRANSACTIONS on Fundamentals,
vol. E104-A, no. 4, pp. 762-767, April 2021, doi: 10.1587/transfun.2020EAL2076.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2020EAL2076/_p
@ARTICLE{e104-a_4_762,
author={Lu SUN and Bin WU and Tianchun YE},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors},
year={2021},
volume={E104-A},
number={4},
pages={762--767},
abstract={In this letter, a low-latency, high-throughput, and hardware-efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to a real-valued model and then applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on the coordinate rotation digital computer (CORDIC) algorithm, which performs the QRD directly in the complex domain and then converts the complex result to its real counterpart. The proposed scheme greatly improves processing parallelism and curtails the nullification and sorting procedures. We also design the corresponding pipelined hardware architecture of the MMSE-SQRD, based on a highly parallel Givens rotation structure with the CORDIC algorithm, for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55 nm CMOS technology, achieving a throughput of up to 50M QRD/s and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared with previous works, the proposed design achieves the highest normalized throughput efficiency and the lowest processing latency.},
keywords={},
doi={10.1587/transfun.2020EAL2076},
ISSN={1745-1337},
month={April},}
TY - JOUR
TI - Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 762
EP - 767
AU - Lu SUN
AU - Bin WU
AU - Tianchun YE
PY - 2021
DO - 10.1587/transfun.2020EAL2076
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E104-A
IS - 4
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - April 2021
AB - In this letter, a low-latency, high-throughput, and hardware-efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to a real-valued model and then applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on the coordinate rotation digital computer (CORDIC) algorithm, which performs the QRD directly in the complex domain and then converts the complex result to its real counterpart. The proposed scheme greatly improves processing parallelism and curtails the nullification and sorting procedures. We also design the corresponding pipelined hardware architecture of the MMSE-SQRD, based on a highly parallel Givens rotation structure with the CORDIC algorithm, for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55 nm CMOS technology, achieving a throughput of up to 50M QRD/s and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared with previous works, the proposed design achieves the highest normalized throughput efficiency and the lowest processing latency.
ER -