This paper proposes efficient DSP instructions and a supporting hardware architecture for the Viterbi algorithm. Implementing the Viterbi algorithm on a DSP chip has attracted growing interest because of the flexibility and programmability that a DSP offers. The proposed architecture reduces the Trace Back (TB) latency and can support various wireless communication standards. The proposed instructions perform the Add Compare Select (ACS) and TB operations in parallel, and the architecture includes special hardware, called the Offset Calculation Unit (OCU), which automatically calculates data addresses to accelerate the trellis butterfly computations. When the constraint length K is 5, the proposed architecture reduces the number of decoding cycles by about 17% compared with the Carmel DSP and by about 45% compared with the TMS320C55x.
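For readers unfamiliar with the terms in the abstract, the following is a minimal software sketch of the ACS butterfly and the trace back for K = 5. It is not the paper's instruction set or the OCU. The generator polynomials (octal 23 and 35), the rate of 1/2, the hard-decision Hamming metric, and all function names are assumptions chosen for illustration. The index arithmetic `2 * j + lsb` and `j + bit * (N_STATES // 2)` is the kind of butterfly address calculation that, according to the abstract, the OCU performs in hardware. Here the ACS and TB phases run one after the other, whereas the proposed instructions run them in parallel.

```python
K = 5                       # constraint length, as in the abstract's example
N_STATES = 1 << (K - 1)     # 16 trellis states
POLYS = (0o23, 0o35)        # ASSUMED rate-1/2 generators, not taken from the paper


def parity(x):
    return bin(x).count("1") & 1


def branch_output(state, bit):
    # Shift register: the new input bit sits above the K-1 state bits.
    reg = (bit << (K - 1)) | state
    return [parity(reg & g) for g in POLYS]


def next_state(state, bit):
    return ((bit << (K - 1)) | state) >> 1


def encode(bits):
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):    # K-1 zero bits flush the encoder
        out.extend(branch_output(state, b))
        state = next_state(state, b)
    return out


def decode(received):
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)  # the encoder starts in state 0
    decisions = []                          # one list of survivor bits per step
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_metrics = [inf] * N_STATES
        dec = [0] * N_STATES
        for j in range(N_STATES // 2):      # one butterfly per j
            for bit in (0, 1):
                s = j + bit * (N_STATES // 2)       # destination state
                cands = []
                for lsb in (0, 1):
                    p = 2 * j + lsb                 # predecessor state
                    out = branch_output(p, bit)
                    bm = (out[0] != r[0]) + (out[1] != r[1])
                    cands.append(metrics[p] + bm)   # Add
                dec[s] = 1 if cands[1] < cands[0] else 0   # Compare
                new_metrics[s] = min(cands)                # Select
        metrics = new_metrics
        decisions.append(dec)
    # Trace back from state 0, which the flush bits guarantee as the end state.
    state, bits = 0, []
    for dec in reversed(decisions):
        bits.append(state >> (K - 2))       # the input bit is the state's MSB
        state = ((state << 1) | dec[state]) & (N_STATES - 1)
    bits.reverse()
    return bits[:len(bits) - (K - 1)]       # drop the flush bits
```

A typical use is `decode(encode(message))`, optionally with a bit of the encoded sequence flipped to imitate a channel error.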
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Weon Heum PARK, Myung Hoon SUNWOO, Seong Keun OH, "Efficient DSP Architecture for Viterbi Decoding with Small Trace Back Latency" in IEICE TRANSACTIONS on Communications,
vol. E89-B, no. 10, pp. 2813-2818, October 2006, doi: 10.1093/ietcom/e89-b.10.2813.
URL: https://global.ieice.org/en_transactions/communications/10.1093/ietcom/e89-b.10.2813/_p
@ARTICLE{e89-b_10_2813,
author={Weon Heum PARK and Myung Hoon SUNWOO and Seong Keun OH},
journal={IEICE TRANSACTIONS on Communications},
title={Efficient DSP Architecture for Viterbi Decoding with Small Trace Back Latency},
year={2006},
volume={E89-B},
number={10},
pages={2813-2818},
abstract={This paper proposes efficient DSP instructions and a supporting hardware architecture for the Viterbi algorithm. Implementing the Viterbi algorithm on a DSP chip has attracted growing interest because of the flexibility and programmability that a DSP offers. The proposed architecture reduces the Trace Back (TB) latency and can support various wireless communication standards. The proposed instructions perform the Add Compare Select (ACS) and TB operations in parallel, and the architecture includes special hardware, called the Offset Calculation Unit (OCU), which automatically calculates data addresses to accelerate the trellis butterfly computations. When the constraint length K is 5, the proposed architecture reduces the number of decoding cycles by about 17% compared with the Carmel DSP and by about 45% compared with the TMS320C55x.},
keywords={},
doi={10.1093/ietcom/e89-b.10.2813},
ISSN={1745-1345},
month={October},}
TY - JOUR
TI - Efficient DSP Architecture for Viterbi Decoding with Small Trace Back Latency
T2 - IEICE TRANSACTIONS on Communications
SP - 2813
EP - 2818
AU - Weon Heum PARK
AU - Myung Hoon SUNWOO
AU - Seong Keun OH
PY - 2006
DO - 10.1093/ietcom/e89-b.10.2813
JO - IEICE TRANSACTIONS on Communications
SN - 1745-1345
VL - E89-B
IS - 10
JA - IEICE TRANSACTIONS on Communications
Y1 - October 2006
AB - This paper proposes efficient DSP instructions and a supporting hardware architecture for the Viterbi algorithm. Implementing the Viterbi algorithm on a DSP chip has attracted growing interest because of the flexibility and programmability that a DSP offers. The proposed architecture reduces the Trace Back (TB) latency and can support various wireless communication standards. The proposed instructions perform the Add Compare Select (ACS) and TB operations in parallel, and the architecture includes special hardware, called the Offset Calculation Unit (OCU), which automatically calculates data addresses to accelerate the trellis butterfly computations. When the constraint length K is 5, the proposed architecture reduces the number of decoding cycles by about 17% compared with the Carmel DSP and by about 45% compared with the TMS320C55x.
ER -