MSD-First On-Line Arithmetic Progressive Processing Implementation for Motion Estimation

Ching-Long SU; Chein-Wei JEN

MSD-First On-Line Arithmetic Progressive Processing Implementation for Motion Estimation

Ching-Long SU, Chein-Wei JEN

Full Text Views

0

Cite this

Summary :

This paper presents a novel digit-level algorithm for motion estimation (ME) and its hardware implementations. It uses the most-significant-digit-first (MSD-first) processing and on-line arithmetic ME components. A dedicated array architecture is also proposed for applications with high-throughput ME. Various fast search algorithms were presented in literatures to reduce the complexity but sacrifice the motion vector (MV) quality. Our MSD-first ME decomposes the summation of absolute differences (SAD) and comparison operations to digit level with MSD-plane first. These comparisons are interleaved into SADs to distinguish the MV as soon as possible. The algorithm precisely extracts the impossible candidates and removes their rest operations. It saves 47.4 % to 64.3 % of SAD computations in full search block matching (FSBM) ME. In the past, the high implementation cost of redundant number system prevented the practical use of on-line arithmetic. Besides, the redundant SAD removal results in irregular data flow in system-level integration. All these problems are solved by our novel architecture design. In this paper, we propose novel architecture designs to solve these problems. Besides, the architecture requires only one memory access per pixel to lower memory bandwidth by extensive data parallelism and a particular memory addressing while keeping the controller simple. A 4 4 array processor is implemented in 0.35 µm 1P4M CMOS cell library, with 2.84 ns cycle time and 1510 gates. It can support 83 M FSBM operations per second. After normalization, our implementation can support 2.67 times SAD operations per unit area (estimated in gate count) of the conventional two's complement ones. MSD-first ME can realize with other ME algorithms to improve the performance as well.

Publication: IEICE TRANSACTIONS on Information Vol.E86-D No.11 pp.2433-2443

Publication Date: 2003/11/01

Publicized

Online ISSN

DOI

Type of Manuscript: PAPER

Category: Image Processing, Image Pattern Recognition

Cite this

Copy

Ching-Long SU, Chein-Wei JEN, "MSD-First On-Line Arithmetic Progressive Processing Implementation for Motion Estimation" in IEICE TRANSACTIONS on Information, vol. E86-D, no. 11, pp. 2433-2443, November 2003, doi: .
Abstract: This paper presents a novel digit-level algorithm for motion estimation (ME) and its hardware implementations. It uses the most-significant-digit-first (MSD-first) processing and on-line arithmetic ME components. A dedicated array architecture is also proposed for applications with high-throughput ME. Various fast search algorithms were presented in literatures to reduce the complexity but sacrifice the motion vector (MV) quality. Our MSD-first ME decomposes the summation of absolute differences (SAD) and comparison operations to digit level with MSD-plane first. These comparisons are interleaved into SADs to distinguish the MV as soon as possible. The algorithm precisely extracts the impossible candidates and removes their rest operations. It saves 47.4 % to 64.3 % of SAD computations in full search block matching (FSBM) ME. In the past, the high implementation cost of redundant number system prevented the practical use of on-line arithmetic. Besides, the redundant SAD removal results in irregular data flow in system-level integration. All these problems are solved by our novel architecture design. In this paper, we propose novel architecture designs to solve these problems. Besides, the architecture requires only one memory access per pixel to lower memory bandwidth by extensive data parallelism and a particular memory addressing while keeping the controller simple. A 4 4 array processor is implemented in 0.35 µm 1P4M CMOS cell library, with 2.84 ns cycle time and 1510 gates. It can support 83 M FSBM operations per second. After normalization, our implementation can support 2.67 times SAD operations per unit area (estimated in gate count) of the conventional two's complement ones. MSD-first ME can realize with other ME algorithms to improve the performance as well.
URL: https://global.ieice.org/en_transactions/information/10.1587/e86-d_11_2433/_p

Copy

@ARTICLE{e86-d_11_2433,
author={Ching-Long SU, Chein-Wei JEN, },
journal={IEICE TRANSACTIONS on Information},
title={MSD-First On-Line Arithmetic Progressive Processing Implementation for Motion Estimation},
year={2003},
volume={E86-D},
number={11},
pages={2433-2443},
abstract={This paper presents a novel digit-level algorithm for motion estimation (ME) and its hardware implementations. It uses the most-significant-digit-first (MSD-first) processing and on-line arithmetic ME components. A dedicated array architecture is also proposed for applications with high-throughput ME. Various fast search algorithms were presented in literatures to reduce the complexity but sacrifice the motion vector (MV) quality. Our MSD-first ME decomposes the summation of absolute differences (SAD) and comparison operations to digit level with MSD-plane first. These comparisons are interleaved into SADs to distinguish the MV as soon as possible. The algorithm precisely extracts the impossible candidates and removes their rest operations. It saves 47.4 % to 64.3 % of SAD computations in full search block matching (FSBM) ME. In the past, the high implementation cost of redundant number system prevented the practical use of on-line arithmetic. Besides, the redundant SAD removal results in irregular data flow in system-level integration. All these problems are solved by our novel architecture design. In this paper, we propose novel architecture designs to solve these problems. Besides, the architecture requires only one memory access per pixel to lower memory bandwidth by extensive data parallelism and a particular memory addressing while keeping the controller simple. A 4 4 array processor is implemented in 0.35 µm 1P4M CMOS cell library, with 2.84 ns cycle time and 1510 gates. It can support 83 M FSBM operations per second. After normalization, our implementation can support 2.67 times SAD operations per unit area (estimated in gate count) of the conventional two's complement ones. MSD-first ME can realize with other ME algorithms to improve the performance as well.},
keywords={},
doi={},
ISSN={},
month={November},}

Copy

TY - JOUR
TI - MSD-First On-Line Arithmetic Progressive Processing Implementation for Motion Estimation
T2 - IEICE TRANSACTIONS on Information
SP - 2433
EP - 2443
AU - Ching-Long SU
AU - Chein-Wei JEN
PY - 2003
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E86-D
IS - 11
JA - IEICE TRANSACTIONS on Information
Y1 - November 2003
AB - This paper presents a novel digit-level algorithm for motion estimation (ME) and its hardware implementations. It uses the most-significant-digit-first (MSD-first) processing and on-line arithmetic ME components. A dedicated array architecture is also proposed for applications with high-throughput ME. Various fast search algorithms were presented in literatures to reduce the complexity but sacrifice the motion vector (MV) quality. Our MSD-first ME decomposes the summation of absolute differences (SAD) and comparison operations to digit level with MSD-plane first. These comparisons are interleaved into SADs to distinguish the MV as soon as possible. The algorithm precisely extracts the impossible candidates and removes their rest operations. It saves 47.4 % to 64.3 % of SAD computations in full search block matching (FSBM) ME. In the past, the high implementation cost of redundant number system prevented the practical use of on-line arithmetic. Besides, the redundant SAD removal results in irregular data flow in system-level integration. All these problems are solved by our novel architecture design. In this paper, we propose novel architecture designs to solve these problems. Besides, the architecture requires only one memory access per pixel to lower memory bandwidth by extensive data parallelism and a particular memory addressing while keeping the controller simple. A 4 4 array processor is implemented in 0.35 µm 1P4M CMOS cell library, with 2.84 ns cycle time and 1510 gates. It can support 83 M FSBM operations per second. After normalization, our implementation can support 2.67 times SAD operations per unit area (estimated in gate count) of the conventional two's complement ones. MSD-first ME can realize with other ME algorithms to improve the performance as well.
ER -