IEICE global.ieice.org Site

Keyword Search Result

[Keyword] h.264(137hit)

121-137hit(137hit)

Scalable VLSI Architecture for Variable Block Size Integer Motion Estimation in H.264/AVC
Yang SONG Zhenyu LIU Satoshi GOTO Takeshi IKENAGA

PAPER

Vol: E89-A No: 4
Page(s): 979-988
Because of the data correlation in the motion estimation (ME) algorithm of the H.264/AVC reference software, it is difficult to implement an efficient ME hardware architecture. To make parallel processing feasible, four modified hardware-friendly ME workflows are proposed in this paper. Based on these workflows, a scalable full-search ME architecture is presented, which has the following characteristics: (1) The sum of absolute differences (SAD) results of 4×4 sub-blocks are accumulated and reused to calculate the SADs of bigger sub-blocks. (2) The number of PE groups is configurable. For a search range of M×N pixels, where M is the width and N is the height, up to M PE groups can be configured to work in parallel, with a peak processing speed of N×16 clock cycles to complete a full-search variable block size ME (VBSME). (3) Only conventional single-port SRAM is required, which makes this architecture suitable for standard-cell-based implementation. A design with 8 PE groups has been realized in TSMC 0.18 µm CMOS technology. The core area is 2.13 mm × 1.60 mm and the clock frequency is 228 MHz under typical conditions (1.8 V, 25 °C).
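The SAD reuse described in characteristic (1) can be illustrated in software. The following is a minimal sketch, not the paper's hardware design: it assumes a 16×16 macroblock given as nested lists, and the names `sad_4x4_grid`, `merge_sads`, and `PARTITIONS` are hypothetical. It computes the sixteen 4×4 SADs once and sums them to obtain the SADs of all larger partitions.

```python
# Illustrative sketch only: software model of 4x4 SAD reuse for one candidate position.

# (width, height) in pixels of the seven H.264 partition modes
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block of a 16x16 block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid

def merge_sads(grid):
    """Build the SADs of every partition mode by summing the 4x4 SADs.

    Returns {(w, h): rows of SADs}, laid out in the same order as the
    partitions inside the macroblock.
    """
    out = {}
    for w, h in PARTITIONS:
        bw, bh = w // 4, h // 4  # partition size measured in 4x4 units
        out[(w, h)] = [
            [sum(grid[r + dy][c + dx] for dy in range(bh) for dx in range(bw))
             for c in range(0, 4, bw)]
            for r in range(0, 4, bh)
        ]
    return out

# Usage: every pixel differs by 2, so each 4x4 SAD is 32.
cur = [[3] * 16 for _ in range(16)]
ref = [[1] * 16 for _ in range(16)]
sads = merge_sads(sad_4x4_grid(cur, ref))
```

Each pixel difference is computed only once; the 41 partition SADs (1 + 2 + 2 + 4 + 8 + 8 + 16) are then obtained purely by addition.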
Low Dynamic Power and Low Leakage Power Techniques for CMOS Motion Estimation Circuits
Nobuaki KOBAYASHI Tomomi EI Tadayoshi ENOMOTO

PAPER-Low Power Techniques

Vol: E89-C No: 3
Page(s): 271-279
To drastically reduce the dynamic power (PAT) and the leakage power (PST) of CMOS MPEG4/H.264 motion estimation (ME) circuits, several power reduction techniques were developed. These comprise circuit architectures that reduce the supply voltage (VDD) and the number of logic gates, not only in the whole circuit but also on the critical path; a fast motion estimation algorithm; and a leakage current reduction circuit. A 0.18-µm CMOS ME circuit has been fabricated using these techniques. At a clock frequency of 160 MHz and a VDD of 1.25 V, PAT decreased to 75.9 µW, which is 5.35% of that of a conventional ME circuit. PST also decreased to 0.82 nW, which is 3.93% of that of the conventional ME circuit.
Fast Multiple Reference Frame Selection Method Using Correlation of Sequence in JVT/H.264
Jae-Sik SOHN Duk-Gyoo KIM

LETTER-Image/Vision Processing

Vol: E89-A No: 3
Page(s): 744-746
The main reasons why the H.264 video coding standard performs significantly better than other standards are the adoption of variable block sizes, multiple reference frames, and the consideration of rate-distortion optimization within the codec. However, these features considerably increase the complexity of motion estimation in the encoder. For multiple reference frame motion estimation, the computation increases in proportion to the number of searched reference frames. In this paper, a fast multiple reference frame selection method is proposed for H.264 video coding. The proposed algorithm can efficiently determine the best reference frame from the five allowed reference frames. By determining the number of reference frames to search using the correlation of the difference block between the block of the current frame and that of the previous frame, this scheme efficiently reduces the computational cost while maintaining similar quality and bit-rate. Simulation results show that the proposed method is faster than the original scheme adopted in the JVT reference software JM95 while maintaining similar video quality and bit-rate.
Module-Wise Dynamic Voltage and Frequency Scaling for a 90 nm H.264/MPEG-4 Codec LSI
Yukihito OOWAKI Shinichiro SHIRATAKE Toshihide FUJIYOSHI Mototsugu HAMADA Fumitoshi HATORI Masami MURAKATA Masafumi TAKAHASHI

INVITED PAPER

Vol: E89-C No: 3
Page(s): 263-270
The module-wise dynamic voltage and frequency scaling (MDVFS) scheme is applied to a single-chip H.264/MPEG-4 audio/visual codec LSI. The power consumption of the target module with controlled supply voltage and frequency is reduced by 40% in comparison with operation without voltage or frequency scaling. The chip consumes 63 mW when simultaneously decoding QVGA H.264 video at 15 fps and MPEG-4 AAC LC audio. The LSI keeps operating continuously even during the voltage transition of the target module, thanks to a newly developed dynamic de-skewing system (DDS) that watches and controls the clock edge of the target module.
Foveation Based Error Resilience Optimization for H.264 Intra Coded Frame in Wireless Communication
Yu CHEN XuDong ZHANG DeSheng WANG

LETTER-Multimedia Systems for Communications

Vol: E89-B No: 2
Page(s): 633-636
Based on the observation that foveation analysis can be used to find the most critical content in video and images in terms of human visual perception, an effective error resilience method is proposed for robust transmission of H.264 intra-coded frames over wireless channels. The method first exploits the results of foveation analysis to find the foveated area in a picture, and then considers the results of pre-error concealment effect analysis to search for the center of foveation macro-blocks (MBs) in the foveated area. Finally, a new error-resilient alignment order and a new coding order of MBs are proposed for use in the video encoder. Extensive experimental results on different portrait video sequences over a random bit error wireless channel demonstrate that the proposed method achieves better subjective and objective results than the original JM 8.2 H.264 video codec, with little effect on coding rate and image quality.
Subjective Quality Assessment of the H.264/AVC In-Loop De-Blocking Filter (Open Access)
Matthew D. BROTHERTON Damien BAYART David S. HANDS

INVITED PAPER

Vol: E89-B No: 2
Page(s): 273-280
Next generation codecs, benchmarked by the H.264/AVC standard, are providing substantial compression efficiency for the coding and transmission of video. Coupled with technologies offering larger transmission bandwidths over DSL, wireless and satellite networks, the capability of delivering high quality video services to the home is now a reality. The perceptual quality of the content delivered over communications networks will be crucial in ensuring a first-class customer experience. It is therefore important to assess the advantages and disadvantages of the optional features offered by next generation codecs. This paper describes a subjective assessment that was carried out to investigate the perceptual effects of switching the in-loop de-blocking filter within the H.264/AVC codec on or off. Although the filter is believed to substantially improve the perceptual quality of video, it has been suggested that in some cases negative perceptual effects can be produced. The H.264/AVC architecture allows de-blocking to be switched off in cases where there are limited processing resources or where it is considered that a negative perceptual effect may be introduced. This paper describes a study that examined the perceptual effects of de-blocking by employing a standardised subjective assessment methodology. The Absolute Category Rating (ACR) method was used to capture Difference Mean Opinion Scores (DMOS) for a range of video. Content was selected to span a wide and representative range of coding complexity. This content was then encoded at a variety of bit-rates to represent high, medium and low qualities. Results were used to examine the end-user perception of video quality when the de-blocking filter is switched on or off. The experimental design allowed the overall effects of the de-blocking filter to be examined and, additionally, the influence of content and quality on the filter performance.
The experiment found that the performance of the de-blocking filter was content-dependent. Results were used to discuss the advantages and disadvantages of in-loop de-blocking and there is an examination of content properties (e.g. spatial and temporal complexity) that influence the performance of de-blocking.
Adaptive Search Range Decision and Early Termination for Multiple Reference Frame Motion Estimation for H.264
Gwo-Long LI Mei-Juan CHEN

LETTER-Multimedia Systems for Communications

Vol: E89-B No: 1
Page(s): 250-253
The newest video coding standard, H.264, provides considerable performance improvement over a wide range of bit rates and video resolutions compared to previous standards. However, its features result in an extraordinary increase in encoder complexity, mainly with regard to mode decision and multiple reference frame motion estimation (ME). This letter presents two algorithms to reduce the computational complexity caused by motion estimation. The adaptive search range decision method dynamically determines the search range size according to the motion vector predictor, and the early termination scheme defines a criterion for terminating the search over multiple reference frames early. Experimental results show that the proposed algorithms significantly improve coding speed with negligible objective quality degradation compared to the fast motion estimation algorithms adopted by the reference software.
High Performance Adaptive Deblocking Filter for H.264
Yu-Ching CHU Mei-Juan CHEN

LETTER-Image Processing and Video Processing

Vol: E89-D No: 1
Page(s): 367-371
The deblocking filter in H.264 is an efficient tool for reducing blocking artifacts, but it also blurs details or leaves perceptible blocking artifacts in some high-activity areas. In this paper, we improve the filtered pixel classification and filtering schemes used by the deblocking filter in H.264 to keep the sharpness of real edges and minimize over-smoothing.
Efficient Motion Estimation Using a Modified Early Termination Algorithm in H.264
Sung-Eun KIM Jong-Ki HAN

PAPER-Image Processing and Video Processing

Vol: E88-D No: 7
Page(s): 1707-1715
In the H.264 video coding standard, 7 modes {16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4} are used to enhance the coding efficiency. Motion vector estimation with 7 modes may require a huge amount of computing time. Thus, several efficient ME schemes have been proposed to reduce the complexity of the ME module in H.264. In this paper, we propose an ME scheme using a modified early termination technique to speed up the motion vector estimation procedure while maintaining high image quality. We demonstrate the effectiveness of the proposed method by computer simulation. In the simulation results, the CPU time consumed by the proposed scheme is much less than that of the conventional scheme, while the encoded video quality remains unchanged. This is because the proposed scheme searches MVs from the smallest block mode to the largest block mode and utilizes the correlation between neighboring MVs. Furthermore, the proposed ME scheme can bypass to the next mode when the MVs of a mode are highly correlated with each other, whereas the conventional schemes cannot skip to other modes.
A Highly Parallel Architecture for Deblocking Filter in H.264/AVC
Lingfeng LI Satoshi GOTO Takeshi IKENAGA

PAPER-Parallel and/or Distributed Processing Systems

Vol: E88-D No: 7
Page(s): 1623-1629
This paper presents a highly parallel architecture for the deblocking filter in H.264/AVC. We adopt various parallel schemes in the memory sub-system and datapath. A 2-dimensional parallel memory scheme is employed to support efficient parallel access in both horizontal and vertical directions in order to speed up the whole filtering process. This parallel memory also eliminates the need for a transpose circuit. In the datapath, an algorithm optimization is performed to implement parallel filtering with hardware reuse. Pipeline techniques are also adopted to improve the throughput of filtering operations. Our design is implemented in TSMC 0.18 µm technology. Results show that the core size is 0.82 × 1.13 mm² when the maximum frequency is 230 MHz. Compared to other existing architectures, our design has advantages in both speed and area.
An Efficient Matrix-Based 2-D DCT Splitter and Merger for SIMD Instructions
Yuh-Jue CHUANG Ja-Ling WU

PAPER-Image Processing and Multimedia Systems

Vol: E88-D No: 7
Page(s): 1569-1577
Recent microprocessors have included SIMD (single instruction multiple data) extensions in their instruction set architecture to improve the performance of multimedia applications. SIMD instructions speed up the execution of programs but pose many challenges to software developers. This paper presents an efficient matrix-based splitter (or merger), specialized for SIMD architectures, which can split an N×N 2-D DCT block into four N/2×N/2 or two N×N/2 (or N/2×N) 2-D DCT blocks (or merge small blocks into a large one). The programming-level complexity of the proposed methods is lower than that of the direct approach. Furthermore, even without using SIMD instructions, the algorithmic-level complexity of the proposed DCT splitter/merger is still lower than that of the direct one and is the same as that of the most efficient approach existing in the literature. When N = 8, our method can be applied as a transcoder between the latest video coding standard, AVC/H.264, and older ones such as MPEG-1, MPEG-2 and MPEG-4 part 2. We also provide image quality tests to show the performance of the proposed 2-D DCT splitter and merger.
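The idea of splitting a DCT block entirely in the transform domain can be sketched with plain matrix algebra. The example below is a straightforward illustration and does not reproduce the paper's optimized or SIMD-oriented factorization. It assumes the orthonormal DCT-II, and all function names are hypothetical. With C_n the n×n DCT matrix and S_i the operator that selects the i-th half of the rows, each sub-block DCT is Y_ij = A_i X A_j^T, where A_i = C_{n/2} S_i C_n^T.

```python
import math

def dct_matrix(n):
    """Return the orthonormal n x n DCT-II matrix C_n."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)]
            for k in range(n)]

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(block):
    """2-D DCT of a square pixel block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def split_matrices(n):
    """Return [A_0, A_1] with A_i = C_{n/2} * S_i * C_n^T (each n/2 x n)."""
    h = n // 2
    c_n_t = transpose(dct_matrix(n))
    c_h = dct_matrix(h)
    # S_i * C_n^T is simply rows i*h .. (i+1)*h - 1 of C_n^T.
    return [matmul(c_h, c_n_t[i * h:(i + 1) * h]) for i in range(2)]

def split_dct(big):
    """Split an n x n DCT block into a 2 x 2 grid of n/2 x n/2 DCT blocks
    without going back to the pixel domain."""
    a = split_matrices(len(big))
    return [[matmul(matmul(a[i], big), transpose(a[j])) for j in range(2)]
            for i in range(2)]
```

The matrices A_0 and A_1 depend only on N, so they can be precomputed once; the merger is the transposed operation, because the stacked matrix [A_0; A_1] is orthogonal.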
Quantization/DCT Conversion Scheme for DCT-Domain MPEG-2 to H.264/AVC Transcoding
Joo-Kyong LEE Ki-Dong CHUNG

PAPER

Vol: E88-B No: 7
Page(s): 2856-2863
The latest video coding standard, H.264/AVC, adopts a 4×4 approximate transform instead of the 8×8 discrete cosine transform (DCT) to avoid the inverse transform mismatch problem. However, that is only one of the factors that make it difficult to transcode video content pre-coded with the previous standards to H.264/AVC in the common domain without resorting to cascaded pixel-domain transcoding. In this paper, to support the existing DCT-domain transcoding schemes and to reduce computational complexity, we propose an efficient algorithm that converts a quantized 8×8 DCT block into four newly quantized 4×4 transformed blocks. The experimental results show that the proposed scheme reduces computational complexity by 5-11% and improves video quality by 0.1-0.5 dB compared with the cascaded pixel-domain transcoding scheme that exploits inverse quantization (IQ), inverse DCT (IDCT), DCT, and re-quantization (re-Q).
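For readers unfamiliar with the 4×4 transform mentioned above, the following minimal sketch applies the forward core transform of H.264/AVC, W = Cf · X · Cf^T, using integer arithmetic only. It shows the transform itself, not the paper's conversion algorithm. The scaling factors that the standard folds into quantization are omitted, and the function name is hypothetical.

```python
# Forward 4x4 core transform matrix of H.264/AVC (integer approximation of the DCT).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def forward_core_transform(block):
    """Return W = CF * block * CF^T for a 4x4 block of integers.

    Only integer additions and multiplications are used, so an encoder and a
    decoder computing the matching inverse obtain bit-identical results.
    """
    # tmp = CF * block
    tmp = [[sum(CF[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    # W = tmp * CF^T, where (CF^T)[k][j] == CF[j][k]
    return [[sum(tmp[i][k] * CF[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Usage: a flat block has energy only in the DC coefficient.
flat = [[3] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)  # coeffs[0][0] == 48, all others 0
```

Because every intermediate value is an integer, no rounding differences can arise between implementations, which is the mismatch problem that floating-point 8×8 DCT implementations face.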
Self-Adaptive Algorithmic/Architectural Design for Real-Time, Low-Power Video Systems
Luca FANUCCI Sergio SAPONARA Massimiliano MELANI Pierangelo TERRENI

PAPER-Adaptive Signal Processing

Vol: E88-D No: 7
Page(s): 1538-1545
With reference to video motion estimation in the framework of the new H.264/AVC video coding standard, this paper presents algorithmic and architectural solutions for the implementation of context-aware coprocessors in real-time, low-power embedded systems. A low-complexity context-aware controller is added to a conventional Full Search (FS) motion estimation engine. While the FS coprocessor is working, the context-aware controller extracts information related to the input signal statistics from the intermediate processing results in order to automatically configure the coprocessor itself in terms of search area size and number of reference frames; thus unnecessary computations and memory accesses can be avoided. The achieved complexity saving factor ranges from 2.2 to 25, depending on the input signal, while motion estimation accuracy remains unaltered. The increased efficiency is exploited both for (i) processing time reduction in the case of software implementation on a programmable platform and (ii) power consumption reduction in the case of dedicated hardware implementation in CMOS technology.
A Low-Power Systolic Array Architecture for Block-Matching Motion Estimation
Junichi MIYAKOSHI Yuichiro MURACHI Koji HAMANO Tetsuro MATSUNO Masayuki MIYAMA Masahiko YOSHIMOTO

PAPER-Digital

Vol: E88-C No: 4
Page(s): 559-569
This paper proposes a low-power systolic array architecture for a block-matching motion estimation processor IP for portable and high-resolution video applications. The architecture features a ring-connected processing element (PE) array that reduces both computation cycles and memory access cycles at the same time, enabling lower power consumption. The low number of memory access cycles allows concurrent operation of a half-pel processing unit with no extra cache. Furthermore, the architecture allows various summation schemes for absolute difference values. For that reason, it is applicable to various video coding modes such as the adaptive field/frame mode in MPEG2 and the multiple macroblock mode in H.264. When the architecture is introduced into the design of an MPEG2 MP@HL motion estimation processor VLSI, the power consumption of the VLSI is reduced by 45-73% in comparison to cases with conventional architectures for motion estimation.
Fast Macroblock Mode Determination to Reduce H.264 Complexity
Ki-Hun HAN Yung-Lyul LEE

LETTER-Image

Vol: E88-A No: 3
Page(s): 800-804
The rate-distortion optimization (RDO) method is an informative technology that improves the coding efficiency, but increases the computational complexity, of the H.264 encoder. In this letter, a fast macroblock mode determination algorithm is proposed to reduce the computational complexity of the H.264 encoder. The proposed method reduces the encoder complexity by 55%, while maintaining the same level of coding efficiency.
Fast Reference Frame Selection Method for Motion Estimation in JVT/H.264
Ching-Ting HSU Hung-Ju LI Mei-Juan CHEN

LETTER-Terminals for Communications

Vol: E87-B No: 12
Page(s): 3827-3830
The three main reasons why the new H.264 (MPEG-4 AVC) video coding standard performs significantly better than other standards are the adoption of variable block sizes, multiple reference frames, and the consideration of rate-distortion optimization within the codec. However, these features incur a considerable increase in encoder complexity. For multiple reference frame motion estimation, the computation increases in proportion to the number of searched reference frames. In this paper, a fast multi-frame selection method is proposed for H.264 video coding. The proposed scheme can efficiently determine the best reference frame from the five allowed reference frames. Simulation results show that the proposed method is more than twice as fast as the original scheme adopted in the JVT reference software JM73 while maintaining similar video quality and bit-rate.
Rate Distortion Optimized Coding Mode Selection for H.264/AVC in Wireless Environments
Wei ZHANG Yuanhua ZHOU

LETTER-Multimedia Systems

Vol: E87-B No: 7
Page(s): 2057-2060
A flexible and robust rate-distortion optimization algorithm is presented for selecting the macroblock coding mode for H.264/AVC transmission over wireless channels subject to burst errors. A two-state Markov model is used to describe the burst errors at the packet level. Using feedback information from the receiver and an estimate of the channel errors, the algorithm analyzes the distortion of the reconstructed macroblock at the decoder due to channel errors and spatial and temporal error propagation. The optimal coding mode is chosen for each macroblock in a rate-distortion (R-D)-based framework. Experimental results using the H.264/AVC test model show significant resilience to burst errors.
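A two-state Markov model of packet-level burst errors can be sketched as follows. This is a generic Gilbert-style model written under assumptions, not the paper's exact formulation: packets are delivered in the Good state and lost in the Bad state, and the transition probabilities `p_gb` (Good to Bad) and `p_bg` (Bad to Good) as well as the function names are hypothetical.

```python
import random

def simulate_packet_loss(n, p_gb, p_bg, seed=0):
    """Simulate n packets through a two-state Markov channel.

    Returns a list of booleans, True meaning the packet was lost (Bad state).
    """
    rng = random.Random(seed)
    bad = False  # assume the channel starts in the Good state
    trace = []
    for _ in range(n):
        if bad:
            bad = rng.random() >= p_bg  # stay Bad unless the channel recovers
        else:
            bad = rng.random() < p_gb   # enter Bad with probability p_gb
        trace.append(bad)
    return trace

def stationary_loss_rate(p_gb, p_bg):
    """Long-run fraction of lost packets: p_gb / (p_gb + p_bg)."""
    return p_gb / (p_gb + p_bg)

def mean_burst_length(p_bg):
    """Expected number of consecutive lost packets in a burst: 1 / p_bg."""
    return 1.0 / p_bg

# Usage: 10% average loss arriving in bursts of about 2.2 packets.
trace = simulate_packet_loss(200_000, p_gb=0.05, p_bg=0.45)
observed = sum(trace) / len(trace)
```

The two parameters allow the average loss rate and the burstiness to be set independently, which a memoryless loss model cannot do.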
