1-10hit |
Mengmeng ZHANG Chuan ZHOU Jizheng XU
The High efficiency video coding (HEVC) standard defines two in-loop filters to improve the objective and subjective quality of the reconstructed frames. Through analyzing the effectiveness of the in-loop filters, it is noted that band offset (BO) process achieves much more coding gains for text region which mostly employ intra block copy (IntraBC) prediction mode. The intraBC prediction process in HEVC is performed by using the already reconstructed region for block matching, which is similar to motion compensation. If BO process is applied after one coding tree unit (CTU) encoded, the distortion between original and reconstructed samples copied by the IntraBC prediction will be further reduced, which is simple to operate and can obtain good coding efficiency. Experimental results show that the proposed scheme achieves up to 3.4% BD-rate reduction in All-intra (AI) for screen content sequences with encoding and decoding time no increase.
Muchen LI Jinjia ZHOU Dajiang ZHOU Xiao PENG Satoshi GOTO
As the successive video compression standard of H.264/AVC, High Efficiency Video Codec (HEVC) will play an important role in video coding area. In the deblocking filter part, HEVC inherits the basic property of H.264/AVC and gives some new features. Based on this variation, this paper introduces a novel dual-mode deblocking filter architecture which could support both of the HEVC and H.264/AVC standards. For HEVC standard, the proposed symmetric unified-cross unit (SUCU) based filtering scheme greatly reduces the design complexity. As a result, processing a 1616 block needs 24 clock cycles. For H.264/AVC standard, it takes 48 clock cycles for a 1616 macro-block (MB). In synthesis result, the proposed architecture occupies 41.6k equivalent gate count at frequency of 200 MHz in SMIC 65 nm library, which could satisfy the throughput requirement of super hi-vision (SHV) on 60 fps. With filter reusing scheme, the universal design for the two standards saves 30% gate counts than the dedicated ones in filter part. In addition, the total power consumption could be reduced by 57.2% with skipping mode when the edges need not be filtered.
Luong Pham VAN Hoyoung LEE Jaehwan KIM Byeungwoo JEON
Blocking artifacts are introduced in many block-based coding systems, and its reduction can significantly improve the subjective quality of compressed video. The H.264/AVC uses an in-loop deblocking filter to remove the blocking artifacts. The filter considers some coding conditions in its adaptive deblocking filtering such as coded block pattern (CBP), motion vector, macroblock type, etc. for inter-predicted blocks, however, it does not consider much for intra-coded blocks. In this paper, we utilize the human visual system (HVS) characteristic and the local characteristic of image blocks to modify the boundary strength (BS) of the intra-deblocking filter in order to gain improvement in the subjective quality and also to reduce the complexity in filtering intra coded slices. In addition, we propose a low-complexity deblocking method which utilizes the correlation between vertical and horizontal boundaries of a block in inter coded slices. Experimental results show that our proposed method achieves not only significant gain in the subjective quality but also some PSNR gain, and reduces the computational complexity of the deblocking filter by 36.23% on average.
Weiwei SHEN Yibo FAN Xiaoyang ZENG
In this paper, a high-throughput debloking filter is presented for H.264/AVC standard, catering video applications with 4 K2 K (40962304) ultra-definition resolution. In order to strengthen the parallelism without simply increasing the area, we propose a luma-chroma parallel method. Meanwhile, this work reduces the number of processing cycles, the amount of external memory traffic and the working frequency, by using triple four-stage pipeline filters and a luma-chroma interlaced sequence. Furthermore, it eliminates most unnecessary off-chip memory bandwidth with a highly reusable memory scheme, and adopts a “slide window” buffer scheme. As a result, our design can support 4 K2 K at 30 fps applications at the working frequency of only 70.8 MHz.
Nasharuddin ZAINAL Toshihisa TANAKA Yukihiko YAMASHITA
We propose a moving picture coding by lapped transform and an edge adaptive deblocking filter to reduce the blocking distortion. We apply subband coding (SBC) with lapped transform (LT) and zero pruning set partitioning in hierarchical trees (zpSPIHT) to encode the difference picture. Effective coding using zpSPIHT was achieved by quantizing and pruning the quantized zeros. The blocking distortion caused by block motion compensated prediction is reduced by an edge adaptive deblocking filter. Since the original edges can be detected precisely at the reference picture, an edge adaptive deblocking filter on the predicted picture is very effective. Experimental results show that blocking distortion has been visually reduced at very low bit rate coding and better PSNRs of about 1.0 dB was achieved.
In this paper, we propose memory and performance optimized architecture to accelerate the operation speed of adaptive deblocking filter for H.264/JVT/AVC video coding. The proposed deblocking filter executes loading/storing and filtering operations with only 192 cycles for 1 macroblock. Only 244 internal buffers and 3216 internal SRAM are adopted for the buffering operation of deblocking filter with I/O bandwidth of 32 bit. The proposed architecture can process the filtering operation for 1 macroblock with less filtering cycles and lower memory sizes than some conventional approaches of realizing deblocking filter. The efficient hardware architecture is implemented with novel data arrangement, hybrid filter scheduling and minimum number of buffer. The proposed architecture is suitable for low cost and real-time applications, and the real-time decoding with 1080HD (19201088@30 fps) can be easily achieved when working frequency is 70 MHz.
In this paper, we study and analyze the computational complexity of deblocking filter in H.264/AVC baseline decoder based on SimpleScalar/ARM simulator. The simulation result shows that the memory reference, content activity check operations, and filter operations are known to be very time consuming in the decoder of this new video coding standard. In order to improve overall system performance, we propose a novel processing order with efficient VLSI architecture which simultaneously processes the horizontal filtering of vertical edge and vertical filtering of horizontal edge. As a result, the memory performance of the proposed architecture is improved by four times when compared to the software implementation. Moreover, the system performance of our design significantly outperforms the previous proposals.
We devised an efficient architecture of deblocking filter and implemented the circuit with 15,400 logic gates and a 16032 dual-port SRAM using 0.25 µm standard cell technology. This circuit can process 88 image frames with 1,280720 pixels per second at 166 MHz. Our circuit requires smaller number of accesses to the external memory than other approaches and hence causes less bus traffic in the SoC design platform.
The deblocking filter in H.264 is an efficient tool to reduce blocking artifact, but it also blurs the details or retains blocking artifact perceptible in some high-activity areas. In this paper, we improve the filtered pixel classification and filtering schemes used by the deblocking filter in H.264 to keep the sharpeness of real edges and minimize over-smoothing.
Lingfeng LI Satoshi GOTO Takeshi IKENAGA
This paper presents a highly parallel architecture for deblocking filter in H.264/AVC. We adopt various parallel schemes in memory sub-system and datapath. A 2-dimensional parallel memory scheme is employed to support efficient parallel access in both horizontal and vertical directions in order to speed up the whole filtering process. This parallel memory also eliminates the need for a transpose circuit. In the datapath, an algorithm optimization is performed to implement parallel filtering with hardware reuse. Pipeline techniques are also adopted to improve the throughput of filtering operations. Our design is implemented under TSMC 0.18 µm technology. Results show that the core size is 0.821.13 mm2 when the maximum frequency is 230 MHz. Compared to other existing architectures, our design has advantages in both speed and area.