Dohyeon PARK Jinho LEE Jung-Won KANG Jae-Gon KIM
The emerging Versatile Video Coding (VVC) standard currently adopts Triangular Partitioning Mode (TPM) to enable more flexible inter prediction. The motion search and motion storage required by TPM significantly increase the complexity of both the encoder and the decoder. This letter proposes two simplifications of TPM that reduce the complexity of the current design. The first reduces the number of motion vector combinations to be checked for the two partitions; it decreases encoding time by 4% with negligible BD-rate loss. The second removes the reference picture remapping process from the motion vector storage of TPM; it reduces the complexity of the encoder and decoder without any BD-rate change in the random-access configuration.
Chan-Hee HAN Si-Woong LEE Hamid GHOLAMHOSSEINI Yun-Ho KO
In this paper, side information refinement methods for a Wyner-Ziv video codec are presented. In the proposed methods, each block of a Wyner-Ziv frame is separated into a predefined number of groups, and these groups are interleaved for coding. The side information for the first group is generated by motion-compensated temporal interpolation using adjacent key frames only. The side information for the remaining groups is then gradually refined using the already decoded signal of the current Wyner-Ziv frame. Based on this concept, two progressive side information refinement methods are proposed. One is the band-wise side information refinement (BW-SIR) method, which is based on transform-domain interleaving; the other is the field-wise side information refinement (FW-SIR) method, which is based on pixel-domain interleaving. Simulation results show that the proposed methods improve both the quality of the side information and the rate-distortion performance compared with conventional side information refinement methods.
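The grouping step described above can be illustrated with a minimal sketch. The abstract does not specify the interleaving pattern, so the row-modulo assignment below is an assumption, and the function names `split_into_groups` and `merge_groups` are hypothetical; this is not the authors' FW-SIR implementation.

```python
def split_into_groups(block, num_groups):
    """Assign row r of `block` to group r % num_groups.

    The row-modulo pattern is an assumed pixel-domain interleaving,
    chosen only for illustration.
    """
    return [block[g::num_groups] for g in range(num_groups)]


def merge_groups(groups):
    """Inverse of split_into_groups: re-interleave the rows into one block."""
    num_groups = len(groups)
    total_rows = sum(len(rows) for rows in groups)
    block = [None] * total_rows
    for g, rows in enumerate(groups):
        block[g::num_groups] = rows
    return block


# A 4x4 block of sample values, split into two "fields".
block = [[r * 4 + c for c in range(4)] for r in range(4)]
groups = split_into_groups(block, 2)
# Group 0 would be decoded first from interpolated side information only;
# group 1 could then use the decoded rows of group 0 to refine its own.
restored = merge_groups(groups)
```

In such a scheme every row of a later group is adjacent to rows that have already been decoded, which is what would make the progressive refinement possible.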
This paper describes the architecture of an integrated platform developed to improve the development efficiency of system LSIs built into digital consumer electronics equipment such as flat-panel TVs and optical disc recorders. These system LSIs provide the principal functions of this equipment. The key is to build a common interface between the software layers, with the system LSI located at the lowest layer. To make this possible, the hardware architecture of the system LSI is divided into five blocks according to main functionality. In addition, a middleware layer is placed over the operating system to make it easier to port old applications and develop new applications in the higher layer. Based on this platform, a system LSI called UniPhier™ has been developed and used in 156 product families of digital consumer electronics equipment (as of December 2008).
Seongmo PARK Miyoung LEE KyoungSeon SHIN Hanjin CHO Jongdae KIM Dukdong LEE
In this paper, we present the design of an MPEG-4 video codec chip that reduces power consumption using frame-level and macroblock-level clock gating and a motion estimation skip scheme. It operates in codec (encoding and decoding) mode at 30 frames/s in quarter common intermediate format (QCIF) at 27 MHz. Power consumption is 290 mW at 27 MHz, a 35% power saving compared with a conventional CMOS design. The motion estimation skip method reduces the computational load by 32%. The chip supports MPEG-4 Simple Profile Level 2 (Simple@L2) and H.263 base mode. It contains 388,885 gates and 662 kbits of memory, and the chip size is 9.7 mm × 9.7 mm, fabricated using 0.35-micron, 3-layer-metal CMOS technology.
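The figures quoted above imply a per-macroblock cycle budget and a baseline power level, which the following sketch works out. It assumes the standard QCIF resolution of 176×144 and 16×16 macroblocks, and it assumes the 35% saving is measured against the same chip without the power-saving schemes; the function names are hypothetical.

```python
def macroblocks_per_frame(width, height, mb_size=16):
    """Number of mb_size x mb_size macroblocks in one frame."""
    return (width // mb_size) * (height // mb_size)


def cycles_per_macroblock(clock_hz, width, height, fps):
    """Clock cycles available per macroblock at the given frame rate."""
    return clock_hz / (macroblocks_per_frame(width, height) * fps)


def implied_baseline_mw(power_mw, saving_fraction):
    """Baseline power implied by a measured power and a fractional saving."""
    return power_mw / (1.0 - saving_fraction)


mbs = macroblocks_per_frame(176, 144)                   # 99 macroblocks per QCIF frame
budget = cycles_per_macroblock(27_000_000, 176, 144, 30)  # about 9091 cycles
baseline = implied_baseline_mw(290.0, 0.35)             # about 446 mW
print(mbs, round(budget), round(baseline))
```

In codec mode the budget of roughly 9,091 cycles per macroblock would be shared between encoding and decoding.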
Discussed here is reduction of power dissipation for multi-media LSIs. First, both active power dissipation Pat and stand-by power dissipation Pst for both CMOS LSIs and GaAs LSIs are summarized. Then, general technologies for reducing Pat are discussed. Also reviewed are a wide variety of approaches (i.e., parallel and pipeline schemes, Chen's fast DCT algorithms, hierarchical search scheme for motion vectors, etc.) for reduction of Pat. The last part of the paper focuses on reduction of Pst. Reducing both Pat and Pst requires that both throughput and active chip areas be either maintained or improved.
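The abstract does not state a power model. As a rough illustration of why parallel schemes can lower Pat while keeping throughput, the sketch below assumes the standard first-order CMOS relation Pat ≈ α·C·V²·f; all parameter values, including the 0.6 voltage scaling factor, are hypothetical.

```python
def active_power(alpha, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS active (switching) power: alpha * C * Vdd^2 * f."""
    return alpha * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical single-datapath design.
single = active_power(alpha=0.5, capacitance_f=1e-9, vdd_v=3.3, freq_hz=27e6)

# Two parallel datapaths: twice the switched capacitance, each unit clocked
# at half the frequency, so throughput is unchanged. The slower clock is
# assumed to permit a supply voltage scaled to 0.6 of the original.
parallel = active_power(alpha=0.5, capacitance_f=2e-9, vdd_v=0.6 * 3.3,
                        freq_hz=27e6 / 2)

ratio = parallel / single  # 2 * 0.5 * 0.6**2 = 0.36
```

Under these assumptions the parallel design dissipates about 36% of the original active power, at the cost of a larger active chip area.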
Discussed here is progress achieved in the development of video codec LSIs. First, the amount of computation for various standards, and the signal handling capability (throughput) and power dissipation of video codec LSIs, are described. Then, general technologies for improving throughput are briefly summarized. The paper also reviews three approaches (i.e., video signal processor, building block, and monolithic codecs) for implementing video codec standards. The second half of the paper discusses various high-throughput technologies developed for programmable Video Signal Processor (VSP) LSIs. A number of VSP LSIs are introduced, including the world's first programmable VSP, developed in February 1987, and a monolithic codec chip, built in February 1993, that is sufficient in itself for the construction of a video encoder for encoding full-CIF data at 30 frames per second. Technologies for reducing power dissipation while maintaining throughput are also discussed.
Yoshinori TAKEUCHI Zhao-Chen HUANG Masatomo SAEKI Hiroaki KUNIEDA
This paper introduces RHINE (Reconfigurable Hierarchical Image Neo-multiprocessor Engine), a new application-specific multiprocessor architecture for moving picture CODECs. Array processors are known to be well suited to data-parallel processing such as image signal processing, which requires a vast amount of computation and applies identical instruction sequences to the data. However, when a moving picture CODEC algorithm runs on multiprocessors that each handle a separate sub-image, it suffers from a large load imbalance. Load balancing techniques are therefore indispensable for achieving the highest speed-up in such applications. RHINE provides one of the optimal solutions for this load balancing through its self-reconfigurable architecture. RHINE is composed hierarchically of Block Processing Units (BPUs), each of which has a common-bus multiprocessor architecture with a block memory. Processors in one BPU move to another BPU according to the load imbalance between BPUs by switching the bus connections between them. The advantage of the RHINE architecture is demonstrated through performance simulations on real moving pictures.
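The idea of moving processors between BPUs according to load can be sketched as follows. The abstract does not describe RHINE's actual reconfiguration policy, so the greedy rule, the load model (finish time = load / processor count), and the function name `rebalance` are all assumptions made for illustration.

```python
def rebalance(loads, procs):
    """Greedily move processors from lightly loaded BPUs to the slowest BPU.

    loads[i] is the work assigned to BPU i (its sub-image), procs[i] its
    processor count (at least 1). A move is made only while it lowers the
    finish time of the slowest BPU. Returns the new processor counts.
    """
    procs = list(procs)

    def finish_time(i):
        return loads[i] / procs[i]

    while True:
        slowest = max(range(len(loads)), key=finish_time)
        # A donor BPU must keep at least one processor.
        donors = [i for i in range(len(loads))
                  if i != slowest and procs[i] > 1]
        if not donors:
            return procs
        donor = min(donors, key=finish_time)
        before = finish_time(slowest)
        after = max(loads[slowest] / (procs[slowest] + 1),
                    loads[donor] / (procs[donor] - 1))
        if after >= before:
            return procs  # no further improvement possible
        procs[donor] -= 1
        procs[slowest] += 1


# Three BPUs with four processors each; the first sub-image is much busier.
print(rebalance([120, 40, 40], [4, 4, 4]))  # prints [6, 3, 3]
```

In this example the slowest BPU's finish time drops from 30 to 20 time units while the total processor count stays at 12.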