Hidenori SATO Hiroaki MATSUDA Akira ONOZAWA
This paper presents a clock routing technique called the Balanced-Mesh Method (BMM), which incorporates the advantages of two well-known conventional clock-routing techniques. One is the balanced-tree method (BTM), where the clock net is routed as a tree so that the delay times of the clock signal are balanced, and the other is the fixed-mesh method (FMM), where the clock net is routed as a fixed mesh driven by a large buffer. In BMM, the clock net is routed as a set of relatively small meshes of interconnects driven by relatively small buffers. Each mesh covers an area called a Mesh-Routing Region (MR) in which its delay and skew can be suppressed within a certain range. These small meshes are connected by a balanced tree with the chip clock source as its root. To implement BMM, we developed an MR-partitioning program that partitions the circuit into MRs according to a set of pre-determined constraints on the number of flip-flops and the area in each MR, and a clock-global-routing program that provides each mesh routing and the tree routing connecting the meshes. We applied BMM to the design of an MPEG2-encoder LSI and achieved a skew of 210 ps. In addition, the experimental results show that BMM yields the lowest power dissipation compared to the conventional methods.
Discussed here is reduction of power dissipation for multi-media LSIs. First, both active power dissipation Pat and stand-by power dissipation Pst for both CMOS LSIs and GaAs LSIs are summarized. Then, general technologies for reducing Pat are discussed. Also reviewed are a wide variety of approaches (i.e., parallel and pipeline schemes, Chen's fast DCT algorithms, hierarchical search scheme for motion vectors, etc.) for reduction of Pat. The last part of the paper focuses on reduction of Pst. Reducing both Pat and Pst requires that both throughput and active chip areas be either maintained or improved.
Thomas S. HUANG James W. STROMING Yi KANG Ricardo LOPEZ
Research in very low bit-rate coding has made significant advances in the past few years. Most recently, the introduction of the MPEG-4 proposal has motivated a wide variety of approaches aimed at achieving a new level of video compression. In this paper we review progress in very low bit-rate video (VLBV) coding, categorized into three main areas: (1) waveform coding, (2) 2D content-based coding, and (3) model-based coding. Where appropriate, we also describe proposals to the MPEG-4 committee in each of these areas.
Takao ONOYE Gen FUJITA Masamichi TAKATSU Isao SHIRAKAWA Nariyoshi YAMAI
A single-chip motion estimator dedicated to MPEG2 MP@HL moving pictures is described. By adopting a two-level hierarchical search algorithm for detecting motion vectors, the computational load is reduced to 1/70 of that of the conventional algorithm. A novel mechanism is introduced into the full-search procedure, which attempts the maximum possible reuse of reference pixels in order to reduce the bandwidth of the frame memory interface. The proposed motion estimator is integrated in a 0.6 µm triple-metal CMOS chip, which contains 1,450 K transistors on a 12.7 × 13.7 mm² die. The input clock rate can reach 133 MHz, which enables real-time motion estimation for MPEG2 MP@HL.
John LAUDERDALE Danny H. K. TSANG
This paper presents the system issues involved with the transmission of pre-encoded VBR MPEG video using CBR service. Conventional wisdom suggests that lossless delivery of VBR video using CBR service requires bandwidth to be reserved at the peak rate, resulting in low bandwidth utilization. We calculate the minimum rate at which bandwidth must be reserved on a network in order to provide continuous playback of an MPEG encoded video bitstream. Simulation results using the frame size traces from several pre-encoded MPEG bitstreams and several buffer sizes demonstrate that this minimum reservation rate is much lower than the peak rate when a relatively small playback buffer size is used, resulting in much higher bandwidth utilization. Procedures for performing connection setup and lossless real-time video playback between the video server and the client are outlined. Methods for incorporating VCR-like features such as pause and fast forward/reverse for Video-on-Demand (VoD) applications are presented.
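The idea of a minimum reservation rate can be illustrated with a small simulation. The sketch below is not the paper's calculation; it assumes a simplified model in which the client prefetches until its buffer is full, removes one whole frame per frame period, and the server sends at a constant rate whenever the buffer has room. The function names and the bits-per-frame-period rate unit are illustrative choices.

```python
def plays_without_underflow(frame_bits, rate, buffer_bits):
    """Simulate playback; rate is in bits per frame period (assumed model)."""
    total = sum(frame_bits)
    level = min(buffer_bits, total)      # prefetch until the buffer is full
    sent = level
    for size in frame_bits:
        if size > level:                 # frame not fully received: underflow
            return False
        level -= size                    # decoder removes the whole frame
        # The server fills the buffer during the next frame period, limited by
        # the reserved rate, the free buffer space and the data left to send.
        received = min(rate, buffer_bits - level, total - sent)
        level += received
        sent += received
    return True


def min_reservation_rate(frame_bits, buffer_bits):
    """Smallest integer rate (bits per frame period) that never underflows."""
    peak = max(frame_bits)
    if buffer_bits < peak:
        raise ValueError("buffer must hold at least the largest frame")
    lo, hi = 0, peak                     # the peak rate is always sufficient
    while lo < hi:                       # feasibility is monotonic in the rate
        mid = (lo + hi) // 2
        if plays_without_underflow(frame_bits, mid, buffer_bits):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

Calling `min_reservation_rate(trace, buffer_bits)` on a frame-size trace and comparing the result with `max(trace)` gives the kind of peak-versus-minimum comparison the abstract reports, under the stated assumptions only.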
Young Tae HAN Jong-Seog KOH Soon Hong KWON Sun Kook YOO Dae Hee YOUN
This paper presents a design of an MPEG-2 layer 2 audio decoder that decodes signals of 5 channels (left, right, center, left surround, and right surround). It is backwards compatible with the MPEG-1 decoder, and its left/right channel coding supports stereo, dual channel, and single channel modes. The joint stereo mode supports four extension modes including Dolby Pro Logic decoding. In addition, the proposed system supports channel switching, dynamic crosstalk, and phantom coding modes to recover multichannel signals, and supports most options described in ISO/IEC 13818-3. The system has been implemented using FPGAs (Field Programmable Gate Arrays), which are easily mapped to an ASIC, and has been shown to work as an MPEG-2 multichannel audio decoder.
Hiroyuki HARA Masataka MATSUI Goichi OTOMO Katsuhiro SETA Takayasu SAKURAI
Special memory and embedded memories used in a newly designed MPEG2 decoder LSI are described. Orthogonal memory, which provides parallel-to-serial transposition, is employed in an IDCT (Inverse Discrete Cosine Transform) block for small area and low power. The orthogonal memory serves this special purpose with 50% of the area and power of a flip-flop array. FIFOs and other dual-port memories are designed using a single-port RAM operated twice in one clock cycle to reduce cost. The flip-flop cell is one of the important memory elements in the MPEG environment, and is also improved for low cost by optimizing its functionality for video processing. The area and power of the fabricated MPEG2 decoder chip are reduced by 20% using these techniques. As for testability, a direct test mode is implemented for small area. An instruction RAM is placed outside the pad area in parallel to a normal instruction ROM and activated by Al-masterslice for extensive debugging and early sampling. Other memory-related techniques and the key features of the decoder LSI are also described.
Hyun Duk CHO Sun CHOI Kyoung Won LIM Seong Deuk KIM Jong Beom RA
A region-based adaptive perceptual quantization technique is proposed for video sequence coding, and applied to the MPEG coder. The visibility of coding artifacts in a macroblock (MB) is affected by perceptual characteristics of neighboring MBs as well as the MB itself. Therefore, spatial and temporal activities of the MB and its surroundings are used to decide the quantization scaling factor. In comparison with the adaptive scheme in the encoding algorithm specified in MPEG-2 Test Model 5 (TM5), the proposed scheme is shown to further improve perceptual quality in video coding.
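For orientation, the TM5 baseline mentioned above scales the quantizer of each macroblock by a normalized spatial activity. The sketch below shows that normalization as it is commonly stated for TM5, followed by a purely hypothetical region rule (taking the smallest activity in the 3x3 neighbourhood) to indicate where surrounding MBs could enter. The region rule, the function names and the grid layout are assumptions for illustration and are not the rule proposed in the paper.

```python
def normalized_activity(act, avg_act):
    """TM5-style normalization: maps activity into roughly 0.5 .. 2.0."""
    return (2 * act + avg_act) / (act + 2 * avg_act)


def mquant(q_ref, n_act):
    """Scale the reference quantizer and clip it to the 1..31 range."""
    return max(1, min(31, round(q_ref * n_act)))


def neighbourhood_activity(acts, row, col):
    """Hypothetical region rule (assumption): smallest activity among the
    macroblock and its up-to-eight neighbours, so that a busy MB next to a
    flat one is quantized as finely as the flat one."""
    rows, cols = len(acts), len(acts[0])
    return min(acts[r][c]
               for r in range(max(0, row - 1), min(rows, row + 2))
               for c in range(max(0, col - 1), min(cols, col + 2)))
```

A macroblock whose activity equals the picture average keeps the reference quantizer; flat regions get a finer quantizer and busy regions a coarser one.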
Discussed here is progress achieved in the development of video codec LSIs. First, the amount of computation for various standards, and signal handling capability (throughput) and power dissipation for video codec LSIs are described. Then, general technologies for improving throughput are briefly summarized. The paper also reviews three approaches (i.e., video signal processor, building block and monolithic codec) for implementing video codec standards. The second half of the paper discusses various high-throughput technologies developed for programmable Video Signal Processor (VSP) LSIs. A number of VSP LSIs are introduced, including the world's first programmable VSP, developed in February 1987, and a monolithic codec chip, built in February 1993, that is sufficient in itself for the construction of a video encoder for encoding full-CIF data at 30 frames per second. Technologies for reducing power dissipation while maintaining throughput are also discussed.
Takao ONOYE Toshihiro MASAKI Yasuo MORIMOTO Yoh SATO Isao SHIRAKAWA Kenji MATSUMURA
A single-chip MPEG2 MP@HL video decoder has been developed, which consists mainly of specific functional units and macroblock-level pipeline buffers. A new organization is also devised for a set of off-chip frame memories and the interfaces associated with it. Owing to sophisticated I/O interfaces among functional units, the macroblock-level pipeline in conjunction with different decoding facilities attains a throughput high enough to decode HDTV images in real time. Moreover, a set of these functional units, pipeline buffers, and frame memory interfaces, together with a sequence controller, is integrated for the first time in a single chip, which has a total area of 8.8 × 9.2 mm² in a 0.6 µm triple-metal CMOS technology, and dissipates 1.2 W from a single 3.3 V supply.
Masahiko YOSHIMOTO Shin-ichi NAKAGAWA Tetsuya MATSUMURA Kazuya ISHIHARA Shin-ichi URAMOTO
This paper gives an overview of several design issues and solutions for the realization of MPEG2 encoder & decoder LSIs. ULSI technology and video-coding-specific design have made it possible to realize an MPEG2 encoder & decoder LSI with real-time capability, flexibility and cost effectiveness, though MPEG2 processing at MP@ML (Main Profile at Main Level) requires an enormous computation power of 10-200 GOPS depending on the motion estimation algorithm and the search range. Video coding processors, whose performance has been enhanced at the rate of one order of magnitude per 3 years, have reached the performance level required to implement MPEG2 encoding using a multiple-chip configuration. This has been achieved by a hybrid architecture with a video-oriented RISC and a hardware engine optimized for coding algorithms. Intensive circuit optimization was carried out for transform coding such as DCT and predictive coding with motion estimation. Cost-effective MPEG2 decoders have now begun to penetrate the multimedia market. There are two main design issues. One is the architectural and circuit design, which minimizes the silicon area and power dissipation. The other is external DRAM control, which makes efficient use of DRAM storage and bandwidth to reduce the system cost. Future trends in the deep submicron era will also be discussed. A single-chip MPEG2 MP@ML encoder is expected to appear in the 0.25 micron era at the latest. An MPEG2 MP@ML decoder could be compressed to an area of about 25 mm².
Naoya HAYASHI Toshiaki KITSUKI Ichiro TAMITANI Hideki HONMA Yasushi OOI Takashi MIYAZAKI Katsunari OOBUCHI
A motion compensation LSI for real-time MPEG1/H.261 video encoding has been developed. This LSI employs a compact motion estimator that consists of vector search array processors. Furthermore, an efficient motion vector search strategy that enables bidirectional searches with a -16.0/+15.5 pels range is adopted to maintain encoded picture quality. The adopted strategy takes two steps. The first step is the full search for 2-pel precision vectors within the range of 16 pels. A 4-to-1 sub-sampling technique with a low pass filter is employed in this step. The second step is the full search for half-pel precision vectors within a 1.0 pels search range centered on the location pointed to by the best 2-pel precision vectors. This strategy is compared with the exhaustive-search strategy. It is shown that the number of operations and external memory access cycles are reduced to 1/11 and 1/2, respectively, while differences of the signal to noise ratios obtained by simulation are within 0.2 dB. Those reductions contribute to lowering power dissipation. The array processors calculate the values of distortion. They accumulate the absolute differences between current and reference data with a feedback loop to keep the number of processor elements equal to the number of pels in a row of the current block. Multiple reference data buses and a delay line in the feedback loop have been introduced for efficient calculation. In addition, cascade connection of the array processors is studied to shorten calculation periods. This LSI controls the input frame reordering buffers and the reference frame buffers. It generates the prediction and the prediction error blocks as well as the motion vectors. The AC power of current blocks and the values of distortion are obtained for the bit rate control. This LSI is fabricated using 0.8 µm 2-level metal CMOS technology and dissipates 2.0 W from a 5 V supply at 36 MHz.
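The two-step strategy described above can be sketched in software as follows. This is a minimal illustration under several assumptions, not the LSI's datapath: a 2x2 mean stands in for the unspecified low-pass filter, half-pel samples are formed by bilinear averaging, pixels outside the frame are clamped to the border, and the exact -16.0/+15.5 range limits are not reproduced. All function names are illustrative.

```python
import random


def px(img, y, x):
    """Read a pixel, clamping coordinates to the image border (assumption)."""
    y = min(max(y, 0), len(img) - 1)
    x = min(max(x, 0), len(img[0]) - 1)
    return img[y][x]


def downsample(img):
    """4-to-1 sub-sampling; a 2x2 mean stands in for the low-pass filter."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


def half_pel(img, y2, x2):
    """Bilinear sample; y2 and x2 are coordinates in half-pel units."""
    y0, x0 = y2 // 2, x2 // 2
    y1, x1 = y0 + (y2 & 1), x0 + (x2 & 1)
    return (px(img, y0, x0) + px(img, y0, x1)
            + px(img, y1, x0) + px(img, y1, x1)) / 4


def two_step_search(cur, ref, top, left, size=8, search=16):
    """Return the (dy, dx) motion vector in pels with half-pel precision."""
    # Step 1: full search on the sub-sampled frames (2-pel precision).
    cur_s, ref_s = downsample(cur), downsample(ref)
    t, l, n, r = top // 2, left // 2, size // 2, search // 2

    def coarse_cost(v):
        return sum(abs(px(cur_s, t + i, l + j)
                       - px(ref_s, t + v[0] + i, l + v[1] + j))
                   for i in range(n) for j in range(n))

    coarse = min(((vy, vx) for vy in range(-r, r) for vx in range(-r, r)),
                 key=coarse_cost)

    # Step 2: full search at half-pel precision, +/-1.0 pel around step 1.
    cy2, cx2 = 4 * coarse[0], 4 * coarse[1]   # 2-pel vector in half-pel units

    def fine_cost(v):
        return sum(abs(px(cur, top + i, left + j)
                       - half_pel(ref, 2 * (top + i) + v[0], 2 * (left + j) + v[1]))
                   for i in range(size) for j in range(size))

    fine = min(((cy2 + dy, cx2 + dx)
                for dy in range(-2, 3) for dx in range(-2, 3)), key=fine_cost)
    return fine[0] / 2, fine[1] / 2


def demo():
    """Search a synthetic frame pair whose true displacement is (4, -2) pels."""
    rng = random.Random(0)
    ref = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
    cur = [[px(ref, y + 4, x - 2) for x in range(48)] for y in range(48)]
    return two_step_search(cur, ref, 16, 16)
```

The coarse step evaluates each candidate on a quarter of the pixels and visits only every second displacement in each direction, which is where most of the saving over an exhaustive half-pel search comes from.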
Shin-ichi URAMOTO Akihiko TAKABATAKE Takashi HASHIMOTO Jun TAKEDA Gen-ichi TANAKA Tsuyoshi YAMADA Yukio KODAMA Atsushi MAEDA Toshiaki SHIMADA Shun-ichi SEKIGUCHI Tokumichi MURAKAMI Masahiko YOSHIMOTO
An MPEG2 video decoder LSI fully compliant with MPEG2 main profile at main level is described. The video decoder LSI is a single-chip solution which can implement MPEG2 video decoding with conventional DRAMs. The LSI features an architecture based on dedicated decoding hardware so as to gain the necessary computational power for real-time processing of ITU-R R.601 size video. The variable length decoder (VLD), owing to our "one symbol decoding in one cycle" policy and a special circuit for detecting unique start-codes, achieved bitstream decoding up to 18 Mbps with a normal decoding process. It also realized fast searching for the next start-code in the picture skipping and error recovery processes. The video decoder LSI also features a hierarchical and adaptive control mechanism. This control mechanism decreases the dead time of the decoding circuits and raises the efficiency of data transfer via the local DRAM port. It also contributes to the realization of error concealment and error recovery processes. This chip is capable of processing NTSC-resolution video encoded in MPEG2 MP@ML in real time at 27 MHz operation. The chip integrates about 1200 K transistors using 0.5 µm double-metal CMOS technology. The hardware-based architecture results in low power dissipation, and the chip consumes 1.4 W of power at a 3.3 V supply voltage and is housed in a plastic QFP.
Kiyoshi MIURA Hideki KOYANAGI Hiroshi SUMIHIRO Seiichi EMOTO Nozomu OZAKI Toshiro ISHIKAWA
This paper describes a 600 mW single-chip MPEG2 video decoder, implemented in a 0.5 µm triple-metal CMOS technology, which operates with a 3.3-volt power supply. To achieve low power consumption, a low-power dual-port RAM has been developed utilizing a selective bit line precharge scheme to reduce bit line current, which is suitable for use in the bit-slice array commonly found in parametric ASIC RAM macro modules. This architecture and a non-DC current sense amp make the RAM's read power consumption one-third of that of a conventional dual-port RAM. Various techniques, such as a multiple-clock architecture and a system clock independent from the display clock, make the system clock frequency as low as possible. The video decoder has a syntax parser, so that it can handle the higher syntactic elements of MPEG2 bit streams without any host processor and decode Main profile at Main level MPEG2 bit streams.
Jean-Lien C. WU Yen-Wen CHEN Kuo-Chih JIANG
In this paper, two models are proposed for the simulation of MPEG video sources in ATM networks. The projected autoregressive (PAR) model is based on the autoregressive (AR) model compensated by a projection function. The projection function is capable of adjusting the histogram generated by the AR model so that it better fits the histogram obtained from real data. The state transition (ST) model is developed on the basis of recording the variation of frame size in a video sequence. Each state denotes the size of a frame, and the number of states depends on the degree of correlation between frames. Our results show that the histogram generated by the ST model is almost identical to that of the real data, and that the PAR model performs better in capturing the autocorrelation of real data. When compared with other models, both models fit the complex histogram curve well, which was not achieved by the AR model, while preserving the correlation characteristics. A heuristic search algorithm is also proposed to make our modelling processes more efficient.
Video compression technologies such as MPEG have enabled the efficient use of video data in the computer environment. However, compressed video information still involves a huge amount of data compared with other media such as text, audio, and graphics. Therefore, it is very important to handle the video information in a networked database for the efficient use of resources like storage media. Furthermore, in the networked database, its retrieval methods, including search and delivery, become the key issues, especially for video information, which requires a large network bandwidth. In this paper, a video browsing method using automatic fast scene cut detection for networked video database access is described. The scene cut is defined as the scene change frame and is detected by temporal change in interframe luminance difference and chrominance correlation, which are obtained from a spatio-temporally scaled image directly extracted from the MPEG compressed video without any complex video decoding. The detected scene change frames are further investigated to exploit the relationship between the scene cuts and are classified in order to make a hierarchical index. These detection results are stored as a scene index file using the MPEG format. Simulation results are also presented for several test video sequences to show that these methods enable efficient video database construction and access.
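The luminance part of such a detector can be illustrated with a short sketch. It assumes each frame is already available as a small luminance image (for example a spatially scaled picture), flags a cut where the interframe difference stands out from both of its temporal neighbours by more than a threshold, and leaves out the chrominance correlation test entirely. The peak rule, the threshold and the function names are assumptions, not the paper's criteria.

```python
def frame_difference(a, b):
    """Mean absolute luminance difference between two equally sized frames."""
    count = len(a) * len(a[0])
    return sum(abs(pa - pb)
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b)) / count


def detect_cuts(frames, threshold):
    """Return indices of frames that start a new scene (assumed peak rule)."""
    diffs = [frame_difference(frames[i - 1], frames[i])
             for i in range(1, len(frames))]
    cuts = []
    for k, d in enumerate(diffs):
        before = diffs[k - 1] if k > 0 else 0.0
        after = diffs[k + 1] if k + 1 < len(diffs) else 0.0
        # A cut is an isolated spike: large relative to both neighbours, so
        # steady motion or a gradual fade does not trigger it.
        if d - max(before, after) > threshold:
            cuts.append(k + 1)
    return cuts
```

Comparing each difference with its neighbours rather than with a fixed level is one simple way to use the temporal change of the difference instead of its absolute size.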
This paper describes the general conditions for perfect signal reconstruction in adaptive blocksize MDCT. MDCT, or modified Discrete Cosine Transform, is a method in which blocks are laid to overlap each other. Because of block overlapping, some consideration must be paid to reconstructing the signals perfectly in adaptive blocksize schemes. The perfect reconstruction conditions are derived by considering the reconstruction signals, on a segment by segment basis. These conditions restrict the analysis/synthesis windows in the MDCT formula. Finally, this paper evaluates two examples of window sets, including windows used in the ISO MPEG audio coding standard.
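For the fixed-blocksize case, the reconstruction condition on the windows reduces to the familiar requirement that w[n]² + w[n+N]² = 1 for a symmetric window used in both analysis and synthesis. The sketch below checks this numerically with a sine window and overlap-add; it does not cover the adaptive-blocksize conditions derived in the paper, and the direct O(N²) transforms, the 2/N scaling convention and the function names are choices made for illustration.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, N):
    """Direct MDCT: 2N windowed samples in, N coefficients out."""
    return [sum(block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs, N):
    """Direct inverse MDCT: N coefficients in, 2N aliased samples out."""
    return [2 / N * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                        for k in range(N))
            for n in range(2 * N)]


def analysis_synthesis(signal, N):
    """Window, transform, inverse transform, window again and overlap-add."""
    w = sine_window(N)
    tail = (-len(signal)) % N                 # pad to a whole number of hops
    padded = [0.0] * N + list(signal) + [0.0] * (tail + N)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        rec = imdct(mdct(block, N), N)
        for n in range(2 * N):
            out[start + n] += rec[n] * w[n]   # aliasing cancels between blocks
    return out[N:N + len(signal)]
```

Each block on its own returns a time-aliased signal; the aliasing terms of adjacent blocks have opposite signs and cancel in the overlap-add once the window condition holds.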
To describe the state of visual communications in the U.S., two words come to mind: digital and anticipation. Although compressed, digital video has been used in teleconferencing systems for at least ten years, it is only recently that a broad consensus has developed among diverse industries anticipating business opportunities, value, or both in digital video. The drivers for this turning point are: advances in digital signal processing, continued improvement in the cost, complexity, and speed of VLSI, maturing international standards and their adoption by vendors and end users, and a seemingly insatiable consumer demand for greater diversity, accessibility, and control of communication systems.