1-4hit |
Shen LI Lingfeng LI Takeshi IKENAGA Shunichi ISHIWATA Masataka MATSUI Satoshi GOTO
The coexistence of MPEG-2 and its powerful successor H.264/AVC has created a huge need for MPEG-2/H.264 video transcoding. However, a traditional transcoder where an MPEG-2 decoder is simply cascaded to an H.264 encoder requires huge computational power due to the adoption of a complicated rate-distortion based mode decision process in H.264. This paper proposes a 2-D Sobel filter based motion vector domain method and a DCT domain method to measure macroblock complexity and realize content-based H.264 candidate mode decision. A new local edge based fast INTRA prediction mode decision method is also adopted to boost the encoding efficiency. Simulation results confirm that with the proposed methods the computational burden of a traditional transcoder can be reduced by 20%30% with only a negligible bit-rate increase for a wide range of video sequences.
Hiroyuki HARA Masataka MATSUI Goichi OTOMO Katsuhiro SETA Takayasu SAKURAI
Special memory and embedded memories used in a newly designed MPEG2 decorder LSI are described. Orthogonal memory, which has a functionality of parallel-to-serial transposition, is employed in a IDCT(Inverse Discrete Cosine Transform) block for small area and low-power. The orthogonal memory realizes the special pupose with 50% of the area and the power compared with using flip-flop array. FIFO's and other dual-port memories are designed by using a single-port RAM operated twice in one clock cycle to reduce cost. Flip-Flop cell is one of the important memory elements in the MPEG environment, and is also improved for the low-cost optimizing functionality for video processing. The area and power of the fabricated MPEG2 decoder chip are reduced by 20% using these techniques. As for testability, direct test mode is implemented for small area. An instruction RAM is placed outside the pad area in parallel to a normal instruction ROM and activated by Al-masterslice for extensive debugging and an early sampling. Other memory related techniques and the key features of the decoder LSI are also described.
Shen LI Takeshi IKENAGA Hideki TAKEDA Masataka MATSUI Satoshi GOTO
Power efficiency and real-time processing capability are two major issues in today's mobile video applications. We proposed a novel Motion Estimation (ME) engine for power-efficient real-time MPEG-4 video coding based on our previously proposed content-based ME algorithm [8],[13]. By adopting Full Search (FS) and Three Step Search (TSS) alternatively according to the nature of video contents, this algorithm keeps the visual quality very close to that of FS with only 3% of its computational power. We designed a flexible Block Matching (BM) Unit with 16-PE SIMD data path so that the adaptive ME can be performed at a much lower clock frequency and hardware cost as compared with previous FS based work. To reduce the energy cost caused by excessive external memory access, on-chip SRAM is also utilized and optimized for parallel processing in the BM Unit. The ME engine is fabricated with TSMC 0.18 µm technology. When processing QCIF (15 fps) video, the estimated power is 2.88 mW@4.16 MHz (supply voltage: 1.62 V). It is believed to be a favorable contribution to the video encoder LSI design for mobile applications.
Yasuo UNEKAWA Tsuguo KOBAYASHI Tsukasa SHIROTORI Yukihiro FUJIMOTO Takayoshi SHIMAZAWA Kazutaka NOGAMI Takehiko NAKAO Kazuhiro SAWADA Masataka MATSUI Takayasu SAKURAI Man Kit TANG William A. HUFFMAN
A 4-way set associative TagRAM with 1.189-Mb capacity has been developed which can handle a secondary cache system of up to 16 Mbytes. A 9-ns cycle operation and clock to Dout of 4.7 ns are achieved by use of circuit techniques such as a pipelined decoding scheme, a single PMOS load BiCMOS main decoder, a BiCMOS sense-amplifying comparator, doubly placed self-timed write circuits, and highly linear VCO for a PLL. The device is successfully implemented with 0.7-µm double polysilicon double-metal BiCMOS technology.