
Keyword Search Result

[Keyword] h.264(137hit)

101-120hit(137hit)

  • Fast Intra-Mode Prediction Algorithm in H.264/AVC Video Coding

    Jong-Ho KIM  Byung-Gyu KIM  Chang-Sik CHO  

     
    LETTER-Image Processing and Video Processing

    Vol: E90-D  No: 8  Page(s): 1320-1323

    A fast intra-mode decision algorithm is proposed on the basis of an inter-mode block type for inter-frames (P-slices). Each macroblock (MB) type has its own intra prediction modes (I16MB and 8×8 chroma: 4 modes, I4MB and I8MB: 9 modes). This procedure creates a large computational complexity in addition to the inter mode decision procedure. In most cases, there is a high correlation between the best inter-mode block type and the direction of the texture edge or object boundary. Therefore, only a small number of intra-prediction modes are chosen to determine the best intra mode based on this correlation. We experimentally verify that the proposed scheme can significantly improve the speed of the overall encoding time with a negligible loss of image quality and a minimal bit increase. The average loss in PSNR was -0.012 to 0.036 dB and the bit increment was approximately -0.194 to 0.751%.

  • Efficient Motion Estimation for H.264 Codec by Using Effective Scan Ordering

    Jeongae PARK  Misun YOON  Hyunchul SHIN  

     
    LETTER-Devices/Circuits for Communications

    Vol: E90-B  No: 7  Page(s): 1839-1843

    Motion estimation (ME) is a computation-intensive procedure in H.264. In ME for variable block sizes, an effective scan ordering method has been devised for early termination of absolute difference computation when the termination does not affect the performance. The new ME circuit with effective scan ordering can reduce the amount of computation by 70% compared to JM8.2 and by 30% compared to the disable approximation unit (DAU) approach.
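The early-termination idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' circuit or their scan ordering: it assumes a plain row-by-row scan and hypothetical helper names (`sad_with_early_exit`, `best_match`), and only shows why abandoning a candidate once its partial sum reaches the current best cannot change the search result.

```python
def sad_with_early_exit(block, candidate, best_so_far):
    """Sum of absolute differences, abandoned once it cannot beat best_so_far.

    Returns the full SAD, or None if the candidate was rejected early.
    The row-by-row scan order is an assumption made for illustration.
    """
    total = 0
    for row_a, row_b in zip(block, candidate):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
        # The partial sum only grows, so a candidate whose partial sum
        # already reaches the best SAD can never win.
        if total >= best_so_far:
            return None
    return total


def best_match(block, candidates):
    """Return (index, SAD) of the candidate with the smallest SAD."""
    best_index, best_sad = None, float("inf")
    for index, candidate in enumerate(candidates):
        sad = sad_with_early_exit(block, candidate, best_sad)
        if sad is not None:
            best_index, best_sad = index, sad
    return best_index, best_sad


block = [[1, 2], [3, 4]]
candidates = [
    [[1, 2], [3, 5]],  # SAD 1
    [[9, 9], [9, 9]],  # rejected after the first row
    [[1, 2], [3, 4]],  # SAD 0
]
print(best_match(block, candidates))
```

How many additions are skipped depends on the order in which pixels are visited, which is the quantity a scan ordering method would tune.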

  • Adaptive Scanning Using Pixel Similarity for H.264/AVC

    Dae-Yeon KIM  Dong-Kyun KIM  Yung-Lyul LEE  

     
    LETTER-Image

    Vol: E90-A  No: 5  Page(s): 1112-1114

    In H.264/AVC, the quantized coefficients are scanned in a zigzag pattern. However, zigzag scanning is not always efficient for the directional spatial predictions in the intra coding of H.264/AVC. In this letter, we propose an adaptive scanning using the pixel similarity of the neighboring pixels to achieve enhanced intra coding performance. The proposed method reduces the bit rate by approximately 2% compared with H.264/AVC without video quality degradation.
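As a point of reference for the baseline that this abstract starts from, the conventional zigzag scan of a square coefficient block can be sketched as follows. The function names are hypothetical, and the adaptive, similarity-driven scan proposed in the letter is not reproduced here.

```python
def zigzag_order(n):
    """Return the raster indices of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column indexes one anti-diagonal
        coords = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # coords runs top-right to bottom-left; even diagonals are
        # traversed the other way, from bottom-left up to top-right.
        if s % 2 == 0:
            coords.reverse()
        order.extend(r * n + c for r, c in coords)
    return order


def scan(block, order):
    """Flatten a 2-D block into a 1-D list following the given order."""
    flat = [value for row in block for value in row]
    return [flat[i] for i in order]


print(zigzag_order(4))
```

For a 4×4 block this yields the raster indices 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15. An adaptive scheme would swap in a different `order` per block while leaving `scan` unchanged.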

  • Frame-Level ρ-Domain R-D Optimization in H.264

    Yutao DONG  Xiangzhong FANG  Jing YANG  

     
    LETTER-Image Processing and Video Processing

    Vol: E90-D  No: 5  Page(s): 872-876

    The frame-level R-D optimization in H.264 is very important in video storage scenarios. Among all of the sub-optimal algorithms, a greedy iteration algorithm (GIA) can best lower the computational complexity of frame-level R-D optimization. In order to further lower the computational complexity, a ρ-domain frame-level R-D optimization algorithm is proposed in this letter. Different from GIA, every frame's rate and distortion can be estimated accurately without actual encoding in our proposed algorithm. Simulation results show that our proposed algorithm can lower the computational complexity greatly with negligible variation in peak signal-to-noise ratio (PSNR) compared with GIA.

  • Lossless VLSI Oriented Full Computation Reusing Algorithm for H.264/AVC Fractional Motion Estimation

    Ming SHAO  Zhenyu LIU  Satoshi GOTO  Takeshi IKENAGA  

     
    PAPER

    Vol: E90-A  No: 4  Page(s): 756-763

    Fractional Motion Estimation (FME) is an advanced feature adopted in the H.264/AVC video compression standard with quarter-pixel accuracy. Although FME can achieve considerably higher encoding efficiency, sub-pixel interpolation and sum of absolute transformed difference (SATD) computation, the main parts of FME, considerably increase the computational complexity. To reduce the complexity of FME, this paper proposes a full computation reusable VLSI oriented algorithm. Through exploiting the similarity among motion vectors (MVs) of partitions in the same macroblock (MB), temporary computation results can be fully reused. Furthermore, a simple and effective searching method is adopted to make the proposed method more suitable for VLSI implementation. Experiment results show that up to 80% add operations and 85% internal reference frame memory access operations are saved without any degradation in the coding quality.

  • Implementations of Low-Cost Hardware Sharing Architectures for Fast 8×8 and 4×4 Integer Transforms in H.264/AVC

    Chih-Peng FAN  Yu-Lian LIN  

     
    LETTER-Digital Signal Processing

    Vol: E90-A  No: 2  Page(s): 511-516

    In this paper, novel hardware sharing architectures are proposed for realizations of fast 4×4 and 8×8 forward/inverse integer transforms in H.264/AVC applications. Based on matrix factorizations, the cost-effective architectures for fast one-dimensional (1-D) 4×4 and 8×8 forward/inverse integer transforms can be derived through the Kronecker and direct sum operations. By applying the concept of hardware sharing, the proposed hardware schemes for fast integer transforms need a smaller number of shifters and adders than the direct realization architecture, where the direct architecture just implements the individual 4×4 and individual 8×8 integer transforms independently. With low hardware cost and regular modularity, the proposed hardware sharing architectures can process up to 125 MHz with the cost-effective area and are suitable for VLSI implementations to accomplish the H.264/AVC signal processing.
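To make the "shifters and adders" remark concrete, here is a minimal Python sketch of the well-known 4×4 forward core transform of H.264/AVC, computed once with a shift-and-add butterfly and once as the plain matrix product Cf·X·Cfᵀ. It is not the shared 4×4/8×8 architecture of the paper; the function names are hypothetical, and the scaling normally folded into quantization is omitted.

```python
# Core matrix of the 4x4 forward integer transform (scaling omitted).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def butterfly_1d(x):
    """1-D 4-point transform using only additions and one-bit shifts."""
    s03, d03 = x[0] + x[3], x[0] - x[3]
    s12, d12 = x[1] + x[2], x[1] - x[2]
    return [s03 + s12, (d03 << 1) + d12, s03 - s12, d03 - (d12 << 1)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_4x4(block):
    """2-D transform: 1-D butterfly on every row, then on every column."""
    rows = [butterfly_1d(row) for row in block]
    cols = [butterfly_1d(col) for col in transpose(rows)]
    return transpose(cols)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def forward_4x4_reference(block):
    """Direct evaluation of Cf * X * Cf^T, used only as a cross-check."""
    return matmul(matmul(CF, block), transpose(CF))


sample = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
print(forward_4x4(sample) == forward_4x4_reference(sample))
```

The butterfly needs 8 additions and 2 shifts per 1-D pass, which is the kind of operator count a hardware-sharing design aims to reduce further across block sizes.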

  • Content-Based Complexity Reduction Methods for MPEG-2 to H.264 Transcoding

    Shen LI  Lingfeng LI  Takeshi IKENAGA  Shunichi ISHIWATA  Masataka MATSUI  Satoshi GOTO  

     
    PAPER

    Vol: E90-D  No: 1  Page(s): 90-98

    The coexistence of MPEG-2 and its powerful successor H.264/AVC has created a huge need for MPEG-2/H.264 video transcoding. However, a traditional transcoder where an MPEG-2 decoder is simply cascaded to an H.264 encoder requires huge computational power due to the adoption of a complicated rate-distortion based mode decision process in H.264. This paper proposes a 2-D Sobel filter based motion vector domain method and a DCT domain method to measure macroblock complexity and realize content-based H.264 candidate mode decision. A new local edge based fast INTRA prediction mode decision method is also adopted to boost the encoding efficiency. Simulation results confirm that with the proposed methods the computational burden of a traditional transcoder can be reduced by 20% to 30% with only a negligible bit-rate increase for a wide range of video sequences.

  • An Efficient Pipeline Architecture for Deblocking Filter in H.264/AVC

    Chung-Ming CHEN  Chung-Ho CHEN  

     
    PAPER

    Vol: E90-D  No: 1  Page(s): 99-107

    In this paper, we study and analyze the computational complexity of deblocking filter in H.264/AVC baseline decoder based on SimpleScalar/ARM simulator. The simulation result shows that the memory reference, content activity check operations, and filter operations are known to be very time consuming in the decoder of this new video coding standard. In order to improve overall system performance, we propose a novel processing order with efficient VLSI architecture which simultaneously processes the horizontal filtering of vertical edge and vertical filtering of horizontal edge. As a result, the memory performance of the proposed architecture is improved by four times when compared to the software implementation. Moreover, the system performance of our design significantly outperforms the previous proposals.

  • A Fine-Grain Scalable and Low Memory Cost Variable Block Size Motion Estimation Architecture for H.264/AVC

    Zhenyu LIU  Yang SONG  Takeshi IKENAGA  Satoshi GOTO  

     
    PAPER-Integrated Electronics

    Vol: E89-C  No: 12  Page(s): 1928-1936

    One full search variable block size motion estimation (VBSME) architecture with integer pixel accuracy is proposed in this paper. This proposed architecture has the following features: (1) Through widening data path from the search area memories, m processing element groups (PEG) could be scheduled to work in parallel and fully utilized, where m is a factor of sixteen. Each PEG has sixteen processing elements (PE) and just costs 8.5K gates. This feature provides users more flexibility to make tradeoff between the hardware cost and the performance. (2) Based on pipelining and multi-cycle data path techniques, this architecture can work at high clock frequency. (3) The memory partition number is greatly reduced. When sixteen PEGs are adopted, only two memory partitions are required for the search area data storage. Therefore, both the system hardware cost and power consumption can be saved. A 16-PEG design with 48×32 search range has been implemented with TSMC 0.18 µm CMOS technology. In typical work conditions, its maximum clock frequency is 261 MHz. Compared with the previous 2-D architecture [9], about 13.4% hardware cost and 5.7% power consumption can be saved.

  • Fast 2-Dimensional 8×8 Integer Transform Algorithm Design for H.264/AVC Fidelity Range Extensions

    Chih-Peng FAN  

     
    LETTER-Image Processing and Video Processing

    Vol: E89-D  No: 12  Page(s): 3006-3011

    In this letter, efficient two-dimensional (2-D) fast algorithms for realizations of 8×8 forward and inverse integer transforms in H.264/AVC fidelity range extensions (FRExt) are proposed. Based on matrix factorizations with Kronecker product and direct sum operations, efficient fast 2-D 8×8 forward and inverse integer transforms can be derived from the one-dimensional (1-D) fast 8×8 forward and inverse integer transforms through matrix operations. The proposed fast 2-D 8×8 forward and inverse integer transform designs don't require transpose memory in hardware realizations. The fast 2-D 8×8 integer transforms require fewer latency delays and provide a larger throughput rate than the row-column based method. With regular modularity, the proposed fast algorithms are suitable for VLSI implementations to achieve H.264/AVC FRExt high-profile signal processing.

  • A 50% Power Reduction in H.264/AVC HDTV Video Decoder LSI by Dynamic Voltage Scaling in Elastic Pipeline

    Kentaro KAWAKAMI  Jun TAKEMURA  Mitsuhiko KURODA  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO  

     
    PAPER-VLSI Architecture

    Vol: E89-A  No: 12  Page(s): 3642-3651

    We propose an elastic pipeline that can apply dynamic voltage scaling (DVS) to hardwired logic circuits. In order to demonstrate its feasibility, a hardwired H.264/AVC HDTV decoder is designed as a real-time application. An entropy decoding process is divided into context-based adaptive binary arithmetic coding (CABAC) and syntax element decoding (SED), which has advantages of smoothing workload for CABAC and keeping efficiency of the elastic pipeline. An operating frequency and supply voltage are dynamically modulated every slot depending on workload of H.264 decoding to minimize power. We optimize the number of slots per frame to enhance power reduction. The proposed decoder achieves a power reduction of 50% in a 90-nm process technology, compared to the conventional clock-gating scheme.

  • A Low-Cost CAVLC Encoder

    Pei-Yin CHEN  Yi-Ming LIN  

     
    LETTER-Electronic Circuits

    Vol: E89-C  No: 12  Page(s): 1950-1953

    In H.264, the context-based adaptive variable length coding (CAVLC) is used for lossless compression. Direct table-lookup implementation requires higher cost because it employs a larger memory to produce the encoded results. In this letter, we present a more efficient technique for CAVLC implementation. Compared with those previous CAVLC chips, our design requires the lowest hardware cost.

  • A VLSI Architecture for Variable Block Size Motion Estimation in H.264/AVC with Low Cost Memory Organization

    Yang SONG  Zhenyu LIU  Takeshi IKENAGA  Satoshi GOTO  

     
    PAPER-VLSI Architecture

    Vol: E89-A  No: 12  Page(s): 3594-3601

    A one-dimensional (1-D) full search variable block size motion estimation (VBSME) architecture is presented in this paper. By properly choosing the partial sum of absolute differences (SAD) registers and scheduling the addition operations, the architecture can be implemented with simple control logic and regular workflow. Moreover, only one single-port SRAM is used to store the search area data. The design is realized in TSMC 0.18 µm 1P6M technology with a hardware cost of 67.6K gates. In typical working conditions (1.8 V, 25 °C), a clock frequency of 266 MHz can be achieved.

  • A Sub-mW H.264 Baseline-Profile Motion Estimation Processor Core with a VLSI-Oriented Block Partitioning Strategy and SIMD/Systolic-Array Architecture

    Junichi MIYAKOSHI  Yuichiro MURACHI  Tetsuro MATSUNO  Masaki HAMAMOTO  Takahiro IINUMA  Tomokazu ISHIHARA  Hiroshi KAWAGUCHI  Masayuki MIYAMA  Masahiko YOSHIMOTO  

     
    PAPER-VLSI Architecture

    Vol: E89-A  No: 12  Page(s): 3623-3633

    We propose a sub-mW H.264 baseline-profile motion estimation processor for portable video applications. It features a VLSI-oriented block partitioning strategy and low-power SIMD/systolic-array datapath architecture, where the datapath can be switched between an SIMD and systolic array depending on processing flow. The processor supports all seven kinds of block modes, and can handle three reference frames for CIF (352×288) 30-fps to QCIF (176×144) 15-fps sequences with quarter-pixel accuracy. It integrates 3.3 million transistors, and occupies 2.8×3.1 mm² in a 130-nm CMOS technology. The proposed processor achieves a power of 800 µW in a QCIF 15-fps sequence with one reference picture.

  • A Power- and Area-Efficient SRAM Core Architecture with Segmentation-Free and Horizontal/Vertical Accessibility for Super-Parallel Video Processing

    Junichi MIYAKOSHI  Yuichiro MURACHI  Tomokazu ISHIHARA  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO  

     
    PAPER

    Vol: E89-C  No: 11  Page(s): 1629-1636

    For super-parallel video processing, we propose a power- and area-efficient SRAM core architecture with segmentation-free access, which means accessibility to arbitrary consecutive pixels, and horizontal/vertical access. To achieve these flexible accesses, a spirally-connected local-wordline select signal and multi-selection scheme in wordlines are proposed, so that extra X-decoders in the conventional multi-division SRAM can be eliminated. Consequently, the proposed SRAM reduces power and area by 57-60% and 60%, respectively, when it is applied to a 128 parallel architecture. The proposed 160-kbit SRAM with 16-read ports (2-read port SRAM with eight-parallel architecture) is implemented as a search window buffer for an H.264 motion estimation processor core which dissipates 800 µW for QCIF 15-fps in a 130-nm technology.

  • Fast Variable Block-Size Motion Estimation by Merging Refined Motion Vector for H.264

    Mei-Juan CHEN  Kai-Chung HOU  

     
    PAPER-Multimedia Systems for Communications

    Vol: E89-B  No: 10  Page(s): 2922-2928

    This paper proposes a fast motion estimation algorithm for variable block-sizes by utilizing motion vector bottom-up procedure for H.264. The refined motion vectors of adjacent small blocks are merged to predict the motion vectors of larger blocks for reducing the computation. Experimental results show that our proposed method has lower computational complexity than full search, fast full search and fast motion estimation of the H.264 reference software JM93 with slight quality decrease and little bit-rate increase.

  • Temporal Error Concealment for H.264 Video Based on Adaptive Block-Size Pixel Replacement

    Donghyung KIM  Jongho KIM  Jechang JEONG  

     
    LETTER-Multimedia Systems for Communications

    Vol: E89-B  No: 7  Page(s): 2111-2114

    The H.264 standard allows each macroblock to have up to sixteen motion vectors, four reference frames, and a macroblock mode. Exploiting this feature, we present an efficient temporal error concealment algorithm for H.264-coded video. The proposed method turns out to show good performance compared with conventional approaches.

  • An Efficient Architecture of High-Performance Deblocking Filter for H.264/AVC

    Seonyoung LEE  Kyeongsoon CHO  

     
    LETTER

    Vol: E89-A  No: 6  Page(s): 1736-1739

    We devised an efficient architecture of deblocking filter and implemented the circuit with 15,400 logic gates and a 160×32 dual-port SRAM using 0.25 µm standard cell technology. This circuit can process 88 image frames with 1,280×720 pixels per second at 166 MHz. Our circuit requires a smaller number of accesses to the external memory than other approaches and hence causes less bus traffic in the SoC design platform.

  • Polyphase Downsampling Based Multiple Description Coding Applied to H.264 Video Coding

    Jie JIA  Hae-Kwang KIM  

     
    PAPER

    Vol: E89-A  No: 6  Page(s): 1601-1606

    This paper presents a video coding method that improves error resilient functionality of H.264 with good coding efficiency. The method is based on PD (polyphase downsampling) multiple description coding. The only changes to H.264 are inserting PD before the DCT process and having new data partitioning NAL units. A coded slice is sent on 3 data partitioning NAL units. A header NAL unit contains motion vectors and block modes. Each of the other two NAL units contains a description generated by PD multiple description coding. The experimental results on all 9 of the test sequences of JVT SVC show that the proposed method gives 0.5 to 5 dB enhancement over the existing H.264 FMO checker board mode with motion vector based error-concealment.
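The polyphase downsampling (PD) step mentioned in this abstract can be sketched in a few lines of Python. The sketch assumes a 1-D sequence of samples split into two descriptions by index parity and uses hypothetical function names; where exactly PD sits in the codec, how the NAL units are packed, and how the paper conceals a lost description are not modeled. Sample repetition stands in as a deliberately naive concealment.

```python
def polyphase_split(samples):
    """Split samples into two descriptions: even-indexed and odd-indexed."""
    return samples[0::2], samples[1::2]


def polyphase_merge(even, odd):
    """Interleave both descriptions to recover the original sequence."""
    merged = []
    for i in range(len(even) + len(odd)):
        source = even if i % 2 == 0 else odd
        merged.append(source[i // 2])
    return merged


def conceal_from_even(even, length):
    """Approximate the sequence when only the even description arrives.

    Each missing odd sample is replaced by its left neighbour; this is
    an illustrative placeholder, not the paper's concealment method.
    """
    return [even[i // 2] for i in range(length)]


samples = [10, 11, 12, 13, 14]
even, odd = polyphase_split(samples)
print(polyphase_merge(even, odd))
print(conceal_from_even(even, len(samples)))
```

Because neighbouring samples are usually similar, either description alone yields a usable approximation, which is the property multiple description coding relies on.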

  • Hardware Architecture for Fast Motion Estimation in H.264/AVC Video Coding

    Myung-Suk BYEON  Yil-Mi SHIN  Yong-Beom CHO  

     
    LETTER

    Vol: E89-A  No: 6  Page(s): 1744-1745

    This paper describes the efficiency of VLSI architecture for UMHexagonS (hybrid Unsymmetrical cross Multi Hexagon grid Search) matching algorithm. This algorithm is used for ME (Motion Estimation) of H.264/AVC video compression standard. The UMHexagonS is called a hybrid algorithm since it uses different kinds of searching patterns. VLSI architecture based on UMHexagonS is designed to provide a good tradeoff between gate sizes and high throughput. We implemented this architecture with about 309 K gates and 1/1792 throughput [block/cycle] for a search range of 16 and 4×4 macro blocks using synthesizable Verilog HDL.
