The search functionality is under construction.

Keyword Search Result

[Keyword] motion estimation(131hit)

1-20hit(131hit)

  • High-quality Hardware Integer Motion Estimation for HEVC/H.265 Encoder Open Access

    Chuang ZHU  Jie LIU  Xiao Feng HUANG  Guo Qing XIANG  

     
    BRIEF PAPER-Integrated Electronics

      Pubricized:
    2019/08/13
      Vol:
    E102-C No:12
      Page(s):
    853-856

    This paper reports a high-quality hardware-friendly integer motion estimation (IME) scheme. According to different characteristics of CTU content, the proposed method adopts different adaptive multi-resolution strategies coupled with accurate full-PU modes IME at the finest level. Besides, by using motion vector derivation, IME for the second reference frame is simplified and hardware resource is saved greatly through processing element (PE) sharing. It is shown that the proposed architecture can support the real-time processing of 4K-UHD @60fps, while the BD-rate is just increased by 0.53%.

  • A Micro-Code-Based IME Engine for HEVC and Its Hardware Implementation

    Leilei HUANG  Yibo FAN  Chenhao GU  Xiaoyang ZENG  

     
    PAPER-Integrated Electronics

      Vol:
    E102-C No:10
      Page(s):
    756-765

    High Efficiency Video Coding (HEVC) standard is now becoming one of the most widespread video coding standards in the world. As a successor of H.264 standard, it aims to provide a much superior encoding performance. To fulfill this goal, several new notations along with the corresponding computation processes are introduced by this standard. Among those computation processes, the integer motion estimation (IME) is one of bottlenecks due to the complex partitions of the inter prediction units (PU) and the large search window commonly adopted. Many algorithms have been proposed to address this issue and usually put emphasis on a large search window and great computation amount. However, the coding efforts should be related to the scenes. To be more specific, for relatively static videos, a small search window along with a simple search scheme should be adopted to reduce the time cost and power consumption. In view of this, a micro-code-based IME engine is proposed in this paper, which could be applied with search schemes of different complexity. To test the performance, three different search schemes based on this engine are designed and evaluated under HEVC test model (HM) 16.9, achieving a B-D rate increase of 0.55/-0.07/-0.14%. Compared with our previous work, the hardware implementation is optimized to reduce 64.2% of the SRAMs bits and 32.8% of the logic gate count. The final design could support 4K×2K @139/85/37fps videos @500MHz.

  • Quantized Decoder Adaptively Predicting both Optimum Clock Frequency and Optimum Supply Voltage for a Dynamic Voltage and Frequency Scaling Controlled Multimedia Processor

    Nobuaki KOBAYASHI  Tadayoshi ENOMOTO  

     
    PAPER-Electronic Circuits

      Vol:
    E101-C No:8
      Page(s):
    671-679

    To completely utilize the advantages of dynamic voltage and frequency scaling (DVFS) techniques, a quantized decoder (QNT-D) was developed. The QNT-D generates a quantized signal processing quantity (Q) using a predicted signal processing quantity (M). Q is used to produce the optimum frequency (opt.fc) and the optimum supply voltage (opt.VD) that are proportional to Q. To develop a DVFS controlled motion estimation (ME) processor, we used both the QNT-D and a fast ME algorithm called A2BC (Adaptively Assigned Breaking-off Condition) to predict M for each macro-block (MB). A DVFS controlled ME processor was fabricated using 90-nm CMOS technology. The total power dissipation (PT) of the processor was significantly reduced and varied from 38.65 to 99.5 µW, only 3.27 to 8.41 % of PT of a conventional ME processor, depending on the test video picture.

  • Block-Matching-Based Implementation of Affine Motion Estimation for HEVC

    Chihiro TSUTAKE  Toshiyuki YOSHIDA  

     
    PAPER-Image Processing and Video Processing

      Pubricized:
    2018/01/15
      Vol:
    E101-D No:4
      Page(s):
    1151-1158

    Many of affine motion compensation techniques proposed thus far employ least-square-based techniques in estimating affine parameters, which requires a hardware structure different from conventional block-matching-based one. This paper proposes a new affine motion estimation/compensation framework friendly to block-matching-based parameter estimation, and applies it to an HEVC encoder to demonstrate its coding efficiency and computation cost. To avoid a nest of search loops, a new affine motion model is first introduced by decomposing the conventional 4-parameter affine model into two 3-parameter ones. Then, a block-matching-based fast parameter estimation technique is proposed for the models. The experimental results given in this paper show that our approach is advantageous over conventional techniques.

  • A 7-Die 3D Stacked 3840×2160@120 fps Motion Estimation Processor

    Shuping ZHANG  Jinjia ZHOU  Dajiang ZHOU  Shinji KIMURA  Satoshi GOTO  

     
    PAPER

      Vol:
    E100-C No:3
      Page(s):
    223-231

    In this paper, a hamburger architecture with a 3D stacked reconfigurable memory is proposed for a 4K motion estimation (ME) processor. By positioning the memory dies on both the top and bottom sides of the processor die, the proposed hamburger architecture can reduce the usage of the signal through-silicon via (TSV), and balance the power delivery network and the clock tree of the entire system. It results in 1/3 reduction of the usage of signal TSVs. Moreover, a stacked reconfigurable memory architecture is proposed to reduce the fabrication complexity and further reduce the number of signal TSVs by more than 1/2. The reduction of signal TSVs in the entire design is 71.24%. Finally, we address unique issues that occur in electronic design automation (EDA) tools during 3D large-scale integration (LSI) designs. As a result, a 4K ME processor with 7-die stacking 3D system-on-chip design is implemented. The proposed design can support real time 3840 × 2160 @ 120 fps encoding at 130 MHz with less than 540 mW.

  • Wavelet Pyramid Based Multi-Resolution Bilateral Motion Estimation for Frame Rate Up-Conversion

    Ran LI  Hongbing LIU  Jie CHEN  Zongliang GAN  

     
    PAPER-Image Processing and Video Processing

      Pubricized:
    2015/06/03
      Vol:
    E99-D No:1
      Page(s):
    208-218

    The conventional bilateral motion estimation (BME) for motion-compensated frame rate up-conversion (MC-FRUC) can avoid the problem of overlapped areas and holes but usually results in lots of inaccurate motion vectors (MVs) since 1) the MV of an object between the previous and following frames is more likely to have no temporal symmetry with respect to the target block of the interpolated frame and 2) the repetitive patterns existing in video frame lead to the problem of mismatch due to the lack of the interpolated block. In this paper, a new BME algorithm with a low computational complexity is proposed to resolve the above problems. The proposed algorithm incorporates multi-resolution search into BME, since it can easily utilize the MV consistency between two adjacent pyramid levels and spatial neighboring MVs to correct the inaccurate MVs resulting from no temporal symmetry while guaranteeing low computational cost. Besides, the multi-resolution search uses the fast wavelet transform to construct the wavelet pyramid, which not only can guarantee low computational complexity but also can reserve the high-frequency components of image at each level while sub-sampling. The high-frequency components are used to regularize the traditional block matching criterion for reducing the probability of mismatch in BME. Experiments show that the proposed algorithm can significantly improve both the objective and subjective quality of the interpolated frame with low computational complexity, and provide the better performance than the existing BME algorithms.

  • Low-Power Motion Estimation Processor with 3D Stacked Memory

    Shuping ZHANG  Jinjia ZHOU  Dajiang ZHOU  Shinji KIMURA  Satoshi GOTO  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1431-1441

    Motion estimation (ME) is a key encoding component of almost all modern video coding standards. ME contributes significantly to video coding efficiency, but, it also consumes the most power of any component in a video encoder. In this paper, an ME processor with 3D stacked memory architecture is proposed to reduce memory and core power consumption. First, a memory die is designed and stacked with ME die. By adding face-to-face (F2F) pads and through-silicon-via (TSV) definitions, 2D electronic design automation (EDA) tools can be extended to support the proposed 3D stacking architecture. Moreover, a special memory controller is applied to control data transmission and timing between the memory die and the ME processor die. Finally, a 3D physical design is completed for the entire system. This design includes TSV/F2F placement, floor plan optimization, and power network generation. Compared to 2D technology, the number of input/output (IO) pins is reduced by 77%. After optimizing the floor plan of the processor die and memory die, the routing wire lengths are reduced by 13.4% and 50%, respectively. The stacking static random access memory contributes the most power reduction in this work. The simulation results show that the design can support real-time 720p @ 60fps encoding at 8MHz using less than 65mW in power, which is much better compared to the state-of-the-art ME processor.

  • Frame Rate Up-Conversion Using Median Filter and Motion Estimation with Occlusion Detection

    Dang Ngoc Hai NGUYEN  NamUk KIM  Yung-Lyul LEE  

     
    LETTER-Image

      Vol:
    E98-A No:1
      Page(s):
    455-458

    A new technology for video frame rate up-conversion (FRUC) is presented by combining a median filter and motion estimation (ME) with an occlusion detection (OD) method. First, ME is performed to obtain a motion vector. Then, the OD method is used to refine the MV in the occlusion region. When occlusion occurs, median filtering is applied. Otherwise, bidirectional motion compensated interpolation (BDMC) is applied to create the interpolated frames. The experimental results show that the proposed algorithm provides better performance than the conventional approach. The average gain in the PSNR (Peak Signal to Noise Ratio) is always better than the other methods in the Full HD test sequences.

  • Extraction and Tracking Moving Objects in Detail Considering Visual Feature Constraint and Structure Constraint

    Zhu LI  Yoichi TOMIOKA  Hitoshi KITAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E96-D No:5
      Page(s):
    1171-1181

    Detailed tracking is required for many vision applications. A visual feature-based constraint underlies most conventional motion estimation methods. For example, optical flow methods assume that the brightness of each pixel is constant in two consecutive frames. However, it is difficult to realize accurate extraction and tracking using only visual feature information, because viewpoint changes and inconsistent illumination cause the visual features of some regions of objects to appear different in consecutive frames. A structure-based constraint of objects is also necessary for tracking. In the proposed method, both visual feature matching and structure matching are formulated as a linear assignment problem and then integrated.

  • A Low Power Multimedia Processor Implementing Dynamic Voltage and Frequency Scaling Technique and Fast Motion Estimation Algorithm Called “Adaptively Assigned Breaking-Off Condition (A2BC)”

    Tadayoshi ENOMOTO  Nobuaki KOBAYASHI  

     
    PAPER

      Vol:
    E96-C No:4
      Page(s):
    424-432

    A motion estimation (ME) multimedia processor was developed by employing dynamic voltage and frequency scaling (DVFS) technique to greatly reduce the power dissipation. To make full use of the advantages of DVFS technique, a fast motion estimation (ME) algorithm was also developed. It can adaptively predict the optimum supply voltage and the optimum clock frequency before ME process starts for each macro-block for encoding. Power dissipation of the 90-nm CMOS DVFS controlled multimedia processor, which contained an absolute difference accumulator as well as a small on-chip DC/DC level converter, a minimum value detector and DVFS controller, was reduced to 38.48 µW, which was only 3.261% that of a conventional multimedia processor.

  • Surface Modeling-Based Segmentalized Motion Estimation Algorithm for Video Compression

    Junsang CHO  Jung Wook SUH  Gwanggil JEON  Jechang JEONG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E96-B No:4
      Page(s):
    1081-1084

    In this letter, we propose an error surface modeling-based segmentalized motion estimation for video coding. We proposed two algorithms previously, one was MBQME [1] and the other is HMBQME [2]. However, these algorithms are not based on locally quadratic MC prediction errors around an integer-pixel motion vector and the hypothesis that the local error plane is a convex function. Therefore, we propose an error surface considered segmentalized modeling algorithm. In this scheme, the tendency of the error surface is first assessed. Using the Sobel operation at the error surface, we classify the error surface region as plain or textured. For plain regions, conventional MBQME is appropriate as the quarter-pixel motion estimation method. For textured regions, we search the additional interpolation points for more accurate modeling. After the interpolation, we perform double precision mathematical modeling so as to find the best motion vector (MV). Experiments show that the proposed scheme has better PSNR performance than conventional modeling algorithms with minimum operation time.

  • Improved B-Picture Coding Scheme for Next Generation Video Compression

    Seung-Jin BAEK  Seung-Won JUNG  Hahyun LEE  Hui Yong KIM  Sung-Jea KO  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E95-D No:9
      Page(s):
    2318-2326

    In this paper, an improved B-picture coding algorithm based on the symmetric bi-directional motion estimation (ME) is proposed. In addition to the block match error between blocks in the forward and backward reference frames, the proposed method exploits the previously-reconstructed template regions in the current and reference frames for bi-directional ME. The side match error between the predicted target block and its template is also employed in order to alleviate block discontinuities. To efficiently perform ME, an initial motion vector (MV) is adaptively derived by exploiting temporal correlations. Experimental results show that the number of generated bits is reduced by up to 9.31% when the proposed algorithm is employed as a new macroblock (MB) coding mode for the H.264/AVC standard.

  • An 88/44 Adaptive Hadamard Transform Based FME VLSI Architecture for 4 K2 K H.264/AVC Encoder

    Yibo FAN  Jialiang LIU  Dexue ZHANG  Xiaoyang ZENG  Xinhua CHEN  

     
    PAPER

      Vol:
    E95-C No:4
      Page(s):
    447-455

    Fidelity Range Extension (FRExt) (i.e. High Profile) was added to the H.264/AVC recommendation in the second version. One of the features included in FRExt is the Adaptive Block-size Transform (ABT). In order to conform to the FRExt, a Fractional Motion Estimation (FME) architecture is proposed to support the 88/44 adaptive Hadamard Transform (88/44 AHT). The 88/44 AHT circuit contributes to higher throughput and encoding performance. In order to increase the utilization of SATD (Sum of Absolute Transformed Difference) Generator (SG) in unit time, the proposed architecture employs two 8-pel interpolators (IP) to time-share one SG. These two IPs can work in turn to provide the available data continuously to the SG, which increases the data throughput and significantly reduces the cycles that are needed to process one Macroblock. Furthermore, this architecture also exploits the linear feature of Hadamard Transform to generate the quarter-pel SATD. This method could help to shorten the long datapath in the second-step of two-iteration FME algorithm. Finally, experimental results show that this architecture could be used in the applications requiring different performances by adjusting the supported modes and operation frequency. It can support the real-time encoding of the seven-mode 4 K2 K@24 fps or six-mode 4 K2 K@30 fps video sequences.

  • Improved Global Motion Estimation Based on Iterative Least-Squares with Adaptive Variable Block Size

    Leiqi ZHU  Dongkai YANG  Qishan ZHANG  

     
    LETTER-Image

      Vol:
    E94-A No:1
      Page(s):
    448-451

    In order to reduce the convergence time in an iterative procedure, some gradient based preliminary processes are employed to eliminate outliers. The adaptive variable block size is also introduced to balance the accuracy and computational complexity. Moreover, the use of Canberra distance instead of Euclidean distance illustrates higher performance in measuring motion similarity.

  • A VGA 30 fps Affine Motion Model Estimation VLSI for Real-Time Video Segmentation

    Yoshiki YUNBE  Masayuki MIYAMA  Yoshio MATSUDA  

     
    PAPER-Computer System

      Vol:
    E93-D No:12
      Page(s):
    3284-3293

    This paper describes an affine motion estimation processor for real-time video segmentation. The processor estimates the dominant motion of a target region with affine parameters. The processor is based on the Pseudo-M-estimator algorithm. Introduction of an image division method and a binary weight method to the original algorithm reduces data traffic and hardware costs. A pixel sampling method is proposed that reduces the clock frequency by 50%. The pixel pipeline architecture and a frame overlap method double throughput. The processor was prototyped on an FPGA; its function and performance were subsequently verified. It was also implemented as an ASIC. The core size is 5.05.0 mm2 in 0.18 µm process, standard cell technology. The ASIC can accommodate a VGA 30 fps video with 120 MHz clock frequency.

  • Architecture and Circuit Optimization of Hardwired Integer Motion Estimation Engine for H.264/AVC

    Zhenyu LIU  Dongsheng WANG  Takeshi IKENAGA  

     
    PAPER-Image Processing

      Vol:
    E93-A No:11
      Page(s):
    2065-2073

    Variable block size motion estimation developed by the latest video coding standard H.264/AVC is the efficient approach to reduce the temporal redundancies. The intensive computational complexity coming from the variable block size technique makes the hardwired accelerator essential, for real-time applications. Propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform other counterparts, especially considering the impact of supporting variable block size technique. In this paper, the authors apply the architecture-level and the circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while other metrics, in terms of latency, memory bandwidth and hardware utilization, of the original architectures are maintained. Experiments demonstrate that by using the proposed approaches, at 110.8 MHz operating frequency, compared with the original architectures, 14.7% and 18.0% gate count can be saved for Propagate Partial SAD and SAD Tree, respectively. With TSMC 0.18 µm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves 231.6 MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum work frequency of the optimized SAD Tree architecture is improved to 204.8 MHz, which is almost two times of the original one, while its hardware overhead is merely 88.5 k-gate.

  • An FFT-Based Full-Search Block Matching Algorithm with Sum of Squared Differences Criterion

    Zhen LI  Atushi UEMURA  Hitoshi KIYA  

     
    PAPER-Digital Signal Processing

      Vol:
    E93-A No:10
      Page(s):
    1748-1754

    An FFT-based full-search block matching algorithm (BMA) is described that uses the sum of squared differences (SSD) criterion. The proposed method does not have to extend a real signal into complex one. This reduces the computational load of FFT approaches. In addition, if two macroblocks share the same search window, they can be matched at the same time. In a simulation of motion estimation, the proposed method achieved the same performance as a direct SSD full search and its processing speed is faster than other FFT-based BMAs.

  • Fuzzy-Based Motion Vector Smoothing for Motion Compensated Frame Interpolation

    Vinh TRUONG QUANG  Sung-Hoon HONG  Young-Chul KIM  

     
    LETTER-Image

      Vol:
    E93-A No:8
      Page(s):
    1578-1581

    We proposed a new motion vector (MV) smoothing using fuzzy weighting and vector median filtering for frame rate up-conversion. A fuzzy reasoning system adjusts the weighting values based on the local characteristics of MV field including block difference and block boundary distortion. The fuzzy weighting removes the affect of outliers and irregular MVs from the MV smoothing process. The simulation results show that the proposed algorithm can efficiently correct wrong MVs and thus improve visual quality of the interpolated frames better than conventional methods.

  • Real-Time Estimation of Fast Egomotion with Feature Classification Using Compound Omnidirectional Vision Sensor

    Trung Thanh NGO  Yuichiro KOJIMA  Hajime NAGAHARA  Ryusuke SAGAWA  Yasuhiro MUKAIGAWA  Masahiko YACHIDA  Yasushi YAGI  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E93-D No:1
      Page(s):
    152-166

    For fast egomotion of a camera, computing feature correspondence and motion parameters by global search becomes highly time-consuming. Therefore, the complexity of the estimation needs to be reduced for real-time applications. In this paper, we propose a compound omnidirectional vision sensor and an algorithm for estimating its fast egomotion. The proposed sensor has both multi-baselines and a large field of view (FOV). Our method uses the multi-baseline stereo vision capability to classify feature points as near or far features. After the classification, we can estimate the camera rotation and translation separately by using random sample consensus (RANSAC) to reduce the computational complexity. The large FOV also improves the robustness since the translation and rotation are clearly distinguished. To date, there has been no work on combining multi-baseline stereo with large FOV characteristics for estimation, even though these characteristics are individually are important in improving egomotion estimation. Experiments showed that the proposed method is robust and produces reasonable accuracy in real time for fast motion of the sensor.

  • Video Frame Interpolation by Image Morphing Including Fully Automatic Correspondence Setting

    Miki HASEYAMA  Makoto TAKIZAWA  Takashi YAMAMOTO  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E92-D No:10
      Page(s):
    2163-2166

    In this paper, a new video frame interpolation method based on image morphing for frame rate up-conversion is proposed. In this method, image features are extracted by Scale-Invariant Feature Transform in each frame, and their correspondence in two contiguous frames is then computed separately in foreground and background regions. By using the above two functions, the proposed method accurately generates interpolation frames and thus achieves frame rate up-conversion.

1-20hit(131hit)