Keyword Search Result

[Keyword] video coding(141hit)

1-20hit(141hit)

  • A VVC Dependent Quantization Optimization Based on the Parallel Viterbi Algorithm and Its FPGA Implementation Open Access

    Qinghua SHENG  Yu CHENG  Xiaofang HUANG  Changcai LAI  Xiaofeng HUANG  Haibin YIN  

     
    PAPER-Computer System

    Publicized: 2024/03/04  Vol: E107-D No:7  Page(s): 797-806

    Dependent Quantization (DQ) is a new quantization tool introduced in the Versatile Video Coding (VVC) standard. While it provides better rate-distortion calculation accuracy, it also increases the computational complexity and hardware cost compared to the widely used scalar quantization. To address this issue, this paper proposes a parallel-dependent quantization hardware architecture using Verilog HDL language. The architecture preprocesses the coefficients with a scalar quantizer and a high-frequency filter, and then further segments and processes the coefficients in parallel using the Viterbi algorithm. Additionally, the weight bit width of the rate-distortion calculation is reduced to decrease the quantization cycle and computational complexity. Finally, the final quantization of the TU is determined through sequential scanning and judging of the rate-distortion cost. Experimental results show that the proposed algorithm reduces the quantization cycle by an average of 56.96% compared to VVC’s reference platform VTM, with a Bjøntegaard delta bit rate (BDBR) loss of 1.03% and 1.05% under the Low-delay P and Random Access configurations, respectively. Verification on the AMD FPGA development platform demonstrates that the hardware implementation meets the quantization requirements for 1080P@60Hz video hardware encoding.

  • Neural Network-Based Post-Processing Filter on V-PCC Attribute Frames

    Keiichiro TAKADA  Yasuaki TOKUMO  Tomohiro IKAI  Takeshi CHUJOH  

     
    LETTER

    Publicized: 2023/07/13  Vol: E106-D No:10  Page(s): 1673-1676

    Video-based point cloud compression (V-PCC) utilizes video compression technology to efficiently encode dense point clouds providing state-of-the-art compression performance with a relatively small computation burden. V-PCC converts 3-dimensional point cloud data into three types of 2-dimensional frames, i.e., occupancy, geometry, and attribute frames, and encodes them via video compression. On the other hand, the quality of these frames may be degraded due to video compression. This paper proposes an adaptive neural network-based post-processing filter on attribute frames to alleviate the degradation problem. Furthermore, a novel training method using occupancy frames is studied. The experimental results show average BD-rate gains of 3.0%, 29.3% and 22.2% for Y, U and V respectively.

  • An Efficient Reference Image Sharing Method for the Image-Division Parallel Video Encoding Architecture

    Ken NAKAMURA  Yuya OMORI  Daisuke KOBAYASHI  Koyo NITTA  Kimikazu SANO  Masayuki SATO  Hiroe IWASAKI  Hiroaki KOBAYASHI  

     
    PAPER

    Publicized: 2022/11/29  Vol: E106-C No:6  Page(s): 312-320

    This paper proposes an efficient reference image sharing method for the image-division parallel video encoding architecture. This method efficiently reduces the amount of data transfer by using pre-transfer with area prediction and on-demand transfer with a transfer management table. Experimental results show that the data transfer can be reduced to 19.8-35.3% of the conventional method on average without major degradation of coding performance. This makes it possible to reduce the required bandwidth of the inter-chip transfer interface by saving the amount of data transfer.

  • Wider Depth Dynamic Range Using Occupancy Map Correction for Immersive Video Coding

    Sung-Gyun LIM  Dong-Ha KIM  Kwan-Jung OH  Gwangsoon LEE  Jun Young JEONG  Jae-Gon KIM  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2023/02/10  Vol: E106-D No:5  Page(s): 1102-1105

    The MPEG Immersive Video (MIV) standard for immersive video coding provides users with an immersive sense of 6 degrees of freedom (6DoF) of view position and orientation by efficiently compressing multiview video acquired from different positions in a limited 3D space. In the MIV reference software called Test Model for Immersive Video (TMIV), the number of pixels to be compressed and transmitted is reduced by removing inter-view redundancy. Therefore, the occupancy information that indicates whether each pixel is valid or invalid must also be transmitted to the decoder for viewport rendering. The occupancy information is embedded in a geometry atlas and transmitted to the decoder side. At this time, to prevent occupancy errors that may occur during the compression of the geometry atlas, a guard band is set in the depth dynamic range. Reducing this guard band can improve the rendering quality by allowing a wider dynamic range for depth representation. Therefore, in this paper, based on the analysis of occupancy error of the current TMIV, two methods of occupancy error correction which allow depth dynamic range extension in the case of computer-generated (CG) sequences are presented. The experimental results show that the proposed method gives an average 2.2% BD-rate bit saving for CG compared to the existing TMIV.

  • Lookahead Search-Based Low-Complexity Multi-Type Tree Pruning Method for Versatile Video Coding (VVC) Intra Coding

    Qi TENG  Guowei TENG  Xiang LI  Ran MA  Ping AN  Zhenglong YANG  

     
    PAPER-Coding Theory

    Publicized: 2022/08/24  Vol: E106-A No:3  Page(s): 606-615

    The latest versatile video coding (VVC) introduces some novel techniques such as quadtree with nested multi-type tree (QTMT), multiple transform selection (MTS) and multiple reference line (MRL). These tools improve compression efficiency compared with the previous standard H.265/HEVC, but they suffer from very high computational complexity. One of the most time-consuming parts of VVC intra coding is the coding tree unit (CTU) structure decision. In this paper, we propose a low-complexity multi-type tree (MT) pruning method for VVC intra coding. This method consists of lookahead search and MT pruning. The lookahead search process is performed to derive the approximate rate-distortion (RD) cost of each MT node at depth 2 or 3. Subsequently, the improbable MT nodes are pruned by different strategies under different cost errors. These strategies are designed according to the priority of the node. Experimental results show that the overall proposed algorithm can achieve 47.15% time saving with only 0.93% Bjøntegaard delta bit rate (BDBR) increase over natural scene sequences, and 45.39% time saving with 1.55% BDBR increase over screen content sequences, compared with the VVC reference software VTM 10.0. Such results demonstrate that our method achieves a good trade-off between computational complexity and compression quality compared to recent methods.

  • Geometric Partitioning Mode with Inter and Intra Prediction for Beyond Versatile Video Coding

    Yoshitaka KIDANI  Haruhisa KATO  Kei KAWAMURA  Hiroshi WATANABE  

     
    PAPER

    Publicized: 2022/06/21  Vol: E105-D No:10  Page(s): 1691-1703

    Geometric partitioning mode (GPM) is a new inter prediction tool adopted in versatile video coding (VVC), the latest international video coding standard, developed by the Joint Video Experts Team in 2020. Unlike regular inter prediction, which is performed on rectangular blocks, GPM separates a coding block into two regions by one of 64 pre-defined types of straight lines, generates inter predicted samples for each separated region, and then blends them to obtain the final inter predicted samples. With this feature, GPM improves the prediction accuracy at the boundary between the foreground and background with different motions. However, GPM has room to further improve the prediction accuracy if the final predicted samples can be generated using not only inter prediction but also intra prediction. In this paper, we propose a GPM with inter and intra prediction to achieve further enhanced compression capability beyond VVC. To maximize the coding performance of the proposed method, we also propose restricting the number of applicable intra prediction modes and prohibiting the application of intra prediction to both GPM-separated regions. The experimental results show that the proposed method increases the coding performance gain of the conventional GPM method of VVC by a factor of 1.3, and provides an additional coding performance gain of 1% bitrate savings in one of the coding structures for low-latency video transmission where the conventional GPM method cannot be utilized.

  • A Failsoft Scheme for Mobile Live Streaming by Scalable Video Coding

    Hiroki OKADA  Masato YOSHIMI  Celimuge WU  Tsutomu YOSHINAGA  

     
    PAPER

    Publicized: 2021/09/08  Vol: E104-D No:12  Page(s): 2121-2130

    In this study, we propose a mechanism called adaptive failsoft control to address peak traffic in mobile live streaming, using a chasing playback function. Although a cache system is available to support the chasing playback function for live streaming in a base station and device-to-device communication, the request concentration by highlight scenes influences the traffic load owing to data unavailability. To avoid data unavailability, we adapted two live streaming features: (1) streaming data while switching the video quality, and (2) time variability of the number of requests. The second feature enables a fallback mechanism for the cache system by prioritizing cache eviction and terminating the transfer of cache-missed requests. This paper discusses the simulation results of the proposed mechanism, which adopts a request model appropriate for (a) avoiding peak traffic and (b) maintaining continuity of service.

  • 3D-HEVC Virtual View Synthesis Based on a Reconfigurable Architecture

    Lin JIANG  Xin WU  Yun ZHU  Yu WANG  

     
    PAPER-Multimedia Systems for Communications

    Publicized: 2019/11/12  Vol: E103-B No:5  Page(s): 618-626

    For high definition (HD) videos, the 3D-High Efficiency Video Coding (3D-HEVC) reference algorithm incurs dramatically high computational loads. Therefore, to meet the demand for real-time processing of HD video, a hardware implementation is necessary. In this paper, a reconfigurable architecture is proposed that can support both median filtering preprocessing and mean filtering preprocessing to suit different scene depth maps. The architecture sends different instructions to the corresponding processing elements according to different scenarios. The mean filter is used to process near-range images, and the median filter is used to process long-range images. The simulation results show that the designed architecture achieves an average PSNR of 34.55dB for the tested images. The hardware design for the proposed virtual view synthesis system operates at a maximum clock frequency of 160MHz on the BEE4 platform, which is equipped with four Virtex-6 FF1759 LX550T Field-Programmable Gate Arrays (FPGAs), for outputting 720p (1024×768) video at 124fps.
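
As a rough illustration of the two preprocessing modes named in this abstract, the sketch below applies a mean or a median filter to a small depth map. The 3×3 window, the border clamping, the `near_range` flag, and all function names are assumptions made for this example; the abstract does not specify them, and this is not the paper's hardware design.

```python
from statistics import median


def filter3x3(depth, reducer):
    """Apply `reducer` to each 3x3 neighbourhood, clamping coordinates at the borders."""
    h, w = len(depth), len(depth[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                depth[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = reducer(window)
    return out


def mean_filter(depth):
    # Averages the window, so an outlier is spread over its neighbours.
    return filter3x3(depth, lambda win: sum(win) / len(win))


def median_filter(depth):
    # Takes the middle value of the window, so an isolated outlier is discarded.
    return filter3x3(depth, median)


def preprocess(depth, near_range):
    """Hypothetical selector: mean filter for near-range maps, median for long-range."""
    return mean_filter(depth) if near_range else median_filter(depth)


# A flat depth map with one outlier in the centre.
depth_map = [
    [10, 10, 10],
    [10, 90, 10],
    [10, 10, 10],
]
smoothed = preprocess(depth_map, near_range=True)
despeckled = preprocess(depth_map, near_range=False)
```

On this toy input the median filter removes the isolated outlier entirely, while the mean filter only spreads it across neighbouring samples, which is the usual trade-off between the two.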

  • Temporal Domain Difference Based Secondary Background Modeling Algorithm

    Guowei TENG  Hao LI  Zhenglong YANG  

     
    LETTER-Communication Theory and Signals

    Vol: E103-A No:2  Page(s): 571-575

    This paper proposes a temporal domain difference based secondary background modeling algorithm for surveillance video coding. The proposed algorithm makes three key technical contributions. Firstly, the LDBCBR (Long Distance Block Composed Background Reference) algorithm is proposed, which exploits IBBS (interval of background blocks searching) to weaken the temporal correlation of the foreground. Secondly, both BCBR (Block Composed Background Reference) and LDBCBR are exploited at the same time to generate the temporary background reference frame. The secondary modeling algorithm utilizes the temporary background blocks generated by BCBR and LDBCBR to get the final background frame. Thirdly, the background reference frame is monitored after it is generated: background blocks are updated immediately when they change significantly, the modeling period is shortened in areas where the foreground moves frequently, and the stable background is checked regularly. The proposed algorithm is implemented on the IEEE 1857 platform, and the experimental results demonstrate a significant improvement in coding efficiency. On the surveillance test sequences recommended by the China AVS (Advanced Audio Video Standard) working group, our method achieves BD-rate gains of 6.81% and 27.30% compared with BCBR and the baseline profile, respectively.

  • Simplified Triangular Partitioning Mode in Versatile Video Coding

    Dohyeon PARK  Jinho LEE  Jung-Won KANG  Jae-Gon KIM  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2019/10/30  Vol: E103-D No:2  Page(s): 472-475

    The emerging Versatile Video Coding (VVC) standard currently adopts Triangular Partitioning Mode (TPM) to enable more flexible inter prediction. Due to the motion search and motion storage for TPM, the complexity of the encoder and decoder is significantly increased. This letter proposes two simplifications of TPM for reducing the complexity of the current design. The first simplification reduces the number of combinations of motion vectors to be checked for the two partitions. This method gives a 4% encoding time decrease with negligible BD-rate loss. The second removes the reference picture remapping process in the motion vector storage of TPM. It reduces the complexity of the encoder and decoder without a BD-rate change for the random-access configuration.

  • Precoder and Postcoder Design for Wireless Video Streaming with Overloaded Multiuser MIMO-OFDM Systems

    Koji TASHIRO  Masayuki KUROSAKI  Hiroshi OCHI  

     
    PAPER-Digital Signal Processing

    Vol: E102-A No:12  Page(s): 1825-1833

    Mobile video traffic is expected to increase explosively because of the proliferating number of Wi-Fi terminals. An overloaded multiple-input multiple-output (MIMO) technique allows the receiver to implement a smaller number of antennas than the transmitter in exchange for degradation in video quality and a large amount of computational complexity for postcoding at the receiver side. This paper proposes a novel linear precoder for high-quality video streaming in overloaded multiuser MIMO systems, which protects visually significant portions of a video stream. A low-complexity postcoder is also proposed, which detects some of the data symbols by linear detection and the others by a prevoting vector cancellation (PVC) approach. Simulation results show that the combined use of the proposed precoder and postcoder achieves higher-quality video streaming to multiple users over a wider range of signal-to-noise ratio (SNR) than a conventional unequal error protection scheme. The proposed precoder attains 40dB in peak signal-to-noise ratio even in poor channel conditions such as an SNR of 12dB. In addition, due to the stepwise acquisition of data symbols by means of linear detection and PVC, the proposed postcoder reduces the number of complex additions by 76% and that of multiplications by 64% compared to the conventional PVC.

  • A Micro-Code-Based IME Engine for HEVC and Its Hardware Implementation

    Leilei HUANG  Yibo FAN  Chenhao GU  Xiaoyang ZENG  

     
    PAPER-Integrated Electronics

    Vol: E102-C No:10  Page(s): 756-765

    The High Efficiency Video Coding (HEVC) standard is now becoming one of the most widespread video coding standards in the world. As the successor to the H.264 standard, it aims to provide much better encoding performance. To fulfill this goal, several new notations along with the corresponding computation processes are introduced by this standard. Among those computation processes, integer motion estimation (IME) is one of the bottlenecks due to the complex partitions of the inter prediction units (PU) and the large search window commonly adopted. Many algorithms have been proposed to address this issue, and they usually assume a large search window and a large amount of computation. However, the coding effort should be related to the scene. More specifically, for relatively static videos, a small search window along with a simple search scheme should be adopted to reduce the time cost and power consumption. In view of this, a micro-code-based IME engine is proposed in this paper, which can be used with search schemes of different complexity. To test the performance, three different search schemes based on this engine are designed and evaluated under HEVC test model (HM) 16.9, achieving a BD-rate increase of 0.55/-0.07/-0.14%. Compared with our previous work, the hardware implementation is optimized to reduce the SRAM bits by 64.2% and the logic gate count by 32.8%. The final design can support 4K×2K @139/85/37fps videos @500MHz.

  • A Quality-Level Selection for Adaptive Video Streaming with Scalable Video Coding

    Shungo MORI  Masaki BANDAI  

     
    PAPER-Network

    Publicized: 2018/10/22  Vol: E102-B No:4  Page(s): 824-831

    In this paper, we propose a quality-level selection method for adaptive video streaming with scalable video coding (SVC). The proposed method works on the client side of dynamic adaptive streaming over HTTP (DASH) with SVC. It consists of two components: the introduction of segment groups and a buffer-aware layer selection algorithm. In general, quality of experience (QoE) performance degrades due to stalling (playback buffer underflow), low playback quality, frequent quality-level switching, and extreme-down quality switching. The proposed algorithm focuses on reducing frequent quality-level switching and extreme-down quality switching without increasing stalling or degrading playback quality. In the proposed method, an SVC-DASH client selects a layer every G segments, called a segment group, to prevent frequent quality-level switching. In addition, the layer selection algorithm chooses the quality of a layer based on the playback buffer to prevent extreme-down switching. We implement the proposed method on a real SVC-DASH system and evaluate its performance by subjective evaluations with multiple users. As a result, we confirm that the proposed algorithm obtains a better mean opinion score (MOS) value than a conventional SVC-DASH and is effective in improving QoE performance in SVC-DASH.

  • A Rate Perceptual-Distortion Optimized Video Coding HEVC

    Bumshik LEE  Jae Young CHOI  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2018/08/24  Vol: E101-D No:12  Page(s): 3158-3169

    In this paper, a perceptual distortion based rate-distortion optimized video coding scheme for High Efficiency Video Coding (HEVC) is proposed. The Structural Similarity Index (SSIM) in the transform domain, which is known as a distortion metric that better reflects human perception, is derived for the perceptual distortion model to be applied to the hierarchical coding block structure of HEVC. An SSIM-quantization model is proposed using the properties of the DCT and the high-resolution quantization assumption. The SSIM model is obtained as the sum of SSIM in each Coding Unit (CU) depth of HEVC, which precisely predicts SSIM values for the hierarchical quadtree structure of CUs in HEVC. The rate model is derived from the entropy, based on Laplacian distributions of transform residual coefficients, and is combined with the SSIM-based distortion model for rate-distortion optimization in an HEVC video codec in a standard-compliant manner. The experimental results demonstrate that the proposed method achieves 8.1% and 4.0% average bit rate reductions in rate-SSIM performance for low-delay and random access configurations respectively, outperforming other existing methods. The proposed method provides better visual quality than the conventional mean square error (MSE)-based RDO coding scheme.

  • Optimal Design of Adaptive Intra Predictors Based on Sparsity Constraint

    Yukihiro BANDOH  Yuichi SAYAMA  Seishi TAKAMURA  Atsushi SHIMIZU  

     
    PAPER-Image

    Vol: E101-A No:11  Page(s): 1795-1805

    It is essential to improve intra prediction performance to raise the efficiency of video coding. In video coding standards such as H.265/HEVC, intra prediction is seen as an extension of directional prediction schemes; examples include refinement of directions, planar extension, filtering of reference samples, and so on. From the viewpoint of reducing prediction error, some improvements on intra prediction for standardized schemes have been suggested. However, on the assumption that the correlation between neighboring pixels is static, these conventional methods use pre-defined predictors regardless of the image being encoded. Therefore, these conventional methods cannot reduce prediction error if the images break the assumption made in prediction design. On the other hand, adaptive predictors that change with the image being encoded may offer poor coding efficiency due to the overhead of the additional information needed for adaptivity. This paper proposes an adaptive intra prediction scheme that resolves the trade-off between prediction error and adaptivity overhead. The proposed scheme is formulated as a constrained optimization problem that minimizes prediction error under sparsity constraints on the prediction coefficients. In order to solve this problem, a novel solver is introduced as an extension of LARS for multi-class support. Experiments show that the proposed scheme can reduce the amount of encoded bits by 1.21% to 3.24% on average compared to HM16.7.

  • Scalable Distributed Video Coding for Wireless Video Sensor Networks

    Hong YANG  Linbo QING  Xiaohai HE  Shuhua XIONG  

     
    PAPER

    Publicized: 2017/10/16  Vol: E101-D No:1  Page(s): 20-27

    Wireless video sensor networks face problems such as low power consumption of sensor nodes, low computing capacity of nodes, and unstable channel bandwidth. To transmit video of distributed video coding in wireless video sensor networks, we propose an efficient scalable distributed video coding scheme. In this scheme, the scalable Wyner-Ziv frame is based on transmission of different wavelet information, while the Key frame is based on transmission of different residual information. A successive refinement of side information for the Wyner-Ziv and Key frames is proposed in this scheme. Test results show that both the Wyner-Ziv and Key frames provide four layers of quality and bit-rate scalability, with no increase in encoder complexity.

  • Hardware Oriented Low-Complexity Intra Coding Algorithm for SHVC

    Takafumi KATAYAMA  Tian SONG  Wen SHI  Gen FUJITA  Xiantao JIANG  Takashi SHIMAMOTO  

     
    PAPER-Digital Signal Processing

    Vol: E100-A No:12  Page(s): 2936-2947

    Scalable high efficiency video coding (SHVC) can provide variable video quality according to terminal devices. However, the computational complexity of SHVC is increased by introducing new techniques based on high efficiency video coding (HEVC). In this paper, a hardware-oriented low-complexity algorithm is proposed. The hardware-oriented proposals have two key points. Firstly, the coding unit depth is determined by analyzing the boundary correlation between coding units before the encoding process starts. Secondly, the redundant calculation of R-D optimization is reduced by adaptively using the information of the neighboring coding units and the co-located units in the base layer. The simulation results show that the proposed algorithm can achieve over 62% computational complexity reduction compared to the original SHM11.0. Compared with other related work, over 11% time saving has been achieved without PSNR loss. Furthermore, the proposed algorithm is hardware friendly and can be implemented in a small area.

  • Achievable Rate Regions of Cache-Aided Broadcast Networks for Delivering Content with a Multilayer Structure

    Tetsunao MATSUTA  Tomohiko UYEMATSU  

     
    PAPER-Shannon Theory

    Vol: E100-A No:12  Page(s): 2629-2640

    This paper deals with a broadcast network with a server and many users. The server has files of content such as music and videos, and each user requests one of these files, where each file consists of several separate layers, like a file encoded by scalable video coding. On the other hand, each user has a local memory, and a part of the information of the files is cached (i.e., stored) in these memories in advance of users' requests. By using the cached information as side information, the server encodes files based on users' requests. Then, it sends a codeword through an error-free shared link over which all users can receive a common codeword from the server without error. We assume that the server transmits some layers up to a certain level of requested files at each different transmission rate (i.e., the codeword length per file size) corresponding to each level. In this paper, we focus on the region of tuples of these rates such that layers up to any level of requested files are recovered at users with an arbitrarily small error probability. Then, we give inner and outer bounds on this region.

  • Joint Transmission and Coding Scheme for High-Resolution Video Streams over Multiuser MIMO-OFDM Systems

    Koji TASHIRO  Leonardo LANANTE  Masayuki KUROSAKI  Hiroshi OCHI  

     
    PAPER-Communication Systems

    Vol: E100-A No:11  Page(s): 2304-2313

    High-resolution image and video communication in home networks is highly expected to proliferate with the spread of Wi-Fi devices and the introduction of multiple-input multiple-output (MIMO) systems. This paper proposes a joint transmission and coding scheme for broadcasting high-resolution video streams over multiuser MIMO systems with an eigenbeam-space division multiplexing (E-SDM) technique. Scalable video coding makes it possible to produce the code stream comprised of multiple layers having unequal contribution to image quality. The proposed scheme jointly assigns the data of scalable code streams to subcarriers and spatial streams based on their signal-to-noise ratio (SNR) values in order to transmit visually important data with high reliability. Simulation results show that the proposed scheme surpasses the conventional unequal power allocation (UPA) approach in terms of both peak signal-to-noise ratio (PSNR) of received images and correct decoding probability. PSNR performance of the proposed scheme exceeds 35dB with the probability of over 95% when received SNR is higher than 6dB. The improvement in average PSNR by the proposed scheme compared to the conventional UPA comes up to approx. 20dB at received SNR of 6dB. Furthermore, correct decoding probability reaches 95% when received SNR is greater than 4dB.
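
For reference, the PSNR figures quoted in this abstract can be read against the standard definition below. This is a sketch under assumptions: the symbols $M \times N$ (image size), $x$ (original samples), $\hat{x}$ (received samples), and $B$ (bit depth) are introduced here for illustration, and the abstract does not state which bit depth was used.

```latex
\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(x_{i,j}-\hat{x}_{i,j}\bigr)^{2},
\qquad
\mathrm{PSNR} = 10\log_{10}\frac{\bigl(2^{B}-1\bigr)^{2}}{\mathrm{MSE}}\ \text{dB}
```

If 8-bit samples are assumed ($B = 8$, peak value 255), a PSNR of 35dB corresponds to a mean squared error of about 20.6.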

  • A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding

    Yibo FAN  Leilei HUANG  Zheng XIE  Xiaoyang ZENG  

     
    PAPER-Integrated Electronics

    Vol: E100-C No:6  Page(s): 643-654

    In the newly finalized video coding standard, namely high efficiency video coding (HEVC), new notations like coding unit (CU), prediction unit (PU) and transformation unit (TU) are introduced to improve the coding performance. As a result, the reconstruction loop in intra encoding is heavily burdened with choosing the best partitions or modes for them. In order to solve the bottleneck problems in cycle and hardware cost, this paper proposes a high-throughput and compact implementation for such a reconstruction loop. “High-throughput” means that it has a fixed throughput of 32 pixels/cycle independent of the TU/PU size (except for 4×4 TUs). “Compact” means that it fully exploits the reusability between the discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT) as well as that between quantization (Q) and de-quantization (IQ). Besides the contributions made in designing related hardware, this paper also provides a universal formula to analyze the cycle cost of the reconstruction loop and proposes a parallel-process scheme to further reduce the cycle cost. This design is verified on the Stratix IV FPGA. The basic structure achieves a maximum frequency of 150MHz and a hardware cost of 64K ALUTs, which can support real-time TU/PU partition decision for 4K×2K@20fps videos.
