Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding

Hyun-Ho KIM; Sung-Gyun LIM; Gwangsoon LEE; Jun Young JEONG; Jae-Gon KIM

doi:10.1587/transinf.2020EDL8119

IEICE TRANSACTIONS on Information

Summary:

Emerging three-degrees-of-freedom-plus (3DoF+) video provides a more interactive and deeply immersive visual experience. 3DoF+ video adds motion parallax to 360 video, providing an omnidirectional view with limited changes of the viewing position. Supporting such an experience requires a large set of views, so the resulting tremendous amount of 3DoF+ video data must be compressed efficiently. MPEG is currently developing a standard for efficient coding of 3DoF+ video, which consists of multiple videos, together with its test model, the Test Model for Immersive Video (TMIV). In the TMIV, the redundancy between the input source views is removed as much as possible by selecting one or several basic views and predicting the remaining views from the basic views. Each unpredicted region is cropped to a bounding box called a patch, and a large number of patches are then packed into atlases together with the selected basic views. As a result, multiple source views are converted into one or more atlas sequences to be compressed. In this letter, we present an improved clustering method that uses patch merging in the atlas construction of the TMIV. In the experiments, the proposed method achieves a significant BD-rate reduction across various end-to-end evaluation metrics, and it was adopted in TMIV6.0.
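
The summary does not state the criterion used to merge patches, so the following is only an illustrative sketch of the general idea of merging bounding-box patches before they are packed into an atlas. It is not the method of the letter or the TMIV implementation. The `Patch` type, the `merge_patches` function, and the `max_waste` threshold are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Patch:
    """Axis-aligned bounding box of an unpredicted region (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int

    @property
    def area(self) -> int:
        return self.w * self.h


def union(a: Patch, b: Patch) -> Patch:
    """Smallest bounding box that contains both patches."""
    x0 = min(a.x, b.x)
    y0 = min(a.y, b.y)
    x1 = max(a.x + a.w, b.x + b.w)
    y1 = max(a.y + a.h, b.y + b.h)
    return Patch(x0, y0, x1 - x0, y1 - y0)


def merge_patches(patches: List[Patch], max_waste: float = 0.25) -> List[Patch]:
    """Greedily merge pairs of patches.

    Assumed criterion (not taken from the letter): two patches are merged
    when the area of their union box that neither patch covers is at most
    `max_waste` times the union area.
    """
    merged = list(patches)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                u = union(merged[i], merged[j])
                waste = u.area - merged[i].area - merged[j].area
                if waste <= max_waste * u.area:
                    merged[i] = u      # keep the merged box in slot i
                    del merged[j]      # drop the absorbed patch
                    changed = True
                    break
            if changed:
                break                  # restart the scan after every merge
    return merged


if __name__ == "__main__":
    boxes = [Patch(0, 0, 10, 10), Patch(10, 0, 10, 10), Patch(100, 100, 5, 5)]
    print(merge_patches(boxes))
```

In this sketch, two adjacent boxes collapse into one because their union wastes no area, while a distant box stays separate. Fewer patches mean fewer patch boundaries in the atlas, which is the intuition behind merging; the actual rule and its rate-distortion effect are described in the letter itself.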

Publication: IEICE TRANSACTIONS on Information Vol.E104-D No.3 pp.477-480

Publication Date: 2021/03/01

Publicized: 2020/12/14

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2020EDL8119

Type of Manuscript: LETTER

Category: Image Processing and Video Processing

Authors

Hyun-Ho KIM
  Korea Aerospace University
Sung-Gyun LIM
  Korea Aerospace University
Gwangsoon LEE
  ETRI
Jun Young JEONG
  ETRI
Jae-Gon KIM
  Korea Aerospace University

Keyword

MPEG-I, immersive video, 360 video, VR, 3DoF+, TMIV

Cite this

Hyun-Ho KIM, Sung-Gyun LIM, Gwangsoon LEE, Jun Young JEONG, Jae-Gon KIM, "Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding" in IEICE TRANSACTIONS on Information, vol. E104-D, no. 3, pp. 477-480, March 2021, doi: 10.1587/transinf.2020EDL8119.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8119/_p

@ARTICLE{e104-d_3_477,
author={Hyun-Ho KIM and Sung-Gyun LIM and Gwangsoon LEE and Jun Young JEONG and Jae-Gon KIM},
journal={IEICE TRANSACTIONS on Information},
title={Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding},
year={2021},
volume={E104-D},
number={3},
pages={477--480},
keywords={MPEG-I, immersive video, 360 video, VR, 3DoF+, TMIV},
doi={10.1587/transinf.2020EDL8119},
ISSN={1745-1361},
month={March},
}

TY  - JOUR
TI  - Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding
T2  - IEICE TRANSACTIONS on Information
SP  - 477
EP  - 480
AU  - Hyun-Ho KIM
AU  - Sung-Gyun LIM
AU  - Gwangsoon LEE
AU  - Jun Young JEONG
AU  - Jae-Gon KIM
PY  - 2021
DO  - 10.1587/transinf.2020EDL8119
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E104-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information
Y1  - March 2021
ER  -