The emerging three degrees of freedom plus (3DoF+) video provides a more interactive and deeply immersive visual experience. 3DoF+ video adds motion parallax to 360-degree video, providing an omnidirectional view with limited changes of the viewing position. A large set of views is required to support such a 3DoF+ visual experience, so the resulting tremendous amount of 3DoF+ video data must be compressed efficiently. MPEG is currently developing a standard for efficient coding of 3DoF+ video that consists of multiple videos, together with its test model, the Test Model for Immersive Video (TMIV). In the TMIV, the redundancy between the input source views is removed as much as possible by selecting one or several basic views and predicting the remaining views from the basic views. Each unpredicted region is cropped to a bounding box called a patch, and then a large number of patches are packed into atlases together with the selected basic views. As a result, multiple source views are converted into one or more atlas sequences to be compressed. In this letter, we present an improved clustering method that uses patch merging in the atlas construction of the TMIV. In the experiments, the proposed method achieves a significant BD-rate reduction across various end-to-end evaluation metrics, and it was adopted in TMIV6.0.
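The abstract does not state the rule used to decide when patches are merged, so the following Python sketch is only an illustration of the general idea: crop each cluster of unpredicted pixels to a bounding box, then merge boxes when doing so wastes little atlas area. The names `Patch`, `bounding_patch`, `union`, and `merge_patches`, and the `max_growth` threshold, are hypothetical and are not taken from the TMIV software or from the letter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Axis-aligned bounding box; x1 and y1 are exclusive."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def area(self) -> int:
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def bounding_patch(pixels):
    """Crop a cluster of unpredicted (x, y) pixels to its bounding box."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return Patch(min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def union(a, b):
    """Smallest box that contains both a and b."""
    return Patch(min(a.x0, b.x0), min(a.y0, b.y0),
                 max(a.x1, b.x1), max(a.y1, b.y1))


def merge_patches(patches, max_growth=1.2):
    """Greedily merge pairs of patches.

    Assumed criterion (not from the source): two patches are merged when
    the area of their union box is at most max_growth times the sum of
    their individual areas.
    """
    patches = list(patches)
    merged = True
    while merged:
        merged = False
        for i in range(len(patches)):
            for j in range(i + 1, len(patches)):
                u = union(patches[i], patches[j])
                if u.area <= max_growth * (patches[i].area + patches[j].area):
                    # Replace the pair with its union and rescan from the start.
                    rest = [p for k, p in enumerate(patches) if k not in (i, j)]
                    patches = rest + [u]
                    merged = True
                    break
            if merged:
                break
    return patches


if __name__ == "__main__":
    a = bounding_patch([(0, 0), (3, 3)])
    b = Patch(4, 0, 8, 4)          # adjacent to a
    c = Patch(100, 100, 104, 104)  # far from both
    print(merge_patches([a, b, c]))
```

In this sketch the two adjacent boxes collapse into one patch while the distant box is left alone, which reduces the number of patches to pack without adding much empty area to the atlas.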
Hyun-Ho KIM
Korea Aerospace University
Sung-Gyun LIM
Korea Aerospace University
Gwangsoon LEE
ETRI
Jun Young JEONG
ETRI
Jae-Gon KIM
Korea Aerospace University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hyun-Ho KIM, Sung-Gyun LIM, Gwangsoon LEE, Jun Young JEONG, Jae-Gon KIM, "Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding" in IEICE TRANSACTIONS on Information,
vol. E104-D, no. 3, pp. 477-480, March 2021, doi: 10.1587/transinf.2020EDL8119.
Abstract: The emerging three degrees of freedom plus (3DoF+) video provides a more interactive and deeply immersive visual experience. 3DoF+ video adds motion parallax to 360-degree video, providing an omnidirectional view with limited changes of the viewing position. A large set of views is required to support such a 3DoF+ visual experience, so the resulting tremendous amount of 3DoF+ video data must be compressed efficiently. MPEG is currently developing a standard for efficient coding of 3DoF+ video that consists of multiple videos, together with its test model, the Test Model for Immersive Video (TMIV). In the TMIV, the redundancy between the input source views is removed as much as possible by selecting one or several basic views and predicting the remaining views from the basic views. Each unpredicted region is cropped to a bounding box called a patch, and then a large number of patches are packed into atlases together with the selected basic views. As a result, multiple source views are converted into one or more atlas sequences to be compressed. In this letter, we present an improved clustering method that uses patch merging in the atlas construction of the TMIV. In the experiments, the proposed method achieves a significant BD-rate reduction across various end-to-end evaluation metrics, and it was adopted in TMIV6.0.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8119/_p
@ARTICLE{e104-d_3_477,
author={Hyun-Ho KIM and Sung-Gyun LIM and Gwangsoon LEE and Jun Young JEONG and Jae-Gon KIM},
journal={IEICE TRANSACTIONS on Information},
title={Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding},
year={2021},
volume={E104-D},
number={3},
pages={477--480},
abstract={The emerging three degrees of freedom plus (3DoF+) video provides a more interactive and deeply immersive visual experience. 3DoF+ video adds motion parallax to 360-degree video, providing an omnidirectional view with limited changes of the viewing position. A large set of views is required to support such a 3DoF+ visual experience, so the resulting tremendous amount of 3DoF+ video data must be compressed efficiently. MPEG is currently developing a standard for efficient coding of 3DoF+ video that consists of multiple videos, together with its test model, the Test Model for Immersive Video (TMIV). In the TMIV, the redundancy between the input source views is removed as much as possible by selecting one or several basic views and predicting the remaining views from the basic views. Each unpredicted region is cropped to a bounding box called a patch, and then a large number of patches are packed into atlases together with the selected basic views. As a result, multiple source views are converted into one or more atlas sequences to be compressed. In this letter, we present an improved clustering method that uses patch merging in the atlas construction of the TMIV. In the experiments, the proposed method achieves a significant BD-rate reduction across various end-to-end evaluation metrics, and it was adopted in TMIV6.0.},
keywords={},
doi={10.1587/transinf.2020EDL8119},
ISSN={1745-1361},
month={March},
}
TY - JOUR
TI - Efficient Patch Merging for Atlas Construction in 3DoF+ Video Coding
T2 - IEICE TRANSACTIONS on Information
SP - 477
EP - 480
AU - Hyun-Ho KIM
AU - Sung-Gyun LIM
AU - Gwangsoon LEE
AU - Jun Young JEONG
AU - Jae-Gon KIM
PY - 2021
DO - 10.1587/transinf.2020EDL8119
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2021
AB - The emerging three degrees of freedom plus (3DoF+) video provides a more interactive and deeply immersive visual experience. 3DoF+ video adds motion parallax to 360-degree video, providing an omnidirectional view with limited changes of the viewing position. A large set of views is required to support such a 3DoF+ visual experience, so the resulting tremendous amount of 3DoF+ video data must be compressed efficiently. MPEG is currently developing a standard for efficient coding of 3DoF+ video that consists of multiple videos, together with its test model, the Test Model for Immersive Video (TMIV). In the TMIV, the redundancy between the input source views is removed as much as possible by selecting one or several basic views and predicting the remaining views from the basic views. Each unpredicted region is cropped to a bounding box called a patch, and then a large number of patches are packed into atlases together with the selected basic views. As a result, multiple source views are converted into one or more atlas sequences to be compressed. In this letter, we present an improved clustering method that uses patch merging in the atlas construction of the TMIV. In the experiments, the proposed method achieves a significant BD-rate reduction across various end-to-end evaluation metrics, and it was adopted in TMIV6.0.
ER -