Multi-Scale Correspondence Learning for Person Image Generation

Shi-Long SHEN; Ai-Guo WU; Yong XU

doi:10.1587/transinf.2022DLP0058

IEICE TRANSACTIONS on Information

Summary:

This paper presents a generative model for two person image generation tasks. The first is pose-guided person image generation, i.e., transferring a source person image to a target pose while preserving the texture of the source image. The second is clothing-guided person image generation, i.e., replacing the clothing texture of a source person image with a desired clothing texture. The core idea of the proposed model is to establish multi-scale correspondence, which effectively addresses the misalignment introduced by pose transfer and thereby preserves richer appearance information. Specifically, the proposed model consists of two stages: 1) it first generates a target semantic map corresponding to the target pose, which provides more accurate guidance during generation; 2) after the encoder produces multi-scale feature maps, multi-scale correspondence is established, which enables fine-grained generation. Experimental results show that the proposed method is superior to state-of-the-art methods in pose-guided person image generation and is effective in clothing-guided person image generation.
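
The summary does not spell out how the correspondence is computed, so the following is only a minimal sketch of one common way to realize the idea, not the authors' implementation. It assumes that correspondence at each scale is a softmax over cosine similarities between target-guidance features and source features, and that the source features are then warped by this soft matching. The function names `correspondence_warp` and `multi_scale_warp`, the `temperature` parameter, and the requirement that both inputs share the same shape per scale are all hypothetical choices made for illustration.

```python
import numpy as np


def correspondence_warp(target_feat, source_feat, temperature=0.01):
    """Warp source features onto the target layout at a single scale.

    Both inputs are arrays of shape (C, H, W). This is an illustrative
    assumption, not the formulation used in the paper.
    Returns (warped, corr) where warped has shape (C, H, W) and corr has
    shape (H*W, H*W), with each row summing to 1.
    """
    c, h, w = target_feat.shape
    t = target_feat.reshape(c, -1)  # (C, N) target positions
    s = source_feat.reshape(c, -1)  # (C, N) source positions

    # Channel-wise L2 normalization so the dot product is a cosine similarity.
    t_n = t / (np.linalg.norm(t, axis=0, keepdims=True) + 1e-8)
    s_n = s / (np.linalg.norm(s, axis=0, keepdims=True) + 1e-8)

    # Similarity of every target position to every source position.
    logits = (t_n.T @ s_n) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability

    # Row-wise softmax gives a soft correspondence matrix.
    corr = np.exp(logits)
    corr /= corr.sum(axis=1, keepdims=True)

    # Each target position receives a weighted average of source features.
    warped = (corr @ s.T).T.reshape(c, h, w)
    return warped, corr


def multi_scale_warp(target_pyramid, source_pyramid, temperature=0.01):
    """Apply the single-scale warp independently at every pyramid level."""
    return [
        correspondence_warp(t, s, temperature)[0]
        for t, s in zip(target_pyramid, source_pyramid)
    ]
```

In this sketch a small temperature makes the matching nearly one-to-one, while a larger value blends features from several source positions; a real model would typically learn the features and fuse the warped maps in a decoder.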

Publication: IEICE TRANSACTIONS on Information Vol.E106-D No.5 pp.804-812

Publication Date: 2023/05/01

Publicized: 2022/04/15

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2022DLP0058

Type of Manuscript: Special Section PAPER (Special Section on Deep Learning Technologies: Architecture, Optimization, Techniques, and Applications)

Category: Person Image Generation

Authors

Shi-Long SHEN
  Harbin Institute of Technology (Shenzhen)
Ai-Guo WU
  Harbin Institute of Technology (Shenzhen)
Yong XU
  Shenzhen Key Laboratory of Visual Object Detection and Recognition

Keywords

generative models, generative adversarial networks, person image generation

Cite this

Shi-Long SHEN, Ai-Guo WU, Yong XU, "Multi-Scale Correspondence Learning for Person Image Generation" in IEICE TRANSACTIONS on Information, vol. E106-D, no. 5, pp. 804-812, May 2023, doi: 10.1587/transinf.2022DLP0058.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022DLP0058/_p

@ARTICLE{e106-d_5_804,
author={Shi-Long SHEN and Ai-Guo WU and Yong XU},
journal={IEICE TRANSACTIONS on Information},
title={Multi-Scale Correspondence Learning for Person Image Generation},
year={2023},
volume={E106-D},
number={5},
pages={804--812},
abstract={A generative model is presented for two types of person image generation in this paper. First, this model is applied to pose-guided person image generation, i.e., converting the pose of a source person image to the target pose while preserving the texture of that source person image. Second, this model is also used for clothing-guided person image generation, i.e., changing the clothing texture of a source person image to the desired clothing texture. The core idea of the proposed model is to establish the multi-scale correspondence, which can effectively address the misalignment introduced by transferring pose, thereby preserving richer information on appearance. Specifically, the proposed model consists of two stages: 1) It first generates the target semantic map imposed on the target pose to provide more accurate guidance during the generation process. 2) After obtaining the multi-scale feature map by the encoder, the multi-scale correspondence is established, which is useful for a fine-grained generation. Experimental results show the proposed method is superior to state-of-the-art methods in pose-guided person image generation and show its effectiveness in clothing-guided person image generation.},
keywords={generative models, generative adversarial networks, person image generation},
doi={10.1587/transinf.2022DLP0058},
ISSN={1745-1361},
month={May},}

TY - JOUR
TI - Multi-Scale Correspondence Learning for Person Image Generation
T2 - IEICE TRANSACTIONS on Information
SP - 804
EP - 812
AU - Shi-Long SHEN
AU - Ai-Guo WU
AU - Yong XU
PY - 2023
DO - 10.1587/transinf.2022DLP0058
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - A generative model is presented for two types of person image generation in this paper. First, this model is applied to pose-guided person image generation, i.e., converting the pose of a source person image to the target pose while preserving the texture of that source person image. Second, this model is also used for clothing-guided person image generation, i.e., changing the clothing texture of a source person image to the desired clothing texture. The core idea of the proposed model is to establish the multi-scale correspondence, which can effectively address the misalignment introduced by transferring pose, thereby preserving richer information on appearance. Specifically, the proposed model consists of two stages: 1) It first generates the target semantic map imposed on the target pose to provide more accurate guidance during the generation process. 2) After obtaining the multi-scale feature map by the encoder, the multi-scale correspondence is established, which is useful for a fine-grained generation. Experimental results show the proposed method is superior to state-of-the-art methods in pose-guided person image generation and show its effectiveness in clothing-guided person image generation.
ER -
