Semantic Path Planning for Indoor Navigation Tasks Using Multi-View Context and Prior Knowledge

Jianbing WU; Weibo HUANG; Guoliang HUA; Wanruo ZHANG; Risheng KANG; Hong LIU

doi:10.1587/transinf.2022DLP0033

IEICE TRANSACTIONS on Information


Summary :

Recently, deep reinforcement learning (DRL) methods have significantly improved the performance of target-driven indoor navigation tasks. However, previous approaches still do not fully exploit the rich semantic information of environments. In addition, existing methods tend to overfit to the training scenes or objects in target-driven navigation tasks, making it hard to generalize to unseen environments. Human beings can easily adapt to new scenes because they can recognize the objects they see and reason about the possible locations of target objects using their experience. Inspired by this, we propose a DRL-based target-driven navigation model, termed MVC-PK, using Multi-View Context information and Prior semantic Knowledge. It relies only on the semantic label of target objects and allows the robot to find the target without using any geometric map. To perceive the semantic contextual information in the environment, object detectors are leveraged to detect the objects present in the multi-view observations. To enable the semantic reasoning ability of indoor mobile robots, a Graph Convolutional Network is also employed to incorporate prior knowledge. The proposed MVC-PK model is evaluated in the AI2-THOR simulation environment. The results show that MVC-PK (1) significantly improves the cross-scene and cross-target generalization ability, and (2) achieves state-of-the-art performance with increases of 15.2% and 11.0% in Success Rate (SR) and Success weighted by Path Length (SPL), respectively.
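
The summary reports results in Success Rate (SR) and Success weighted by Path Length (SPL) but does not spell out how they are computed. The sketch below assumes the commonly used definitions: SR is the fraction of successful episodes, and SPL averages, over all episodes, the success indicator times the ratio of the shortest path length to the longer of the agent's path and the shortest path. Whether the paper uses exactly these definitions is an assumption, and the names `Episode`, `success_rate`, and `spl` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    """One navigation episode (hypothetical record, not from the paper)."""
    success: bool          # did the agent reach the target object?
    shortest_path: float   # length of the optimal start-to-target path
    agent_path: float      # length of the path the agent actually took


def success_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of episodes that ended in success."""
    if not episodes:
        raise ValueError("need at least one episode")
    return sum(1 for e in episodes if e.success) / len(episodes)


def spl(episodes: Sequence[Episode]) -> float:
    """Success weighted by Path Length, assuming the common definition:
    mean over episodes of S_i * l_i / max(p_i, l_i)."""
    if not episodes:
        raise ValueError("need at least one episode")
    total = 0.0
    for e in episodes:
        if not e.success:
            continue  # failed episodes contribute zero
        denom = max(e.agent_path, e.shortest_path)
        # A zero-length successful episode is treated as perfectly efficient.
        total += e.shortest_path / denom if denom > 0 else 1.0
    return total / len(episodes)


if __name__ == "__main__":
    demo = [
        Episode(True, 10.0, 20.0),   # success, path twice the optimum
        Episode(True, 5.0, 5.0),     # success, optimal path
        Episode(False, 8.0, 30.0),   # failure
        Episode(True, 4.0, 8.0),     # success, path twice the optimum
    ]
    print("SR :", success_rate(demo))
    print("SPL:", spl(demo))
```

Under these definitions SPL can never exceed SR, since each successful episode contributes at most 1 to the sum; an agent that succeeds often but wanders on the way scores a high SR and a noticeably lower SPL.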

Publication: IEICE TRANSACTIONS on Information Vol.E106-D No.5 pp.756-764

Publication Date: 2023/05/01

Publicized: 2022/01/20

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2022DLP0033

Type of Manuscript: Special Section PAPER (Special Section on Deep Learning Technologies: Architecture, Optimization, Techniques, and Applications)

Category: Positioning and Navigation

Authors

Jianbing WU
  Peking University
Weibo HUANG
  Peking University
Guoliang HUA
  Peking University
Wanruo ZHANG
  Peking University
Risheng KANG
  Department of Mechanical Engineering
Hong LIU
  Peking University

Keyword

intelligent robot, visual semantic navigation, deep reinforcement learning, graph convolutional network

Cite this


Jianbing WU, Weibo HUANG, Guoliang HUA, Wanruo ZHANG, Risheng KANG, Hong LIU, "Semantic Path Planning for Indoor Navigation Tasks Using Multi-View Context and Prior Knowledge" in IEICE TRANSACTIONS on Information, vol. E106-D, no. 5, pp. 756-764, May 2023, doi: 10.1587/transinf.2022DLP0033.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022DLP0033/_p


@ARTICLE{e106-d_5_756,
author={Jianbing WU and Weibo HUANG and Guoliang HUA and Wanruo ZHANG and Risheng KANG and Hong LIU},
journal={IEICE TRANSACTIONS on Information},
title={Semantic Path Planning for Indoor Navigation Tasks Using Multi-View Context and Prior Knowledge},
year={2023},
volume={E106-D},
number={5},
pages={756--764},
abstract={Recently, deep reinforcement learning (DRL) methods have significantly improved the performance of target-driven indoor navigation tasks. However, the rich semantic information of environments is still not fully exploited in previous approaches. In addition, existing methods usually tend to overfit on training scenes or objects in target-driven navigation tasks, making it hard to generalize to unseen environments. Human beings can easily adapt to new scenes as they can recognize the objects they see and reason the possible locations of target objects using their experience. Inspired by this, we propose a DRL-based target-driven navigation model, termed MVC-PK, using Multi-View Context information and Prior semantic Knowledge. It relies only on the semantic label of target objects and allows the robot to find the target without using any geometry map. To perceive the semantic contextual information in the environment, object detectors are leveraged to detect the objects present in the multi-view observations. To enable the semantic reasoning ability of indoor mobile robots, a Graph Convolutional Network is also employed to incorporate prior knowledge. The proposed MVC-PK model is evaluated in the AI2-THOR simulation environment. The results show that MVC-PK (1) significantly improves the cross-scene and cross-target generalization ability, and (2) achieves state-of-the-art performance with 15.2% and 11.0% increase in Success Rate (SR) and Success weighted by Path Length (SPL), respectively.},
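keywords={intelligent robot, visual semantic navigation, deep reinforcement learning, graph convolutional network},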
doi={10.1587/transinf.2022DLP0033},
ISSN={1745-1361},
month={May},}


TY  - JOUR
TI  - Semantic Path Planning for Indoor Navigation Tasks Using Multi-View Context and Prior Knowledge
T2  - IEICE TRANSACTIONS on Information
SP  - 756
EP  - 764
AU  - Jianbing WU
AU  - Weibo HUANG
AU  - Guoliang HUA
AU  - Wanruo ZHANG
AU  - Risheng KANG
AU  - Hong LIU
PY  - 2023
DO  - 10.1587/transinf.2022DLP0033
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E106-D
IS  - 5
JA  - IEICE TRANSACTIONS on Information
Y1  - May 2023
AB  - Recently, deep reinforcement learning (DRL) methods have significantly improved the performance of target-driven indoor navigation tasks. However, the rich semantic information of environments is still not fully exploited in previous approaches. In addition, existing methods usually tend to overfit on training scenes or objects in target-driven navigation tasks, making it hard to generalize to unseen environments. Human beings can easily adapt to new scenes as they can recognize the objects they see and reason the possible locations of target objects using their experience. Inspired by this, we propose a DRL-based target-driven navigation model, termed MVC-PK, using Multi-View Context information and Prior semantic Knowledge. It relies only on the semantic label of target objects and allows the robot to find the target without using any geometry map. To perceive the semantic contextual information in the environment, object detectors are leveraged to detect the objects present in the multi-view observations. To enable the semantic reasoning ability of indoor mobile robots, a Graph Convolutional Network is also employed to incorporate prior knowledge. The proposed MVC-PK model is evaluated in the AI2-THOR simulation environment. The results show that MVC-PK (1) significantly improves the cross-scene and cross-target generalization ability, and (2) achieves state-of-the-art performance with 15.2% and 11.0% increase in Success Rate (SR) and Success weighted by Path Length (SPL), respectively.
ER  - 
