IEICE TRANSACTIONS on Information

Reward-Based Exploration: Adaptive Control for Deep Reinforcement Learning

Zhi-xiong XU, Lei CAO, Xi-liang CHEN, Chen-xi LI

Summary:

To address the trade-off between exploration and exploitation in deep reinforcement learning, this paper proposes a “reward-based exploration strategy combined with Softmax action selection” (RBE-Softmax), a dynamic exploration strategy that guides the agent's learning. The advantage of the proposed method is that it uses characteristics of the agent's learning process to adapt the exploration parameters online, so that the agent can select potentially optimal actions more effectively. The method is evaluated on discrete and continuous control tasks in OpenAI Gym, and the empirical results show that RBE-Softmax yields statistically significant improvements in the performance of deep reinforcement learning algorithms.
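
As a rough illustration of the general idea in the summary, the sketch below pairs Softmax (Boltzmann) action selection with a temperature that is adjusted online from reward feedback. The summary does not give the paper's update rule, so `adapt_temperature`, its `rate`, `t_min`, and `t_max` parameters, and the comparison against a baseline reward are all assumptions made for illustration, not the authors' RBE-Softmax method.

```python
import math
import random


def softmax_probs(q_values, temperature):
    """Return the Boltzmann (Softmax) distribution over actions."""
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def adapt_temperature(recent_reward, baseline_reward, temperature,
                      rate=0.1, t_min=0.05, t_max=5.0):
    """Hypothetical rule (an assumption, not taken from the paper):
    lower the temperature (explore less) when the recent reward beats the
    baseline, raise it (explore more) otherwise, then clamp to [t_min, t_max].
    """
    if recent_reward > baseline_reward:
        temperature *= 1.0 - rate
    else:
        temperature *= 1.0 + rate
    return min(t_max, max(t_min, temperature))


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the Softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

In a training loop, one might call `adapt_temperature` once per episode with the episode return and a running average of past returns, then pass the resulting temperature to `select_action` at each step. A lower temperature concentrates probability on the action with the highest estimated value, while a higher one spreads it more evenly across actions.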

Publication
IEICE TRANSACTIONS on Information Vol.E101-D No.9 pp.2409-2412
Publication Date
2018/09/01
Publicized
2018/06/18
Online ISSN
1745-1361
DOI
10.1587/transinf.2018EDL8011
Type of Manuscript
LETTER
Category
Artificial Intelligence, Data Mining

Authors

Zhi-xiong XU
  Army Engineering University
Lei CAO
  Army Engineering University
Xi-liang CHEN
  Army Engineering University
Chen-xi LI
  Army Engineering University

Keyword