IEICE global.ieice.org Site

Author Search Result

[Author] Lei CAO(3hit)


Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach
Zhi-xiong XU Lei CAO Xi-liang CHEN Chen-xi LI Yong-liang ZHANG Jun LAI

PAPER-Artificial Intelligence, Data Mining

Publicized: 2018/05/22
Vol: E101-D No: 9
Page(s): 2315-2322
The commonly used Deep Q Networks algorithm is known to overestimate action values under certain conditions. It has also been shown that overestimation harms performance and may cause instability and divergence of learning. In this paper, we present the Deep Sarsa and Q Networks (DSQN) algorithm, which can be considered an enhancement to the Deep Q Networks algorithm. First, the DSQN algorithm takes advantage of the experience replay and target network techniques of Deep Q Networks to improve the stability of neural networks. Second, a double estimator is used for Q-learning to reduce overestimation. In particular, we introduce Sarsa learning into Deep Q Networks to further remove overestimation. Finally, the DSQN algorithm is evaluated on the cart-pole balancing, mountain car, and LunarLander control tasks from the OpenAI Gym. The empirical evaluation results show that the proposed method leads to reduced overestimation, a more stable learning process, and improved performance.
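To make the three ingredients named in this abstract concrete, the following is a minimal sketch of the bootstrap targets used by standard Q-learning, a double estimator, and Sarsa. It is not the DSQN algorithm itself: the abstract does not say how the paper combines these targets, and the function names, the sample values, and the discount factor are assumptions chosen for illustration.

```python
def q_learning_target(reward, q_next_target, gamma=0.99, done=False):
    """Standard Q-learning target: bootstrap from the max next-state value."""
    if done:
        return reward
    return reward + gamma * max(q_next_target)


def double_q_target(reward, q_next_online, q_next_target, gamma=0.99, done=False):
    """Double estimator: one set of values picks the action, the other scores it."""
    if done:
        return reward
    best = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return reward + gamma * q_next_target[best]


def sarsa_target(reward, q_next_target, next_action, gamma=0.99, done=False):
    """Sarsa (on-policy) target: bootstrap from the action actually taken next."""
    if done:
        return reward
    return reward + gamma * q_next_target[next_action]


# Hypothetical next-state value estimates for a three-action task.
q_next_online = [1.0, 2.0, 0.5]   # online network estimates
q_next_target = [1.5, 1.2, 3.0]   # target network estimates

t_q = q_learning_target(1.0, q_next_target, gamma=0.9)                 # 1 + 0.9 * 3.0
t_dq = double_q_target(1.0, q_next_online, q_next_target, gamma=0.9)   # 1 + 0.9 * 1.2
t_s = sarsa_target(1.0, q_next_target, next_action=0, gamma=0.9)       # 1 + 0.9 * 1.5
```

With these sample values the max-based target is the largest of the three, which illustrates why taking a maximum over noisy estimates tends to bias the target upward.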
Reward-Based Exploration: Adaptive Control for Deep Reinforcement Learning
Zhi-xiong XU Lei CAO Xi-liang CHEN Chen-xi LI

LETTER-Artificial Intelligence, Data Mining

Publicized: 2018/06/18
Vol: E101-D No: 9
Page(s): 2409-2412
To address the trade-off between exploration and exploitation in deep reinforcement learning, this paper proposes a “reward-based exploration strategy combined with Softmax action selection” (RBE-Softmax) as a dynamic exploration strategy to guide the agent's learning. The advantage of the proposed method is that characteristics of the agent's learning process are used to adapt exploration parameters online, so that the agent can select potentially optimal actions more effectively. The proposed method is evaluated on discrete and continuous control tasks in OpenAI Gym, and the empirical evaluation results show that the RBE-Softmax method leads to statistically significant improvement in the performance of deep reinforcement learning algorithms.
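As a rough illustration of Softmax action selection with an exploration parameter adapted online, consider the sketch below. The Softmax (Boltzmann) part is standard; the `adapt_temperature` rule is purely hypothetical and is not the update used by RBE-Softmax, which the abstract does not specify. All names and default values are assumptions.

```python
import math
import random


def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; a higher temperature is flatter."""
    top = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the Softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def adapt_temperature(temperature, last_return, prev_return,
                      rate=0.9, t_min=0.05, t_max=5.0):
    """Hypothetical rule: explore less after an improvement, more otherwise."""
    if last_return > prev_return:
        temperature *= rate
    else:
        temperature /= rate
    return min(t_max, max(t_min, temperature))


q_values = [1.0, 2.0, 3.0]
warm = softmax_probs(q_values, temperature=1.0)
cool = softmax_probs(q_values, temperature=0.1)
action = select_action(q_values, temperature=1.0)
```

Lowering the temperature concentrates probability on the highest-valued action, so a reward signal that drives the temperature down shifts the agent from exploration toward exploitation.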
A Study of Qualitative Knowledge-Based Exploration for Continuous Deep Reinforcement Learning
Chenxi LI Lei CAO Xiaoming LIU Xiliang CHEN Zhixiong XU Yongliang ZHANG

LETTER-Artificial Intelligence, Data Mining

Publicized: 2017/07/26
Vol: E100-D No: 11
Page(s): 2721-2724
As an important method for solving sequential decision-making problems, reinforcement learning learns the policy of a task through interaction with the environment. However, it has difficulty scaling to large problems. One reason is the exploration and exploitation dilemma, which may lead to inefficient learning. We present an approach that addresses this shortcoming by introducing qualitative knowledge into reinforcement learning, using cloud control systems to represent ‘if-then’ rules. We use it as a heuristic exploration strategy to guide action selection in deep reinforcement learning. Empirical evaluation results show that our approach can significantly improve the learning process.
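The idea of letting ‘if-then’ knowledge guide exploration in a continuous action space can be sketched as follows. This is a simplified stand-in, not the paper's method: a plain Python rule replaces the cloud control system, and the rule itself, the two-element state layout, the action range of [-1, 1], and the `guide_prob` parameter are all assumptions made for illustration.

```python
import random


def rule_suggestion(state):
    """Hypothetical if-then rule: push in the direction of the current velocity."""
    _position, velocity = state
    if velocity >= 0.0:
        return 1.0
    return -1.0


def explore_action(state, policy_action, guide_prob, noise_std=0.1, rng=random):
    """Pick the rule's suggestion with probability guide_prob, else the policy's
    action, then add Gaussian noise and clip to the assumed range [-1, 1]."""
    if rng.random() < guide_prob:
        base = rule_suggestion(state)
    else:
        base = policy_action
    noisy = base + rng.gauss(0.0, noise_std)
    return max(-1.0, min(1.0, noisy))


state = (-0.5, 0.02)  # hypothetical (position, velocity)
guided = explore_action(state, policy_action=-0.3, guide_prob=1.0, noise_std=0.0)
unguided = explore_action(state, policy_action=-0.3, guide_prob=0.0, noise_std=0.0)
```

In practice `guide_prob` would typically be decayed during training so that the learned policy gradually takes over from the hand-written rule.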

