Keyword Search Result

[Keyword] Sarsa(2hit)

1-2hit
  • Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach

    Zhi-xiong XU  Lei CAO  Xi-liang CHEN  Chen-xi LI  Yong-liang ZHANG  Jun LAI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2018/05/22
      Vol:
    E101-D No:9
      Page(s):
    2315-2322

    The commonly used Deep Q Networks algorithm is known to overestimate action values under certain conditions. Such overestimation has also been shown to harm performance and can cause instability and divergence in learning. In this paper, we present the Deep Sarsa and Q Networks (DSQN) algorithm, which can be considered an enhancement of the Deep Q Networks algorithm. First, the DSQN algorithm takes advantage of the experience replay and target network techniques of Deep Q Networks to improve the stability of the neural networks. Second, a double estimator is used for Q-learning to reduce overestimation. In particular, we introduce Sarsa learning into Deep Q Networks to reduce overestimation further. Finally, the DSQN algorithm is evaluated on the cart-pole balancing, mountain car, and LunarLander control tasks from the OpenAI Gym. The empirical results show that the proposed method leads to reduced overestimation, a more stable learning process, and improved performance.
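    The abstract above contrasts three ways of forming a bootstrap target: the max-based Q-learning target, a double-estimator target, and a Sarsa target. The sketch below illustrates only the textbook form of each target on plain lists of action values; it is not the DSQN algorithm itself, whose way of combining them is not given in the abstract. The function names, the argument layout, and the default `gamma` are assumptions made for illustration.

```python
def q_learning_target(reward, next_q, gamma=0.99, done=False):
    """Q-learning target: bootstrap from the largest next action value."""
    if done:
        return reward
    return reward + gamma * max(next_q)


def double_q_target(reward, next_q_select, next_q_eval, gamma=0.99, done=False):
    """Double-estimator target: one estimate selects the action,
    a second estimate evaluates it."""
    if done:
        return reward
    best = max(range(len(next_q_select)), key=lambda a: next_q_select[a])
    return reward + gamma * next_q_eval[best]


def sarsa_target(reward, next_q, next_action, gamma=0.99, done=False):
    """Sarsa target: bootstrap from the action actually taken next."""
    if done:
        return reward
    return reward + gamma * next_q[next_action]


# Hypothetical action values for the next state (three actions).
select_values = [1.0, 2.0, 0.5]   # estimate used to pick the action
eval_values = [1.5, 1.0, 3.0]     # estimate used to score the action

max_target = q_learning_target(1.0, eval_values, gamma=0.5)
double_target = double_q_target(1.0, select_values, eval_values, gamma=0.5)
on_policy_target = sarsa_target(1.0, eval_values, next_action=0, gamma=0.5)
```

    With these made-up numbers the max-based target is the largest of the three, because it always bootstraps from the highest estimate, while the other two bootstrap from an action chosen by a different rule.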

  • Sarsa Learning Based Route Guidance System with Global and Local Parameter Strategy

    Feng WEN  Xingqiao WANG  

     
    PAPER-Intelligent Transport System

      Vol:
    E98-A No:12
      Page(s):
    2686-2693

    A route guidance system is one of the essential components of a vehicle navigation system in intelligent transport systems (ITS). In this paper, a centrally determined route guidance system is established to solve congestion problems. The Sarsa learning method is used to guide vehicles, and a global and local parameter strategy is proposed to adjust the vehicle guidance by considering the whole traffic system and the local traffic environment, respectively. The proposed method can reduce the average driving time and relieve traffic congestion. It was evaluated in two cases on different road networks. The experimental results show the efficiency and effectiveness of the proposed algorithm.
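    As a rough illustration of how Sarsa learning could drive route choice, the sketch below applies the standard tabular Sarsa update to a routing setting. The modelling choices are assumptions and are not taken from the paper: states are intersections, actions are the next intersection to head for, and the reward is the negative travel time of the chosen link. The paper's global and local parameter strategy is not modelled here.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.5, gamma=1.0, done=False):
    """Standard Sarsa update:
    Q(s, a) += alpha * (r + gamma * Q(s', a') - Q(s, a))."""
    target = r if done else r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])
    return q[(s, a)]


# Hypothetical example: a vehicle drives A -> B (4 minutes) and then
# intends to continue B -> C. Unseen state-action pairs default to 0.0.
q = defaultdict(float)
first = sarsa_update(q, "A", "B", -4.0, "B", "C")

# Suppose the link B -> C has since been valued at -2.0; repeating the
# same transition pulls Q(A, B) further toward the longer total time.
q[("B", "C")] = -2.0
second = sarsa_update(q, "A", "B", -4.0, "B", "C")

# With epsilon = 0 the choice is purely greedy over the learned values.
q[("A", "C")] = -1.0
choice = epsilon_greedy(q, "A", ["B", "C"], epsilon=0.0, rng=random.Random(0))
```

    Because the update uses the action the vehicle will actually take next, the learned values reflect the guidance policy being followed, exploration included.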