Alex VALDIVIELSO CHIAN Toshiyuki MIYAMOTO
In this letter, we introduce a knowledge reuse method to improve the performance of a learning algorithm developed to prevent interference in multi-car elevators. This method enables the algorithm to use its previously acquired experience in new learning processes. The simulation results confirm the improvement achieved in the algorithm's performance.
Alex VALDIVIELSO CHIAN Toshiyuki MIYAMOTO
In this letter, we evaluate an option-based learning algorithm developed to allocate calls among the cars of a multi-car elevator system without conflicts. We assess its performance in terms of service time, flexibility in task allocation, and load balancing.
Alex VALDIVIELSO Toshiyuki MIYAMOTO
In automated transport applications, designing a task allocation policy becomes a complex problem when the system contains several agents and conflicts between them may arise and degrade the system's performance. In this situation, achieving a globally optimal result would require complete knowledge of the system's model, which is infeasible for real systems with huge state spaces and unknown state-transition probabilities. Reinforcement Learning (RL) methods have performed well at approximating optimal results in task processing without requiring prior knowledge of the system's model. However, to our knowledge, few RL methods address the task allocation problem in transportation systems, and even fewer are used directly to allocate tasks while considering the risk of conflicts between agents. In this paper, we propose an option-based RL algorithm with conditioned updating that lets agents learn a task allocation policy for completing tasks while preventing conflicts between them. We use a multi-car elevator (MCE) system as a test application. Simulation results show that, with our algorithm, elevator cars in the same shaft effectively learn to respond to service calls without interfering with each other under different passenger arrival rates and system configurations.