In a multi-agent system, it is important to design cooperative actions so that the agents achieve a common goal. In this paper, we propose two novel multi-agent reinforcement learning methods in which the control specification, representing a common goal, is described by linear temporal logic (LTL) formulas. First, we propose a simple method that directly extends the single-agent case; however, it suffers from technical issues caused by the increase in the number of agents. Next, to overcome these issues, we propose a new method that introduces an aggregator. Finally, the two methods are compared by numerical simulations on a surveillance problem.
Keita TERASHIMA
Hokkaido University
Koichi KOBAYASHI
Hokkaido University
Yuh YAMASHITA
Hokkaido University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Keita TERASHIMA, Koichi KOBAYASHI, Yuh YAMASHITA, "Reinforcement Learning for Multi-Agent Systems with Temporal Logic Specifications" in IEICE TRANSACTIONS on Fundamentals,
vol. E107-A, no. 1, pp. 31-37, January 2024, doi: 10.1587/transfun.2023KEP0016.
Abstract: In a multi-agent system, it is important to design cooperative actions so that the agents achieve a common goal. In this paper, we propose two novel multi-agent reinforcement learning methods in which the control specification, representing a common goal, is described by linear temporal logic (LTL) formulas. First, we propose a simple method that directly extends the single-agent case; however, it suffers from technical issues caused by the increase in the number of agents. Next, to overcome these issues, we propose a new method that introduces an aggregator. Finally, the two methods are compared by numerical simulations on a surveillance problem.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2023KEP0016/_p
@ARTICLE{e107-a_1_31,
author={Keita TERASHIMA and Koichi KOBAYASHI and Yuh YAMASHITA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Reinforcement Learning for Multi-Agent Systems with Temporal Logic Specifications},
year={2024},
volume={E107-A},
number={1},
pages={31-37},
abstract={In a multi-agent system, it is important to design cooperative actions so that the agents achieve a common goal. In this paper, we propose two novel multi-agent reinforcement learning methods in which the control specification, representing a common goal, is described by linear temporal logic (LTL) formulas. First, we propose a simple method that directly extends the single-agent case; however, it suffers from technical issues caused by the increase in the number of agents. Next, to overcome these issues, we propose a new method that introduces an aggregator. Finally, the two methods are compared by numerical simulations on a surveillance problem.},
doi={10.1587/transfun.2023KEP0016},
ISSN={1745-1337},
month={January},
}
TY  - JOUR
TI  - Reinforcement Learning for Multi-Agent Systems with Temporal Logic Specifications
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 31
EP  - 37
AU  - Keita TERASHIMA
AU  - Koichi KOBAYASHI
AU  - Yuh YAMASHITA
PY  - 2024
DO  - 10.1587/transfun.2023KEP0016
JO  - IEICE TRANSACTIONS on Fundamentals
SN  - 1745-1337
VL  - E107-A
IS  - 1
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - 2024/01
AB  - In a multi-agent system, it is important to design cooperative actions so that the agents achieve a common goal. In this paper, we propose two novel multi-agent reinforcement learning methods in which the control specification, representing a common goal, is described by linear temporal logic (LTL) formulas. First, we propose a simple method that directly extends the single-agent case; however, it suffers from technical issues caused by the increase in the number of agents. Next, to overcome these issues, we propose a new method that introduces an aggregator. Finally, the two methods are compared by numerical simulations on a surveillance problem.
ER  - 