
This paper describes *Profit-Sharing*, a reinforcement learning approach that can be used to design a coordination strategy in a multi-agent system, and demonstrates its effectiveness empirically in the coil yard of a steel manufacturing plant. This domain consists of multiple cranes that operate asynchronously but must coordinate to adjust their initial task-execution plans and avoid the collisions that limited resources would otherwise cause. The problem is beyond classical expert hand-coding as well as mathematical analysis, because information is scattered, tasks are generated stochastically, and, moreover, tasks are difficult to complete on schedule. In recent years, many applications of reinforcement learning algorithms based on *Dynamic Programming (DP)*, such as Q-learning and the Temporal Difference method, have been introduced. These promise optimal performance of an agent in Markov decision processes (MDPs), but in non-MDPs, such as multi-agent domains, there is no guarantee that the agent's policy converges. *Profit-Sharing*, in contrast with the DP-based methods, can guarantee convergence to a rational policy, meaning that the agent reaches one of the desirable states, even in non-MDPs where agents learn concurrently and competitively. We therefore embedded *Profit-Sharing* into the operator of each crane to acquire cooperative rules in this dynamic domain, and we demonstrate its applicability to the real world by comparison with a RAP (Reactive Action Planner) model encoded from expert knowledge.
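The core mechanism the abstract refers to can be sketched as follows. In Profit-Sharing, an agent accumulates weights on state-action rules and, when a reward arrives at the end of an episode, distributes credit backward over the rules fired during that episode, typically with a geometrically decaying credit-assignment function. The sketch below is a minimal illustration under that assumption; the function and variable names are hypothetical and not taken from the paper, and the decay ratio is an arbitrary example value rather than the authors' choice.

```python
def profit_sharing_update(weights, episode, reward, decay=0.5):
    """Distribute a terminal reward over the (state, action) pairs of one
    episode, with credit geometrically decayed backward in time.

    weights : dict mapping (state, action) -> accumulated rule weight
    episode : list of (state, action) pairs in the order they were fired
    reward  : scalar reward received at the end of the episode
    decay   : geometric credit-assignment ratio (assumed 0 < decay < 1)
    """
    credit = reward
    # Walk the episode from the most recent rule back to the oldest,
    # shrinking the credit at each step.
    for state, action in reversed(episode):
        weights[(state, action)] = weights.get((state, action), 0.0) + credit
        credit *= decay
    return weights


# Toy usage: a three-step episode ending in reward 1.0.
w = profit_sharing_update(
    {}, [("s0", "a"), ("s1", "b"), ("s2", "a")], reward=1.0, decay=0.5
)
# The last rule fired receives the full reward; earlier rules receive
# geometrically smaller shares (1.0, 0.5, 0.25 going backward).
```

Because credit is assigned only from actually received rewards, rather than from bootstrapped value estimates as in DP-based methods, this style of update does not rely on the Markov property of the environment, which is the intuition behind its use in the non-MDP multi-crane setting.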

- Publication
- IEICE TRANSACTIONS on Communications Vol.E83-B No.5 pp.1039-1047

- Publication Date
- 2000/05/25

- Publicized

- Online ISSN

- DOI

- Type of Manuscript
- Special Section PAPER (IEICE/IEEE Joint Special Issue on Autonomous Decentralized Systems)

- Category
- Real Time Control

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.


Sachiyo ARAI, Kazuteru MIYAZAKI, Shigenobu KOBAYASHI, "Controlling Multiple Cranes Using Multi-Agent Reinforcement Learning: Emerging Coordination among Competitive Agents" in IEICE TRANSACTIONS on Communications,
vol. E83-B, no. 5, pp. 1039-1047, May 2000.


URL: https://global.ieice.org/en_transactions/communications/10.1587/e83-b_5_1039/_p


@ARTICLE{e83-b_5_1039,

author={Sachiyo ARAI and Kazuteru MIYAZAKI and Shigenobu KOBAYASHI},

journal={IEICE TRANSACTIONS on Communications},

title={Controlling Multiple Cranes Using Multi-Agent Reinforcement Learning: Emerging Coordination among Competitive Agents},

year={2000},

volume={E83-B},

number={5},

pages={1039-1047},


keywords={},

doi={},

ISSN={},

month={May},}


TY - JOUR

TI - Controlling Multiple Cranes Using Multi-Agent Reinforcement Learning: Emerging Coordination among Competitive Agents

T2 - IEICE TRANSACTIONS on Communications

SP - 1039

EP - 1047

AU - Sachiyo ARAI

AU - Kazuteru MIYAZAKI

AU - Shigenobu KOBAYASHI

PY - 2000

DO -

JO - IEICE TRANSACTIONS on Communications

SN -

VL - E83-B

IS - 5

JA - IEICE TRANSACTIONS on Communications

Y1 - May 2000


ER -