We propose UCB-SC, a policy for budgeted multi-armed bandits. The policy is a variant of the recently proposed KL-UCB-SC. Unlike KL-UCB-SC, which is computationally prohibitive, UCB-SC runs very fast while retaining KL-UCB-SC's asymptotic optimality when the reward and cost distributions are Bernoulli with means around 0.5. We verify this both theoretically and empirically.
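The paper defines the exact UCB-SC index; as a hedged illustration of the budgeted-bandit setting the abstract describes, the sketch below simulates a *generic* ratio-of-confidence-bounds policy (upper bound on mean reward divided by lower bound on mean cost) with Bernoulli rewards and costs. The index formula, function names, and parameters here are illustrative assumptions, not the paper's UCB-SC.

```python
import math
import random

def run_budgeted_ucb(reward_means, cost_means, budget, seed=0):
    """Simulate a budgeted multi-armed bandit with a generic ratio-based
    UCB index. NOTE: illustrative sketch, not the paper's UCB-SC index."""
    rng = random.Random(seed)
    k = len(reward_means)
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    total_reward = 0.0
    remaining = budget
    t = 0

    def draw(i):
        # Bernoulli reward and cost, as in the setting the abstract mentions.
        r = 1.0 if rng.random() < reward_means[i] else 0.0
        c = 1.0 if rng.random() < cost_means[i] else 0.0
        return r, c

    # Initialization: pull each arm once.
    for i in range(k):
        r, c = draw(i)
        pulls[i] += 1
        reward_sum[i] += r
        cost_sum[i] += c
        total_reward += r
        remaining -= c
        t += 1

    # Play until the budget is exhausted.
    while remaining > 0:
        t += 1
        best, best_index = 0, -float("inf")
        for i in range(k):
            bonus = math.sqrt(2.0 * math.log(t) / pulls[i])
            r_ucb = min(1.0, reward_sum[i] / pulls[i] + bonus)   # optimistic reward
            c_lcb = max(1e-3, cost_sum[i] / pulls[i] - bonus)    # optimistic (low) cost
            index = r_ucb / c_lcb
            if index > best_index:
                best, best_index = i, index
        r, c = draw(best)
        pulls[best] += 1
        reward_sum[best] += r
        cost_sum[best] += c
        total_reward += r
        remaining -= c

    return total_reward, pulls
```

With a clearly best arm (high reward, low cost), the policy concentrates its pulls on that arm as the budget grows, which is the behavior the asymptotic-optimality claim concerns.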
Ryo WATANABE
Hokkaido University
Junpei KOMIYAMA
the University of Tokyo
Atsuyoshi NAKAMURA
Hokkaido University
Mineichi KUDO
Hokkaido University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ryo WATANABE, Junpei KOMIYAMA, Atsuyoshi NAKAMURA, Mineichi KUDO, "UCB-SC: A Fast Variant of KL-UCB-SC for Budgeted Multi-Armed Bandit Problem" in IEICE TRANSACTIONS on Fundamentals,
vol. E101-A, no. 3, pp. 662-667, March 2018, doi: 10.1587/transfun.E101.A.662.
Abstract: We propose UCB-SC, a policy for budgeted multi-armed bandits. The policy is a variant of the recently proposed KL-UCB-SC. Unlike KL-UCB-SC, which is computationally prohibitive, UCB-SC runs very fast while retaining KL-UCB-SC's asymptotic optimality when the reward and cost distributions are Bernoulli with means around 0.5. We verify this both theoretically and empirically.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E101.A.662/_p
@ARTICLE{e101-a_3_662,
author={Ryo WATANABE and Junpei KOMIYAMA and Atsuyoshi NAKAMURA and Mineichi KUDO},
journal={IEICE TRANSACTIONS on Fundamentals},
title={UCB-SC: A Fast Variant of KL-UCB-SC for Budgeted Multi-Armed Bandit Problem},
year={2018},
volume={E101-A},
number={3},
pages={662-667},
abstract={We propose UCB-SC, a policy for budgeted multi-armed bandits. The policy is a variant of the recently proposed KL-UCB-SC. Unlike KL-UCB-SC, which is computationally prohibitive, UCB-SC runs very fast while retaining KL-UCB-SC's asymptotic optimality when the reward and cost distributions are Bernoulli with means around 0.5. We verify this both theoretically and empirically.},
keywords={},
doi={10.1587/transfun.E101.A.662},
ISSN={1745-1337},
month={March},}
TY  - JOUR
TI  - UCB-SC: A Fast Variant of KL-UCB-SC for Budgeted Multi-Armed Bandit Problem
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 662
EP  - 667
AU  - WATANABE, Ryo
AU  - KOMIYAMA, Junpei
AU  - NAKAMURA, Atsuyoshi
AU  - KUDO, Mineichi
PY  - 2018
DO  - 10.1587/transfun.E101.A.662
JO  - IEICE TRANSACTIONS on Fundamentals
SN  - 1745-1337
VL  - E101-A
IS  - 3
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - 2018/03//
AB  - We propose UCB-SC, a policy for budgeted multi-armed bandits. The policy is a variant of the recently proposed KL-UCB-SC. Unlike KL-UCB-SC, which is computationally prohibitive, UCB-SC runs very fast while retaining KL-UCB-SC's asymptotic optimality when the reward and cost distributions are Bernoulli with means around 0.5. We verify this both theoretically and empirically.
ER  -