Representative Action Selection for Large Action Space Meta-Bandits

Quan Zhou, Mark Kozdoba, Shie Mannor

Published: 2025/5/23

Abstract

We study the problem of selecting a subset of a large action space shared by a family of bandits, so that performance with the subset nearly matches performance with the full action space. We assume that similar actions tend to have related payoffs, and we model this relationship with a Gaussian process. To exploit this structure, we propose a simple epsilon-net algorithm that selects a representative subset. We provide theoretical guarantees for its performance and compare it empirically to Thompson Sampling and Upper Confidence Bound.
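
To illustrate the general idea of an epsilon-net, the following is a minimal sketch and not the authors' algorithm. It assumes that actions are points in Euclidean space and that a greedy construction is acceptable; the function name `greedy_epsilon_net` and the parameter `eps` are hypothetical and chosen only for this example. The result is a subset of actions such that every action lies within distance `eps` of some selected representative.

```python
import math


def greedy_epsilon_net(actions, eps):
    """Greedily pick indices so every action is within eps of a picked one.

    Assumption for this sketch: actions are equal-length tuples of floats
    and similarity is measured by Euclidean distance.
    """
    selected = []
    for i, action in enumerate(actions):
        # Keep this action only if no representative chosen so far covers it.
        if all(math.dist(action, actions[j]) > eps for j in selected):
            selected.append(i)
    return selected


# Five one-dimensional actions forming three well-separated clusters.
points = [(0.0,), (0.05,), (1.0,), (1.02,), (2.0,)]
print(greedy_epsilon_net(points, eps=0.1))  # prints [0, 2, 4]
```

Each skipped action is within `eps` of an earlier representative, so the selected indices cover the whole set, and any two representatives are more than `eps` apart. Under the abstract's assumption that nearby actions have related payoffs, a bandit restricted to such representatives loses little compared with the full action space; how little is what the paper's guarantees quantify.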

Read Full Paper (arXiv.org)