Linear Bandits on Uniformly Convex Sets.
TITLE: Linear Bandits on Uniformly Convex Sets.
AUTHORS: T. Kerdreux, C. Roux, A. d'Aspremont, S. Pokutta
ABSTRACT: Linear bandit algorithms yield two
types of structural assumptions lead to better pseudo-regret bounds. When
K is the simplex or an l_p ball with p in ]1,2], there exist
bandits algorithms with O(sqrt{nT}) pseudo-regret bounds.
Here, we derive bandit algorithms for some strongly convex sets beyond l_p
balls that enjoy pseudo-regret bounds of O(sqrt{nT}), which answers an open question from (BCB12, S 5.5). Interestingly, when the
action set is uniformly convex but not necessarily strongly convex, we obtain pseudo-regret bounds with a dimension dependency smaller than
O(sqrt{n}). However, this comes at the expense of asymptotic rates in T varying between O(sqrt{T}) and O(T).
ArXiv PREPRINT: 2103.05907
|