Classical Bandit Algorithms to the Structured Bandit Setting

Summary

  • Proposed an algorithm that generalizes classical bandit algorithms such as UCB and Thompson sampling to the structured bandit setting
  • The approach uses the problem structure to identify and rule out sub-optimal arms, significantly reducing cumulative regret
  • Implemented the algorithms on the MovieLens dataset (University of Minnesota) to empirically verify the theoretical results
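
The idea of using structure to rule out sub-optimal arms can be illustrated with a small sketch. This is not the project's implementation; it is a minimal example under stated assumptions: the mean reward of every arm is a known function of one hidden parameter theta, theta is searched over a finite grid, rewards are Gaussian, and the names competitive_arms and structured_ucb as well as the constants alpha and sigma are hypothetical choices made for illustration. At each round the sketch keeps only the theta values consistent with the observed sample means, keeps only the arms that are optimal for at least one such theta, and then runs a standard UCB index over the arms that remain.

```python
import math
import random


def competitive_arms(theta_grid, mean_fns, counts, means, t, alpha=2.0, sigma=1.0):
    """Arms that are optimal for at least one theta still consistent with the data."""
    plausible = []
    for theta in theta_grid:
        consistent = True
        for k, mean_fn in enumerate(mean_fns):
            if counts[k] == 0:
                continue  # no data on this arm yet, so it cannot rule theta out
            # Assumed confidence radius for the sample mean of arm k.
            radius = math.sqrt(2 * alpha * sigma ** 2 * math.log(t) / counts[k])
            if abs(mean_fn(theta) - means[k]) > radius:
                consistent = False
                break
        if consistent:
            plausible.append(theta)
    if not plausible:
        # The grid missed every consistent theta: fall back to all arms.
        return set(range(len(mean_fns)))
    best = set()
    for theta in plausible:
        values = [mean_fn(theta) for mean_fn in mean_fns]
        best.add(max(range(len(values)), key=values.__getitem__))
    return best


def structured_ucb(mean_fns, theta_grid, true_theta, horizon, sigma=1.0, seed=0):
    """Run UCB restricted to competitive arms on simulated Gaussian rewards."""
    rng = random.Random(seed)
    num_arms = len(mean_fns)
    counts = [0] * num_arms
    means = [0.0] * num_arms
    for t in range(1, horizon + 1):
        if t <= num_arms:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            candidates = competitive_arms(
                theta_grid, mean_fns, counts, means, t, sigma=sigma
            )
            # Standard UCB index, evaluated only on the competitive arms.
            arm = max(
                candidates,
                key=lambda k: means[k]
                + math.sqrt(2 * sigma ** 2 * math.log(t) / counts[k]),
            )
        reward = rng.gauss(mean_fns[arm](true_theta), sigma)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts
```

A call such as structured_ucb([lambda th: th, lambda th: 1 - th, lambda th: 0.6], [i / 10 for i in range(11)], true_theta=0.9, horizon=500, sigma=0.1) returns the number of pulls per arm. Swapping the UCB index for a posterior sample would give a Thompson-sampling variant of the same filtering step.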

Related