Sanket Shah
Restless Multi-Armed Bandits
Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Care Domain
We propose a way to differentiate through MDP planning for Restless Multi-Armed Bandits. Using Decision-Focused Learning, we apply this approach to better learn the transition matrices from the features associated with each arm.
Q-Learning Lagrange Policies for Multi-Action Restless Bandits
We propose two online model-free algorithms to learn the Whittle Index associated with *multi-action* Restless Multi-Armed Bandits.