I am interested in reinforcement learning, language agents & reasoning, sampling techniques, and contextual bandits.