Research areas
- Reinforcement learning theory and algorithms
- Temporal-difference learning and policy evaluation
- Off-policy learning, emphatic TD, and convergence analysis
- General value functions, predictive representations, and the Horde architecture
- Task specification and unification of RL formalisms
Key papers
- white-2017-unifying-task-specification-rl — single-author ICML 2017 paper introducing an RL task formalism built on transition-based discounting; it unifies episodic/continuing tasks, options, and general value functions under a single Bellman contraction analysis.
Recent work
(populated as additional papers are ingested)
Collaborators
- Richard S. Sutton (frequent co-author on emphatic TD and off-policy TD analysis)
- Ashique Rupam Mahmood (emphatic TD)
- Adam White (general value functions, predictive knowledge)
- Hado van Hasselt (acknowledged in White 2017 for discussions on transition-based and probabilistic discounts)
My notes
White’s body of work centers on rigorous convergence analysis for off-policy temporal-difference algorithms in increasingly general task settings. The 2017 ICML paper is representative: a clean, single-author contribution that simplifies one piece of formalism (the type signature of γ, which becomes a function of the transition rather than a global constant) and propagates that simplification through the standard Bellman contraction proof. At the time of ingest she was at Indiana University; she has since moved to the University of Alberta and is a fellow at Amii.
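To make the formalism concrete for myself: the sketch below shows TD(0) policy evaluation where γ is transition-based, so episode termination is expressed as γ(s, a, s') = 0 and the same update rule serves episodic and continuing tasks. The chain MDP, its rewards, and all numeric constants are hypothetical values of my own, not taken from the paper.

```python
# TD(0) policy evaluation with a transition-based discount: gamma is a
# function of (s, a, s') rather than a single scalar, so "the episode
# ended" is just a transition whose discount is zero.

N_STATES = 5   # chain states 0..4; entering state 4 ends an episode
ALPHA = 0.1    # TD step size (assumed value)

def gamma(s, a, s_next):
    """Transition-based discount: 0 on the terminating transition,
    a constant 0.9 everywhere else (assumed values)."""
    return 0.0 if s_next == N_STATES - 1 else 0.9

def step(s):
    """Deterministic policy on the chain: move right; reward 1 arrives
    on the final transition only."""
    s_next = s + 1
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return reward, s_next

def td0(episodes=2000):
    v = [0.0] * N_STATES
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            r, s_next = step(s)
            g = gamma(s, None, s_next)
            # One Bellman-style update covers both the "episodic" and
            # "continuing" cases, since termination is just g == 0.
            v[s] += ALPHA * (r + g * v[s_next] - v[s])
            s = s_next
    return v
```

On this chain the values converge toward 1.0, 0.9, 0.81, … walking back from the terminal transition, which is what the unified Bellman operator predicts with a constant 0.9 discount away from termination.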