Research areas

  • Reinforcement learning theory and algorithms
  • Temporal-difference learning and policy evaluation
  • Off-policy learning, emphatic TD, and convergence analysis
  • General value functions, predictive representations, and the Horde architecture
  • Task specification and unification of RL formalisms

Key papers

  • white-2017-unifying-task-specification-rl — single-author ICML 2017 paper introducing an RL task formalism with transition-based discounting (γ as a function of the transition rather than a scalar); unifies episodic and continuing tasks, options, and general value functions under one Bellman contraction analysis.

Recent work

(populated as additional papers are ingested)

Collaborators

  • Richard S. Sutton (frequent co-author on emphatic TD and off-policy TD analysis)
  • Ashique Rupam Mahmood (emphatic TD)
  • Adam White (general value functions, predictive knowledge)
  • Hado van Hasselt (acknowledged in White 2017 for discussions on transition-based and probabilistic discounts)

My notes

White’s body of work centers on rigorous convergence treatment of off-policy temporal-difference algorithms in increasingly general task settings. The 2017 ICML paper is representative: a clean, single-author contribution that generalizes one piece of formalism (the type signature of γ, from a scalar to a function of the transition) and propagates that change through the standard Bellman contraction proof. At the time of ingest she was at Indiana University; she has since moved to the University of Alberta and is a fellow at Amii.
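
The γ generalization can be sketched in a few lines. This is a minimal illustration, not the paper's notation beyond γ(s, a, s'): a tabular TD(0) update where the discount is a function of the transition, so that an episodic task falls out as the special case where γ is zero on termination transitions and a continuing task as the case where γ is constant. Function and variable names here are my own.

```python
def td0_update(v, s, a, r, s_next, gamma_fn, alpha=0.1):
    """One tabular TD(0) step with transition-dependent discounting.

    gamma_fn(s, a, s_next) replaces the usual scalar gamma, per the
    transition-based discounting formalism in White (ICML 2017).
    """
    g = gamma_fn(s, a, s_next)          # discount depends on the transition
    delta = r + g * v[s_next] - v[s]    # TD error with gamma(s, a, s')
    v[s] += alpha * delta
    return v

# Illustrative 2-state episodic chain: leaving state 1 ends the episode,
# so gamma is 0 on that transition; elsewhere a constant 0.9 is used.
def gamma_episodic(s, a, s_next):
    return 0.0 if s == 1 else 0.9

v = {0: 0.0, 1: 0.0}
v = td0_update(v, s=0, a=0, r=1.0, s_next=1, gamma_fn=gamma_episodic)
v = td0_update(v, s=1, a=0, r=1.0, s_next=0, gamma_fn=gamma_episodic)
```

With a constant gamma_fn this reduces to ordinary TD(0), which is the point of the unification: one update rule, one contraction argument, covering both task types.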