Research areas

  • Reinforcement learning theory and algorithms
  • Temporal-difference learning and policy evaluation
  • Off-policy learning, emphatic TD, and convergence analysis
  • General value functions, predictive representations, and the Horde architecture
  • Task specification and unification of RL formalisms

Key papers

  • white-2017-unifying-task-specification-rl — single-author ICML 2017 paper introducing an RL task formalism with transition-based discounting (γ as a function of the transition rather than a scalar); unifies episodic and continuing tasks, options, and general value functions under one Bellman contraction analysis.

Recent work

(populated as additional papers are ingested)

Collaborators

  • Richard S. Sutton (frequent co-author on emphatic TD and off-policy TD analysis)
  • Ashique Rupam Mahmood (emphatic TD)
  • Adam White (general value functions, predictive knowledge)
  • Hado van Hasselt (acknowledged in White 2017 for discussions on transition-based and probabilistic discounts)

My notes

White’s body of work centers on rigorous convergence treatment of off-policy temporal-difference algorithms in increasingly general task settings. The 2017 ICML paper is representative: a clean, single-author contribution that generalizes one piece of formalism (the type signature of γ, from a scalar to a function of the transition) and propagates that change through the standard Bellman contraction proof. At the time of ingest she was at Indiana University; she has since moved to the University of Alberta and is a fellow at Amii.
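
The γ generalization can be sketched in a few lines. This is a minimal illustration, not the paper's notation beyond γ(s, a, s'): a tabular TD(0) update where the discount is a function of the transition, so that an episodic task falls out as the special case where γ is zero on termination transitions and a continuing task as the case where γ is constant. Function and variable names here are my own.

```python
def td0_update(v, s, a, r, s_next, gamma_fn, alpha=0.1):
    """One tabular TD(0) step with transition-dependent discounting.

    gamma_fn(s, a, s_next) replaces the usual scalar gamma, per the
    transition-based discounting formalism in White (ICML 2017).
    """
    g = gamma_fn(s, a, s_next)          # discount depends on the transition
    delta = r + g * v[s_next] - v[s]    # TD error with gamma(s, a, s')
    v[s] += alpha * delta
    return v

# Illustrative 2-state episodic chain: leaving state 1 ends the episode,
# so gamma is 0 on that transition; elsewhere a constant 0.9 is used.
def gamma_episodic(s, a, s_next):
    return 0.0 if s == 1 else 0.9

v = {0: 0.0, 1: 0.0}
v = td0_update(v, s=0, a=0, r=1.0, s_next=1, gamma_fn=gamma_episodic)
v = td0_update(v, s=1, a=0, r=1.0, s_next=0, gamma_fn=gamma_episodic)
```

With a constant gamma_fn this reduces to ordinary TD(0), which is the point of the unification: one update rule, one contraction argument, covering both task types.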