You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments

Paster, Keiran; McIlraith, Sheila; Ba, Jimmy

arXiv:2205.15967 (cs)

[Submitted on 31 May 2022 (v1), last revised 28 Nov 2022 (this version, v2)]

Abstract: Recently, methods such as Decision Transformer that reduce reinforcement learning to a prediction task and solve it via supervised learning (RvS) have become popular due to their simplicity, robustness to hyperparameters, and strong overall performance on offline RL tasks. However, simply conditioning a probabilistic model on a desired return and taking the predicted action can fail dramatically in stochastic environments, since trajectories that result in a return may have only achieved that return due to luck. In this work, we describe the limitations of RvS approaches in stochastic environments and propose a solution. Rather than simply conditioning on the return of a single trajectory as is standard practice, our proposed method, ESPER, learns to cluster trajectories and conditions on average cluster returns, which are independent of environment stochasticity. Doing so allows ESPER to achieve strong alignment between target return and expected performance in real environments. We demonstrate this in several challenging stochastic offline-RL tasks, including the puzzle game 2048 and Connect Four played against a stochastic opponent. In all tested domains, ESPER achieves significantly better alignment between the target return and achieved return than simply conditioning on returns. ESPER also achieves higher maximum performance than even the value-based baselines.
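
The failure mode described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the one-step environment, its payoffs, and every function name are hypothetical, and grouping trajectories by action is only a stand-in for the clusters that ESPER learns. It compares conditioning on a single trajectory's return with conditioning on an average cluster return.

```python
from collections import defaultdict

# Hypothetical one-step environment (an assumption, not taken from the paper):
#   "safe"  -> return 1 every time
#   "risky" -> return 10 with probability 0.1, otherwise -5
# The offline dataset below matches those probabilities exactly.
dataset = (
    [("safe", 1.0)] * 100
    + [("risky", 10.0)] * 10
    + [("risky", -5.0)] * 90
)


def return_conditioned_action(data, target):
    """Naive return conditioning: most common action among trajectories
    whose observed return equals the target."""
    counts = defaultdict(int)
    for action, ret in data:
        if ret == target:
            counts[action] += 1
    return max(counts, key=counts.get)


def cluster_average_returns(data):
    """Group trajectories by action (a stand-in for learned clusters)
    and compute the average return of each group."""
    returns = defaultdict(list)
    for action, ret in data:
        returns[action].append(ret)
    return {action: sum(r) / len(r) for action, r in returns.items()}


def cluster_conditioned_action(data, target):
    """Condition on average cluster returns: pick the cluster whose
    average return is closest to the target."""
    averages = cluster_average_returns(data)
    return min(averages, key=lambda a: abs(averages[a] - target))


expected = cluster_average_returns(dataset)
naive = return_conditioned_action(dataset, 10.0)
clustered = cluster_conditioned_action(dataset, 10.0)

print(naive, expected[naive])          # risky -3.5
print(clustered, expected[clustered])  # safe 1.0
```

In this toy dataset, asking for a return of 10 selects the risky action, because only lucky risky trajectories ever reached 10, even though its expected return is -3.5. Conditioning on the per-cluster averages selects the safe action instead, whose expected return of 1 is the best actually attainable here.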

Comments:	Added experiments with Decision Transformers; Fixed error in Theorem 2.1; Updated related works; Added link for code
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2205.15967 [cs.LG]
	(or arXiv:2205.15967v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.15967

Submission history

From: Keiran Paster
[v1] Tue, 31 May 2022 17:15:44 UTC (870 KB)
[v2] Mon, 28 Nov 2022 01:36:49 UTC (1,134 KB)
