October 27, 2021
We hypothesize that effective policies can be learned from data without dynamic programming bootstrapping. To investigate this, we replace traditional reinforcement learning (RL) algorithms -- which typically bootstrap against a learned value function -- with a simple sequence modeling objective. We train a transformer on sequences of returns, states, and actions with the autoregressive prediction loss widely used in language modeling, reducing policy sampling to sequence generation. Training with a supervised loss removes the need for dynamic programming bootstrapping, which is known to be unstable with function approximation, and lets us leverage the simplicity, scalability, and long-range memory capabilities of transformers. In experiments spanning a diverse set of offline RL benchmarks, including Atari, OpenAI Gym, and Key-to-Door, our Decision Transformer learns to generate diverse behaviors by conditioning on desired returns. In particular, when conditioned on high desired returns, the Decision Transformer produces a policy that is competitive with or better than state-of-the-art model-free offline RL algorithms.
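As a minimal sketch of the sequence-modeling setup described above, the code below computes returns-to-go for a trajectory and interleaves them with states and actions into a single token sequence that a transformer could model autoregressively. The function names and the tagged-tuple encoding are illustrative assumptions, not the authors' implementation.

```python
def returns_to_go(rewards):
    """Suffix sums of rewards: R_t = sum of rewards from step t to the end."""
    rtg = []
    total = 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return list(reversed(rtg))

def interleave_trajectory(states, actions, rewards):
    """Build the token sequence (R_1, s_1, a_1, R_2, s_2, a_2, ...).

    Conditioning on a high initial return-to-go at generation time is what
    steers the model toward high-reward behavior.
    """
    rtg = returns_to_go(rewards)
    seq = []
    for R, s, a in zip(rtg, states, actions):
        seq.extend([("rtg", R), ("state", s), ("action", a)])
    return seq
```

At generation time one would replace the logged returns-to-go with a desired target return and sample actions from the trained model, decrementing the target by each observed reward.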
Written by
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
Aravind Srinivas
Igor Mordatch
Publisher
NeurIPS