Prediction and Generative Modeling in Reinforcement Learning

July 15, 2018

Abstract

Model-based methods in machine learning aim to speed up the learning process by exploiting an explicit representation of the underlying model. In Reinforcement Learning (RL), classic model-based approaches leverage the available samples to construct an estimate of the underlying environment. An explicit model representation has several advantages: I) learning behaviors is usually faster and more sample-efficient; II) prior knowledge and experience can be integrated more easily; III) the model can be flexibly reused for a wide variety of goals and objectives. Model-based approaches also enable counterfactual reasoning ("what would have happened if ..."), which is exceedingly difficult without a model (e.g., with value-based approaches), and they ease transfer learning when the reward (and, to some extent, the dynamics) changes. More generally, the ability to build an internal representation of the environment can be viewed as a hallmark of intelligence; indeed, prediction and intuitive physics often figure in neuroscience, psychology, and cognitive science research into the development of internal representations in the human brain. Finally, model-based methods are of substantial theoretical and practical importance: they have been shown to learn faster in large continuous environments, to provide insights into the way humans behave, and to lie at the core of theoretically efficient/optimal methods for exploration-exploitation in discrete domains.

However, model-based algorithms are not without their challenges: constructing accurate models in complex real-world environments can be difficult, and imperfect models can give rise to highly suboptimal behavior. Although recent years have seen substantial advances in generative modeling, prediction, image generation, and other forecasting applications, many of these advances have yet to produce a large impact on reinforcement learning and control.

The aim of this workshop is to investigate questions in model-based reinforcement learning, as well as how tools and ideas from other generative modeling and prediction fields can influence the development of novel decision-making and control algorithms. For example, can generative adversarial networks provide an answer to the question of which loss function should be used to fit a model? How can model-free reinforcement learning ideas influence model-based learning while benefiting from its improved efficiency and flexibility? Can we design hybrid approaches that integrate model-free and model-based learning? How can the best innovations in prediction and time-series modeling translate into improved reinforcement learning algorithms?

Alongside these themes, we encourage submissions on any topic related to core model-based reinforcement learning. Some of the open questions are: How can we exploit side information? Is it possible to design algorithms for optimal exploration-exploitation in large domains? How can we incorporate safety in model-based approaches? What are the current limits of model-based approaches, and what can we expect in the future? Which classes of environments are we able to represent (e.g., MDPs, POMDPs, and PSRs), and which models (e.g., NNs, RNNs) are suited to them? Can we design (even theoretically) efficient approaches for particular classes of problems (e.g., linearly-solvable MDPs or the linear-quadratic regulator)?
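To make the classic model-based recipe concrete, the following is a minimal sketch, not taken from any particular paper, of tabular certainty-equivalence RL: maximum-likelihood estimates of the transition and reward model are built from observed transitions, and a policy is obtained by value iteration in the estimated MDP. All names, the state/action sizes, and the discount factor are illustrative assumptions.

```python
# Illustrative sketch of tabular certainty-equivalence model-based RL:
# estimate the model from samples, then plan in the estimated model.
# Sizes and discount factor below are arbitrary placeholders.
import numpy as np

n_states, n_actions, gamma = 5, 2, 0.95

# Sufficient statistics accumulated from experience tuples (s, a, r, s').
counts = np.zeros((n_states, n_actions, n_states))
reward_sum = np.zeros((n_states, n_actions))

def update_model(s, a, r, s_next):
    """Record one observed transition."""
    counts[s, a, s_next] += 1
    reward_sum[s, a] += r

def estimated_model():
    """Maximum-likelihood estimates P_hat(s' | s, a) and R_hat(s, a)."""
    n_sa = counts.sum(axis=2, keepdims=True)
    # Unvisited (s, a) pairs default to a uniform transition distribution.
    p_hat = np.where(n_sa > 0, counts / np.maximum(n_sa, 1), 1.0 / n_states)
    r_hat = reward_sum / np.maximum(n_sa[..., 0], 1)
    return p_hat, r_hat

def plan(p_hat, r_hat, n_iters=200):
    """Value iteration in the estimated MDP; returns a greedy policy."""
    v = np.zeros(n_states)
    for _ in range(n_iters):
        q = r_hat + gamma * p_hat @ v  # Q(s, a) under the learned model
        v = q.max(axis=1)
    return q.argmax(axis=1)            # greedy policy w.r.t. the model
```

Note how the sketch reflects advantage III above: once the model is estimated, `plan` can be rerun with a different reward function at no extra sampling cost, which is what makes the model flexibly reusable across goals.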
We will invite speakers and solicit contributed papers and posters spanning the entire range, from core model-based reinforcement learning to generative models research, as well as speakers with a focus on human learning. We expect that the presence of these different communities will result in a fruitful exchange of ideas and stimulate an open discussion about the current challenges in model-based learning, as well as possible solutions. In terms of prospective participants, our main targets are machine learning researchers interested in understanding and improving current algorithms. Specific target communities within machine learning include, but are not limited to, optimization, deep learning, and reinforcement learning. Our invited speakers also include researchers who study human learning, to provide a broader perspective to the attendees.


Written by

Alessandro Lazaric

Martin Riedmiller

Matteo Pirotta

Roberto Calandra

Sergey Levine

Publisher

ICML
