Reinforcement Learning

Human & Machine Intelligence

Human-Level Performance in No-Press Diplomacy via Equilibrium Search

May 3, 2021

Abstract

Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings. In contrast, Diplomacy is a game of shifting alliances that involves both cooperation and competition. For this reason, Diplomacy has proven to be a formidable research challenge. In this paper we describe an agent for the no-press variant of Diplomacy that combines supervised learning on human data with one-step lookahead search via regret minimization. Regret minimization techniques have been behind previous AI successes in adversarial games, most notably poker, but have not previously been shown to be successful in large-scale games involving cooperation. We show that our agent greatly exceeds the performance of past no-press Diplomacy bots, is unexploitable by expert humans, and ranks in the top 2% of human players when playing anonymous games on a popular Diplomacy website.
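As a rough illustration of the regret-minimization idea mentioned in the abstract, the sketch below runs plain regret matching on a toy two-player matrix game. It is not the paper's agent: the payoff matrices, the function names `strategy_from_regrets` and `regret_matching`, and the iteration count are all assumptions made for this example. The actual system applies regret minimization inside a one-step lookahead search over Diplomacy actions, which is not reproduced here.

```python
def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret yet, so fall back to uniform play.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def regret_matching(payoffs_a, payoffs_b, iterations=20000):
    """Return the average strategies of both players after `iterations` rounds.

    payoffs_a[i][j] and payoffs_b[i][j] are the (hypothetical) payoffs to
    players A and B when A plays action i and B plays action j.
    """
    n_a, n_b = len(payoffs_a), len(payoffs_a[0])
    regrets_a, regrets_b = [0.0] * n_a, [0.0] * n_b
    sum_a, sum_b = [0.0] * n_a, [0.0] * n_b

    for _ in range(iterations):
        strat_a = strategy_from_regrets(regrets_a)
        strat_b = strategy_from_regrets(regrets_b)

        # Expected payoff of each pure action against the opponent's current mix.
        values_a = [sum(payoffs_a[i][j] * strat_b[j] for j in range(n_b))
                    for i in range(n_a)]
        values_b = [sum(payoffs_b[i][j] * strat_a[i] for i in range(n_a))
                    for j in range(n_b)]

        # Expected payoff of the mixed strategy each player actually used.
        expected_a = sum(s * v for s, v in zip(strat_a, values_a))
        expected_b = sum(s * v for s, v in zip(strat_b, values_b))

        # Regret of an action = what it would have earned minus what was earned.
        for i in range(n_a):
            regrets_a[i] += values_a[i] - expected_a
            sum_a[i] += strat_a[i]
        for j in range(n_b):
            regrets_b[j] += values_b[j] - expected_b
            sum_b[j] += strat_b[j]

    return [s / iterations for s in sum_a], [s / iterations for s in sum_b]


if __name__ == "__main__":
    # Toy zero-sum game chosen for illustration; B's payoffs are the negation of A's.
    toy_a = [[2.0, -1.0], [-1.0, 1.0]]
    toy_b = [[-x for x in row] for row in toy_a]
    print(regret_matching(toy_a, toy_b))
```

In a two-player zero-sum game such as this toy example, the average strategies approach a Nash equilibrium as the number of iterations grows. In general-sum games, regret matching carries a weaker guarantee (convergence of the empirical play to a coarse correlated equilibrium), which is part of why the abstract describes success in a game involving cooperation as a new result.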

AUTHORS

Written by

Jonathan Gray

Adam Lerer

Anton Bakhtin

Noam Brown

Publisher

ICLR 2021

Related Publications

December 05, 2020

Robotics

Reinforcement Learning

Neural Dynamic Policies for End-to-End Sensorimotor Learning

Deepak Pathak, Abhinav Gupta, Mustafa Mukadam, Shikhar Bahl

December 07, 2020

Reinforcement Learning

Joint Policy Search for Collaborative Multi-agent Imperfect Information Games

Yuandong Tian, Qucheng Gong, Tina Jiang

March 13, 2021

Reinforcement Learning

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, Andre Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

October 10, 2020

Computer Vision

Reinforcement Learning

Active MR k-space Sampling with Reinforcement Learning

Luis Pineda, Sumana Basu, Adriana Romero, Roberto Calandra, Michal Drozdzal

December 05, 2020

Reinforcement Learning

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric
