CraftAssist, a platform for collaborative AI bots in Minecraft

July 18, 2019

What the research is:

A platform for implementing AI assistants that can collaborate with human players in the sandbox construction game Minecraft. The assistants can move, place or destroy blocks, and spawn mobs, and they can communicate with human players through text-based chat. By combining language, perception, memory, and physical actions in the game, these bots can carry out complex tasks, such as building a house. Developing these skills could help move researchers closer to broader human-AI collaboration. The platform is intended to support the study of agents that are fun to interact with and useful for a wide variety of tasks specified and evaluated by human participants. To encourage the wider AI research community to use the CraftAssist platform for their own experiments, we are open-sourcing the framework, as well as a baseline assistant and the tools and data we used to build it. The release includes step-by-step data on human players building houses in Minecraft, semantic segmentation data for those houses, and a large-scale natural language semantic parsing dataset.

How it works:

Players can interact with these assistants through a standard Minecraft client, with multiple human players and bots potentially engaging each other in the same game. CraftAssist uses the open source, extensible Minecraft-compatible game server Cuberite, allowing researchers to record game sessions (including all in-game language and actions) to generate their own training data. The baseline assistant we’ve included with CraftAssist was trained on a dataset of human interactions with previous versions of the assistant, as well as crowdsourced examples of human participants building more than 2,500 different houses within Minecraft. Both of these training sets are available in this open source release.

The design of our baseline assistant is modular, allowing researchers to use the complete bot for their experiments or to focus on individual components related to memory, perception, and language understanding. These modules work in concert to carry out tasks. For example, if a human asks the bot to “build a house next to the blue cube,” the language understanding module, a neural semantic parser, uses that chat input to produce a program over high-level action primitives. That program captures the high-level actions requested, such as a “move” to the specified destination followed by a “build.” The memory module then queries the bot’s memory for stored objects that match the request, here those tagged “blue” and “cube,” and creates a move task targeting the coordinates of the blue cube. This puts the bot in position to start building a house using a sequence of “move” and “build” tasks, based on the house-building examples from our dataset.
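To make that flow concrete, here is a minimal Python sketch of the blue cube example: chat input becomes a program over action primitives, memory is queried by tag, and the result is a short list of tasks. This is not the CraftAssist code. Every name in it (toy_parse, Memory, MemoryObject, Task, plan) is hypothetical, the parser is a hand-written stub standing in for the neural semantic parser, and "next to" is simplified to one block along +x.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Coord = Tuple[int, int, int]


@dataclass
class MemoryObject:
    """An object the bot has perceived, stored with descriptive tags."""
    tags: Set[str]
    position: Coord


@dataclass
class Memory:
    """Hypothetical stand-in for the bot's memory module."""
    objects: List[MemoryObject] = field(default_factory=list)

    def find(self, tags: Set[str]) -> Optional[MemoryObject]:
        """Return the first stored object that carries all requested tags."""
        for obj in self.objects:
            if tags <= obj.tags:
                return obj
        return None


@dataclass
class Task:
    kind: str                        # "move" or "build"
    target: Coord
    schematic: Optional[str] = None  # what to build, if anything


def toy_parse(chat: str) -> List[Dict]:
    """Rule-based stub in place of the neural semantic parser (assumption).

    Handles only 'build a <thing> next to the <tag> <tag>' and returns a
    program over two action primitives: a move followed by a build.
    """
    words = chat.lower().rstrip(".").split()
    if not words or words[0] != "build" or "next" not in words:
        raise ValueError(f"toy parser cannot handle: {chat!r}")
    split = words.index("next")
    # Words after 'next' describe the reference object; drop filler words.
    ref_tags = set(words[split + 1:]) - {"to", "the", "a"}
    if split < 2 or not ref_tags:
        raise ValueError(f"toy parser cannot handle: {chat!r}")
    thing = words[split - 1]
    return [
        {"action": "move", "reference_tags": ref_tags},
        {"action": "build", "schematic": thing, "reference_tags": ref_tags},
    ]


def plan(program: List[Dict], memory: Memory) -> List[Task]:
    """Resolve each program step against memory and emit concrete tasks."""
    tasks = []
    for step in program:
        ref = memory.find(step["reference_tags"])
        if ref is None:
            raise LookupError(f"nothing in memory tagged {step['reference_tags']}")
        x, y, z = ref.position
        adjacent = (x + 1, y, z)  # 'next to' simplified to +1 on x (assumption)
        tasks.append(Task(step["action"], adjacent, step.get("schematic")))
    return tasks


if __name__ == "__main__":
    memory = Memory([
        MemoryObject({"red", "sphere"}, (0, 63, 0)),
        MemoryObject({"blue", "cube"}, (10, 63, 4)),
    ])
    for task in plan(toy_parse("build a house next to the blue cube"), memory):
        print(task)
```

Keeping the parser, memory lookup, and task creation as separate pieces mirrors the modularity described above: in a sketch like this, the stub parser could be swapped for a learned model without touching how tasks are planned.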

Why it matters:

In the long term, this work is intended to help make AI assistants more broadly capable — and, in particular, more flexible. Though machine learning (ML) methods have achieved impressive performance on difficult but narrowly defined tasks, building systems that perform well at a large variety of tasks, especially tasks specified by humans (sometimes ambiguously) using language, remains an important challenge. Our position paper details our motivation for studying these systems through Minecraft (the white paper linked below focuses on technical aspects of the framework). The lessons learned from explorations in Minecraft could help lead to AI assistants that can better interact and collaborate with humans across a wide variety of real-world scenarios, actively learning new concepts and skills through those interactions.

Read the full paper:

CraftAssist: A framework for dialogue-enabled interactive bots

Written By

Kavya Srinet

Research Engineer, Facebook AI

Jonathan Gray

Research Engineer, Facebook AI

Larry Zitnick

Research Scientist, Facebook AI

Arthur Szlam

Research Scientist, Facebook AI