
CompilerGym: Making compiler optimizations accessible to all

September 30, 2021

What it is

We are releasing CompilerGym, a library of high-performance, easy-to-use reinforcement learning environments for compiler optimization tasks. Built by Facebook AI on OpenAI Gym, CompilerGym provides powerful tools that enable the machine learning (ML) research community to address production compiler optimization problems using a familiar language and vocabulary.

CompilerGym packages important compiler optimization problems and makes them look like reinforcement learning problems. The problems we include are large in scale. For one of them, the search space is 10^4461, considerably larger than that of the board game Go. For another, the search space is infinite. Progress on problems of this scale has become feasible only with very recent advances in reinforcement learning. CompilerGym makes it easy for anyone with an ML or compiler background to dive right in and start solving these problems, without the months of tedious setup time that would normally be required. And that’s because we’ve done it for you!
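To give a feel for how quickly such search spaces grow, here is a small back-of-the-envelope sketch. It assumes a hypothetical task in which an agent picks one of a fixed number of actions at every step, so a sequence of n steps has num_actions ** n possibilities. The 123 used below is the LLVM action count quoted later in this post, and 10^170 is a rough, commonly cited order of magnitude for the number of legal Go positions. The function names are made up for illustration, and this is not how the 10^4461 figure above was derived.

```python
def sequence_count(num_actions: int, length: int) -> int:
    """Number of distinct action sequences of exactly `length` steps."""
    return num_actions ** length


def shortest_length_exceeding(num_actions: int, bound: int) -> int:
    """Smallest sequence length whose sequence count exceeds `bound`."""
    if num_actions < 2:
        raise ValueError("need at least two actions for the count to grow")
    length = 0
    while sequence_count(num_actions, length) <= bound:
        length += 1
    return length


GO_POSITIONS_ESTIMATE = 10 ** 170  # rough order of magnitude (assumption)
NUM_ACTIONS = 123                  # LLVM action count quoted later in this post

steps = shortest_length_exceeding(NUM_ACTIONS, GO_POSITIONS_ESTIMATE)
print(f"With {NUM_ACTIONS} actions, sequences of {steps} steps outnumber 10^170")
```

Because nothing in this toy model caps the sequence length, the count keeps growing without bound, which is one way a search space can end up infinite.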

In our first releases, we include reinforcement learning environments for three compiler problems: phase ordering using LLVM, flag tuning using GCC, and loop nest generation using CUDA. We also provide large data sets of programs for training, scripts to verify reproducibility of results, public leaderboards, and a web front end. Over time, we plan to offer support for other well-established compiler problems, including register allocation, peephole optimization, and loop optimizations. We also expect to add further tasks, rewards, observations, and actions that we hope will bring the compiler and ML research communities closer together.

Our goal is to be a catalyst for using ML to make compilers faster. This matters because poorly optimized programs are slow and consume too many computing resources and too much energy, which limits the applications of energy-efficient edge devices and makes data centers less environmentally friendly.


We designed CompilerGym to make building ML models for compiler research problems as easy as for playing video games. Highlights of the library include:

  • API: We use OpenAI’s Gym interface so you can write your agent with Python.

  • Tasks and Actions: We provide environments for phase ordering in LLVM, flag tuning in GCC, and GPU loop nest optimization using loop_tool.

  • Data sets: We provide thousands of real-world programs for use in training and evaluating agents, covering a range of programming languages and domains.

  • Representations: We provide raw representations of programs and multiple precomputed features, allowing you to focus on end-to-end deep learning or features and boosted trees, all the way up to graph models.

  • Rewards: We support optimizing for runtime and code size out of the box.

  • Testing: We provide a validation process to ensure that results are reproducible.

  • Baselines: We offer numerous baseline algorithms and report their performance.

  • Competition: Submit your results and view them on community leaderboards.

  • Accessibility: We provide a suite of command line tools for interacting with the environments without having to write any code, as well as an interactive web front end that enables you to explore the optimization spaces through the browser.

How it works

We expose our compiler optimization problems as Gym environments, with each representing a specific problem. Each environment offers a set of observation spaces, reward signals, and action spaces that are appropriate for the specific compiler optimization problem. Our observations are representations of a program being compiled, and we provide rewards to indicate when the agent improves code quality. These environments can be used in the same way as other Gym environments. For example, they can be seamlessly integrated with libraries such as RLlib, as demonstrated here. This is all the code required to run a random walk of the optimization space for LLVM phase ordering:

import compiler_gym  # imported for its side effect: registers the compiler environments with gym
import gym

env = gym.make("llvm-autophase-ic-v0")  # pre-packaged compiler environment
observation = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # your agent here (this takes random actions)
    observation, reward, done, info = env.step(action)
    if done:
        observation = env.reset()
env.close()  # release the environment's resources when finished
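The loop above works for any agent, not only a random one. As a minimal sketch, the helper below runs a single episode with a caller-supplied policy in place of random sampling and records the actions taken and the total reward. Both run_episode and CountdownEnv are hypothetical names introduced for illustration. CountdownEnv is a tiny stand-in that mimics the reset()/step() interface used above so the example runs without a compiler installed; it is not part of CompilerGym.

```python
from typing import Callable, List, Tuple


class CountdownEnv:
    """Hypothetical stand-in env: reward 1.0 for action 0, else 0.0; ends after 5 steps."""

    def reset(self) -> int:
        self.remaining = 5
        return self.remaining  # observation: number of steps left

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.remaining -= 1
        reward = 1.0 if action == 0 else 0.0
        return self.remaining, reward, self.remaining == 0, {}


def run_episode(
    env, policy: Callable[[int], int], max_steps: int = 1000
) -> Tuple[List[int], float]:
    """Run one episode and return the actions taken plus the summed reward."""
    observation = env.reset()
    actions: List[int] = []
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)  # your agent here
        observation, reward, done, _info = env.step(action)
        actions.append(action)
        total_reward += reward
        if done:
            break
    return actions, total_reward


# A trivial policy: choose action 0 on even observations, 1 on odd ones.
actions, total = run_episode(CountdownEnv(), policy=lambda obs: obs % 2)
print(actions, total)
```

Keeping the policy as a plain function makes it easy to swap a random sampler, a search procedure, or a trained model into the same loop.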

At each step of the LLVM phase ordering environment discussed above, the agent must choose which optimization pass to run next from a set of 123 distinct optimizations. One example action is to ask the compiler to run -dce, LLVM’s dead code elimination pass, which removes instructions whose results are never used. The LLVM environment supports optimizing for runtime, compiled binary size, and instruction count. To get an intuitive sense of the action space, we computed the instruction count rewards attributable to each action through random trials over a set of programs.

The above graph offers insights into the dynamics of the compiler: The -reg2mem pass appears to give the least reward because it increases the instruction count, while the -mem2reg and -sroa passes seem to give the agent the maximum reward. This fits with an understanding of LLVM’s internals, in which -mem2reg and -reg2mem are inverse passes: the former promotes memory accesses to registers, and the latter demotes registers back to memory accesses.

At the same time, -instsimplify seems to reduce program size by removing redundant instructions, while -float2int seems to consistently give no reward. The latter makes sense, since -float2int simply rewrites instructions rather than adding or removing them.
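One way to produce this kind of per-action estimate is sketched below: take random actions across many episodes and average the reward observed immediately after each action. This post does not spell out the exact procedure, so treat the approach as an assumption. ToyPassEnv and its three made-up "passes" are hypothetical stand-ins for a compiler environment, chosen so the code runs on its own.

```python
import random
from collections import defaultdict
from typing import Dict, List


class ToyPassEnv:
    """Hypothetical env: state is an instruction count; reward is the per-step reduction."""

    ACTIONS = ["shrink", "grow", "rewrite"]  # made-up passes, not real LLVM flags

    def reset(self) -> int:
        self.instructions = 100
        return self.instructions

    def step(self, action: int):
        before = self.instructions
        if self.ACTIONS[action] == "shrink":
            self.instructions = max(10, before - 5)
        elif self.ACTIONS[action] == "grow":
            self.instructions = before + 5
        # "rewrite" leaves the instruction count unchanged
        reward = float(before - self.instructions)
        return self.instructions, reward, False, {}


def mean_reward_per_action(env, episodes: int, steps: int, seed: int = 0) -> Dict[str, float]:
    """Average the reward observed after each action over random walks."""
    rng = random.Random(seed)
    rewards: Dict[str, List[float]] = defaultdict(list)
    for _ in range(episodes):
        env.reset()
        for _ in range(steps):
            action = rng.randrange(len(env.ACTIONS))
            _, reward, _, _ = env.step(action)
            rewards[env.ACTIONS[action]].append(reward)
    return {name: sum(vals) / len(vals) for name, vals in rewards.items()}


means = mean_reward_per_action(ToyPassEnv(), episodes=50, steps=20)
for name, value in sorted(means.items(), key=lambda kv: kv[1]):
    print(f"{name:8s} {value:+.2f}")
```

In a real compiler the effect of a pass depends on which passes ran before it, so averages like these give only a first-order view of the action space.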

We want researchers in the community to compare and publicize their results on CompilerGym environments, so we created leaderboards that show the inference time and rewards found by simple search techniques. This is the initial LLVM instruction count leaderboard, and you can find the updated version here.

| Author | Algorithm | Date | Wall time (mean) | Code size reduction (geomean) |
|---|---|---|---|---|
| Facebook | Random agent (t=10800) | 2021-03 | 10,512.356s | 1.062x |
| Facebook | Random agent (t=3600) | 2021-03 | 3,630.821s | 1.061x |
| Facebook | Greedy search | 2021-03 | 169.237s | 1.055x |
| Facebook | Random agent (t=60) | 2021-03 | 91.215s | 1.045x |
| Facebook | Epsilon-greedy search (e=0.1) | 2021-03 | 152.579s | 1.041x |
| Facebook | Random agent (t=10) | 2021-03 | 42.939s | 1.031x |
| Your name here | Your technique here | | | |
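
The last column of the leaderboard aggregates per-benchmark results with a geometric mean. A minimal sketch of that aggregation follows. It assumes the code size reduction for a benchmark is the baseline size divided by the size the agent achieved, which may differ from the leaderboard's exact definition. The function names and sample numbers are made up for illustration and are not leaderboard data.

```python
from statistics import geometric_mean
from typing import Sequence


def size_reduction(baseline_size: int, optimized_size: int) -> float:
    """Reduction factor: values above 1.0 mean the optimized program is smaller."""
    if baseline_size <= 0 or optimized_size <= 0:
        raise ValueError("sizes must be positive")
    return baseline_size / optimized_size


def geomean_reduction(baseline: Sequence[int], optimized: Sequence[int]) -> float:
    """Geometric mean of the per-benchmark reduction factors."""
    if len(baseline) != len(optimized):
        raise ValueError("expected one optimized size per baseline size")
    factors = [size_reduction(b, o) for b, o in zip(baseline, optimized)]
    return geometric_mean(factors)


# Made-up instruction counts for three hypothetical benchmarks.
baseline_sizes = [400, 900, 100]
optimized_sizes = [200, 900, 200]
result = geomean_reduction(baseline_sizes, optimized_sizes)
print(f"{result:.3f}x")
```

A geometric mean suits ratios because they combine multiplicatively: in the sample above, halving one program and doubling another cancel out, whereas an arithmetic mean of the same factors would report a net gain.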

If you’d like your results included, we’d be delighted to showcase them there. Once you train an agent, please use the provided tools to validate its correctness and then send us a pull request adding an entry to this table. You can review our contributing guide for instructions.

Why it matters

Compilers are an essential part of the computing stack, as they translate programs humans write into executable binaries. But all compilers utilize a huge number of human-created heuristics when trying to optimize these programs. As a result, there is a large gap between what people write and the best possible solution. And while ML can help bridge the gap, few ML practitioners want to become compiler experts.

With CompilerGym, we provide ML practitioners with the tools to improve on compilers’ optimizations without knowing anything about their internals or having to fiddle about with low-level C++ code. Our new library takes care of all of that.

Instead of relying solely on human experts, we think machines can learn how to optimize code. Thanks to recent advances in deep learning and reinforcement learning, we foresee a time when machines can be far more efficient at tuning programs for performance and energy. If we’re successful, data centers will use less energy, mobile phones and computers will run faster and cooler, and we will be able to do more with less. This is an outcome that we believe will benefit society, and our goal is to accelerate progress toward this future.

Get it on GitHub

Check out our paper, try the Getting Started notebook, and get the code on GitHub.

Written By

Research Engineer

Hugh Leather

Research Scientist

Soumith Chintala

Software Engineer