RESEARCH

NLP

CWM: An Open-Weights LLM for Research on Code Generation with World Models

September 24, 2025

Abstract

We release Code World Model (CWM), a 32-billion-parameter open-weights LLM, to advance research on code generation with world models. To improve code understanding beyond what can be learned from training on static code alone, we mid-train CWM on a large amount of observation-action trajectories from Python interpreter and agentic Docker environments, and perform extensive multi- task reasoning RL in verifiable coding, math, and multi-turn software engineering environments. With CWM, we provide a strong testbed for researchers to explore the opportunities world modeling affords for improving code generation with reasoning and planning in computational environments. We present first steps of how world models can benefit agentic coding, enable step-by-step simulation of Python code execution, and show early results of how reasoning can benefit from the latter. CWM is a dense, decoder-only LLM trained with a context size of up to 131 k tokens. Independent of its world modeling capabilities, CWM offers strong performance on general coding and math tasks: it reaches pass@1 scores of 65.8 % on SWE-bench Verified (with test-time scaling), 68.6 % on LiveCodeBench, 96.6 % on Math-500, and 76.0 % on AIME 2024. To support further research on code world modeling, we release model checkpoints after mid-training, SFT, and RL.

Download the Paper

AUTHORS

Written by

Chris Cummins

Hugh Leather

Aram Markosyan

Matteo Pagliardini

Tal Remez

Volker Seeker

Marco Selvi

Lingming Zhang

Abhishek Charnalia

Alex Gu

Badr Youbi Idrissi

Christian Keller

Daniel Haziza

David Zhang

Dmitrii Pedchenko

Emily McMilin

Fabian Gloeckle

Felix Kreuk

Francisco Massa

François Fleuret

Gabriel Synnaeve

Gal Cohen

Gallil Maimon

Jacob Kahn

Jade Copet

Jannik Kossen

Jonas Gehring

Jordi Armengol-Estape

Juliette Decugis

Keyur Muzumdar

Kunhao Zheng

Luca Wehrstedt

Maximilian Beck

Michael Hassid

Michel Meyer

Naila Murray

Oren Sultan

Ori Yoran

Pedram Bashiri

Peter O'Hearn

Pierre Chambon

Pierre-Emmanuel Mazaré

Quentin Carbonneaux

Rahul Kindi

Sida Wang

Taco Cohen

Vegard Mella

Yossi Adi

Yuxiang Wei

Zacharias Fisches

Publisher

arXiv

Research Topics

Natural Language Processing (NLP)

Core Machine Learning

Related Publications

May 12, 2026

HUMAN & MACHINE INTELLIGENCE

RESEARCH

NeuralSet: A High-Performing Python Package for Neuro-AI

Corentin Bel, Linnea Evanson, Julien Gadonneix, Andrea Santos Revilla, Mingfang (Lucy) Zhang, Julie Bonnaire, Charlotte Caucheteux, Alexandre Défossez, Théo Desbordes, Pablo Diego-Simón, Shubh Khanna, Juliette Millet, Pierre Orhan, Saarang Panchavati, Antoine Ratouchniak, Alexis Thual, Hubert Jacob Banville, Jarod Levy, Jean Remi King, Josephine Raugel, Jérémy Rapin, Katelyn Begany, Marlene Careil, Simon Dahan, Sophia Houhamdi, Stéphane d'Ascoli, Teon Brooks, Yohann Benchetrit

May 12, 2026

May 06, 2026

HUMAN & MACHINE INTELLIGENCE

RESEARCH

NeuralBench: A Unifying Framework to Benchmark NeuroAI Models

Saarang Panchavati, Antoine Ratouchniak, Mingfang (Lucy) Zhang, Elisa Cascardi, Hubert Banville, Jarod Levy, Jean-Rémi King, Jérémy Rapin, Katelyn Begany, Marlene Careil, Simon Dahan, Stéphane d'Ascoli, Teon Brooks, Yohann Benchetrit

May 06, 2026

May 04, 2026

NLP

Compute Optimal Tokenization

Sachin Mehta, Alisa Liu, Margaret Li, Artidoro Pagnoni, Gargi Ghosh, Luke Zettlemoyer, Mike Lewis, Srini Iyer, Tomasz Limisiewicz

May 04, 2026

April 16, 2026

RESEARCH

AIRA₂: Overcoming Bottlenecks in AI Research Agents

Nicola Cancedda, Pontus Stenetorp, Alexis Audran-Reiss, Alisia Lupidi, Anton Protopopov, Bassel Al Omari, Carole-Jean Wu, Derek Dunfield, Despoina Magka, Edan Toledo, Hela Momand, Ishita Mediratta, Jakob Foerster, Jean-Christophe Gagnon-Audet, Karen Hambardzumyan, Kelvin Niu, Martin Josifoski, Michael Kuchnik, Michael Shvartsman, Nicolas Baldwin, Parth Pathak, Rishi Hazra, Tatiana Shavrina, Thomas Simon Foster, Yoram Bachrach

April 16, 2026

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.