INTEGRITY

THEORY

Logic.py: Bridging the Gap between LLMs and Constraint Solvers

February 27, 2025

Abstract

We present a novel approach to formalising and solving search-based problems using large language models, which significantly improves upon previous state-of-the-art results. We demonstrate the efficacy of this approach on the logic puzzle benchmark ZebraLogicBench. Instead of letting the LLM attempt to solve the puzzles directly, our method prompts the model to formalise the problem in a logic-focused domain-specific language (DSL) called Logic.py. This formalised representation is then solved using a constraint solver, leveraging the strengths of both the language model and the solver. Our approach achieves a 65% absolute improvement over the baseline performance of Llama 3.1 70B on ZebraLogicBench, setting a new state of the art with an accuracy of over 90%. This result demonstrates the potential of combining language models with domain-specific languages and auxiliary tools on tasks that are traditionally challenging for LLMs.
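To make the division of labour concrete, here is a minimal sketch of the "formalise, then solve" idea on a toy zebra-style puzzle. It is not Logic.py syntax and not the authors' solver: the puzzle, the `satisfies_clues` function, and the brute-force `solve` search are all hypothetical stand-ins, written in plain Python. The clues are stated declaratively (the part a language model would be asked to produce), and a separate generic search finds the assignment that satisfies them.

```python
from itertools import permutations

# Hypothetical toy puzzle (not taken from ZebraLogicBench):
# three houses numbered 1..3, each with one owner and one pet.
NAMES = ("Alice", "Bob", "Carol")
PETS = ("cat", "dog", "fish")
HOUSES = (1, 2, 3)


def satisfies_clues(house_of, pet_house):
    """Declarative clues. This is the 'formalisation' step.

    house_of:  owner name -> house number
    pet_house: pet name   -> house number
    """
    return (
        house_of["Bob"] == 1                          # Bob lives in house 1.
        and pet_house["dog"] == house_of["Carol"]     # Carol owns the dog.
        and pet_house["cat"] + 1 == pet_house["dog"]  # Cat is directly left of dog.
        and pet_house["fish"] != house_of["Alice"]    # Alice does not own the fish.
    )


def solve():
    """Exhaustive search standing in for a real constraint solver."""
    solutions = []
    for name_perm in permutations(HOUSES):
        house_of = dict(zip(NAMES, name_perm))
        for pet_perm in permutations(HOUSES):
            pet_house = dict(zip(PETS, pet_perm))
            if satisfies_clues(house_of, pet_house):
                owner_at = {h: n for n, h in house_of.items()}
                pet_at = {h: p for p, h in pet_house.items()}
                solutions.append({h: (owner_at[h], pet_at[h]) for h in HOUSES})
    return solutions


if __name__ == "__main__":
    for solution in solve():
        print(solution)
```

For this toy puzzle the search finds exactly one consistent assignment: Bob with the fish in house 1, Alice with the cat in house 2, and Carol with the dog in house 3. Enumerating every permutation is only viable at this size; the point of handing the formalised problem to a dedicated constraint solver, as the abstract describes, is that the search step scales without the model having to reason through it.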

AUTHORS

Written by

Pascal Kesseli

Peter O'Hearn

Ricardo Silveira Cabral

Publisher

NeurIPS 2025

Research Topics

Theory

Integrity
