February 27, 2025
We present a novel approach to formalise and solve search-based problems using large language models, which significantly improves upon previous state-of-the-art results. We demonstrate the efficacy of this approach on the logic puzzles benchmark ZebraLogicBench. Instead of letting the LLM attempt to directly solve the puzzles, our method prompts the model to formalise the problem in a logic-focused domain-specific language (DSL) called Logic.py. This formalised representation is then solved using a constraint solver, leveraging the strengths of both the language model and the solver. Our approach achieves a remarkable 65% absolute improvement over the baseline performance of Llama 3.1 70B on ZebraLogicBench, setting a new state-of-the-art with an accuracy of over 90%. This significant advancement demonstrates the potential of combining language models with domain-specific languages and auxiliary tools on traditionally challenging tasks for LLMs.
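The formalise-then-solve idea can be illustrated with a minimal Python sketch. It does not use Logic.py syntax or the constraint solver from the paper, neither of which is shown here. The puzzle, the names `OWNERS`, `PETS`, `CONSTRAINTS`, and `solve`, and the brute-force search standing in for a real solver are all assumptions made for illustration. The point is only the division of labour: the constraints are stated declaratively (the part a language model would produce), and a separate search procedure finds the assignment that satisfies them.

```python
from itertools import permutations

# Hypothetical toy puzzle (not taken from ZebraLogicBench):
# three houses in a row, numbered 1..3, each with one owner and one pet.
OWNERS = ("Alice", "Bob", "Carol")
PETS = ("cat", "dog", "fish")

# Declarative constraints. Each receives two dicts that map a value
# (an owner name or a pet name) to the number of the house it lives in.
CONSTRAINTS = [
    lambda owner, pet: owner["Alice"] == pet["cat"],    # Alice owns the cat
    lambda owner, pet: owner["Bob"] == 1,               # Bob lives in house 1
    lambda owner, pet: pet["dog"] == owner["Bob"] + 1,  # the dog lives directly right of Bob
]


def solve():
    """Enumerate every assignment and keep those satisfying all constraints.

    Exhaustive search is a stand-in for a real constraint solver; it is
    only practical for very small puzzles like this one.
    """
    solutions = []
    for owner_positions in permutations((1, 2, 3)):
        owner = dict(zip(OWNERS, owner_positions))
        for pet_positions in permutations((1, 2, 3)):
            pet = dict(zip(PETS, pet_positions))
            if all(check(owner, pet) for check in CONSTRAINTS):
                solutions.append((owner, pet))
    return solutions


if __name__ == "__main__":
    for owner, pet in solve():
        print("owners:", owner)
        print("pets:  ", pet)
```

In this sketch the model never has to search: it only has to translate the clues into constraints correctly, and the solver is responsible for finding the consistent assignment.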
