Purple Llama CyberSecEval: A benchmark for evaluating the cybersecurity risks of large language models

December 07, 2023


This paper presents CYBERSECEVAL, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CYBERSECEVAL provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks. In a case study involving seven models from the Llama 2, Code Llama, and OpenAI GPT large language model families, CYBERSECEVAL effectively pinpointed key cybersecurity risks. More importantly, it offered practical insights for refining these models. A significant observation from the study was the tendency of more advanced models to suggest insecure code, highlighting the critical need to integrate security considerations into the development of sophisticated LLMs. CYBERSECEVAL, with its automated test case generation and evaluation pipeline, covers a broad scope and equips LLM designers and researchers with a tool to broadly measure and enhance the cybersecurity safety properties of LLMs, contributing to the development of more secure AI systems.


Written by

GenAI Cybersec Team

Manish Bhatt

Sahana Chennabasappa

Cyrus Nikolaidis

Shengye Wan

Ivan Evtimov

Dominik Gabi

Daniel Song

Faizan Ahmad

Cornelius Aschermann

Lorenzo Fontana

Sasha Frolov

Ravi Prakash Giri

Dhaval Kapil

Yiannis Kozyrakis

David LeBlanc

James Milazzo

Aleksandar Straumann

Gabriel Synnaeve

Varun Vontimitta

Spencer Whitman

Joshua Saxe


