July 23, 2024
We are releasing a new suite of security benchmarks for LLMs, CYBERSECEVAL 3, to continue the conversation on empirically measuring LLM cybersecurity risks and capabilities. CYBERSECEVAL 3 assesses 8 different risks across two broad categories: risks to third parties, and risks to application developers and end users. Compared to previous work, we add new areas focused on offensive security capabilities: automated social engineering, scaling manual offensive cyber operations, and autonomous offensive cyber operations. In this paper we discuss applying these benchmarks to the Llama 3 models and a suite of contemporaneous state-of-the-art LLMs, enabling us to contextualize risks both with and without mitigations in place.
Written by
Shengye Wan
Cyrus Nikolaidis
Daniel Song
David Molnar
James Crnkovich
Jayson Grace
Manish Bhatt
Sahana Chennabasappa
Spencer Whitman
Stephanie Ding
Vlad Ionescu
Yue Li
Joshua Saxe
Publisher
arXiv