February 13, 2026
We introduce FERRET (Framework for Expansion Reliant Red Teaming), a multi-faceted automated red teaming framework that generates multi-modal adversarial conversations intended to break a target model. FERRET relies on three expansions that make these conversations more effective and efficient: 1. Horizontal expansion, in which the red team model self-improves to generate more effective conversation starters. 2. Vertical expansion, in which the conversation starters discovered during horizontal expansion are expanded into effective multi-modal conversations. 3. Meta expansion, in which the red team model discovers more effective multi-modal attack strategies during the course of a conversation. We compare FERRET with various existing automated red teaming approaches. In our experiments, FERRET generates effective multi-modal adversarial conversations and outperforms existing state-of-the-art approaches.
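The three expansions can be read as a control flow, and the sketch below is one possible reading, not the implementation from the paper. The abstract gives no algorithmic details, so every name here (`Conversation`, `horizontal_expansion`, `vertical_expansion`, `meta_expansion`, and the callables `propose`, `score`, `next_turn`, `mutate`) is hypothetical. Plain strings stand in for multi-modal content, and running meta expansion as a separate step after scoring is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Conversation:
    """Hypothetical container; strings stand in for multi-modal turns."""
    starter: str
    strategy: str
    turns: List[str] = field(default_factory=list)
    score: float = 0.0


def horizontal_expansion(
    seeds: List[str],
    propose: Callable[[str], str],
    score: Callable[[str], float],
    rounds: int = 2,
    keep: int = 2,
) -> List[str]:
    """Assumed reading: propose new starters from the current pool and
    keep the highest-scoring ones, repeating for a fixed number of rounds."""
    pool = list(seeds)
    for _ in range(rounds):
        candidates = pool + [propose(s) for s in pool]
        candidates = list(dict.fromkeys(candidates))  # de-duplicate, keep order
        pool = sorted(candidates, key=score, reverse=True)[:keep]
    return pool


def vertical_expansion(
    starter: str,
    strategy: str,
    next_turn: Callable[[Conversation], str],
    max_turns: int = 3,
) -> Conversation:
    """Assumed reading: grow one starter into a multi-turn conversation."""
    conv = Conversation(starter=starter, strategy=strategy, turns=[starter])
    for _ in range(max_turns - 1):
        conv.turns.append(next_turn(conv))
    return conv


def meta_expansion(
    strategies: List[str],
    conversations: List[Conversation],
    mutate: Callable[[str], str],
) -> List[str]:
    """Assumed reading: derive a new strategy from the strategy used by
    the highest-scoring conversation and add it to the strategy list."""
    best = max(conversations, key=lambda c: c.score)
    new_strategy = mutate(best.strategy)
    if new_strategy in strategies:
        return list(strategies)
    return list(strategies) + [new_strategy]
```

In a real system the callables would wrap a red team model and a judge of the target model's responses; here they are left as parameters so the control flow can be exercised with toy functions.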
Publisher
arXiv
