CONVERSATIONAL AI

NLP

ROBBIE: Robust Bias Evaluation of Large Generative Language Models

November 06, 2023

Abstract

As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensure equal and equitable treatment of marginalized demographic groups. In this work, our focus is two-fold: (1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs. Out of those 6 metrics, AdvPromptSet and HolisticBiasR are novel datasets proposed in the paper. The comparison of those benchmarks gives us insights about the bias and toxicity of the compared models. Therefore, we explore the frequency of demographic terms in common LLM pre-training corpora and how this may relate to model biases. (2) Mitigation: we conduct a comprehensive study of how well 3 bias/toxicity mitigation techniques perform across our suite of measurements. ROBBIE aims to provide insights for practitioners while deploying a model, emphasizing the need to not only measure potential harms, but also understand how they arise by characterizing the data, mitigate harms once found, and balance any trade-offs. We open-source our analysis code in hopes of encouraging broader measurements of bias in future LLMs.

Download the Paper

AUTHORS

Written by

David Esiobu

Ellen Tan

Saghar Hosseini

Megan Ung

Yuchen Zhang

Jude Fernandes

Jane Yu

Alex Presani

Adina Williams

Eric Smith

Publisher

EMNLP

Related Publications

April 17, 2025

HUMAN & MACHINE INTELLIGENCE

CONVERSATIONAL AI

Collaborative Reasoner: Self-improving Social Agents with Synthetic Conversations

Ansong Ni, Ruta Desai, Yang Li, Xinjie Lei, Dong Wang, Ramya Raghavendra, Gargi Ghosh, Daniel Li (FAIR), Asli Celikyilmaz

April 17, 2025

April 04, 2025

NLP

CORE MACHINE LEARNING

Multi-Token Attention

Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar

April 04, 2025

March 13, 2025

NLP

COMPUTER VISION

Subobject-level Image Tokenization

Delong Chen, Samuel Cahyawijaya, Jianfeng Liu, Baoyuan Wang, Pascale Fung

March 13, 2025

February 07, 2025

NLP

BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation

The Omnilingual MT Team, Pierre Andrews, Mikel Artetxe, Mariano Coria Meglioli, Marta R. Costa-jussa, Joe Chuang, David Dale, Cynthia Gao, Jean Maillard, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Eduardo Sánchez, Yiannis Tsiamas, Arina Turkatenko, Albert Ventayol, Shireen Yates

February 07, 2025

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.