Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations

November 20, 2024

Abstract

This paper presents Llama Guard 3-1B-INT4, a compact and efficient Llama Guard model that was open-sourced to the community during Meta Connect 2024. We demonstrate that Llama Guard 3-1B-INT4 can be deployed on resource-constrained devices, achieving a throughput of at least 30 tokens per second and a time-to-first-token of 2.5 seconds or less on a commodity Android mobile CPU. Notably, our experiments show that Llama Guard 3-1B-INT4 attains safety moderation scores comparable or superior to those of its larger counterpart, Llama Guard 3-1B, despite being approximately 7 times smaller (440MB).
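
To make the reported figures concrete, the following is a minimal sketch of the arithmetic they imply, not code from the paper. The function names are hypothetical, and the latency model is an assumption: the first token arrives after the time-to-first-token, and every later token is decoded at the stated minimum throughput. Under those assumptions the abstract's numbers give a rough upper bound on response latency, plus the approximate size of the larger model implied by the "7 times smaller" claim.

```python
def worst_case_latency_s(num_output_tokens: int,
                         ttft_s: float = 2.5,
                         tokens_per_s: float = 30.0) -> float:
    """Rough upper bound on response latency, in seconds.

    Defaults are the abstract's figures (TTFT <= 2.5 s, throughput >= 30 tok/s).
    Assumption: the first token arrives at ttft_s and each remaining token
    is decoded at tokens_per_s. Real devices will vary.
    """
    if num_output_tokens < 1:
        raise ValueError("num_output_tokens must be at least 1")
    return ttft_s + (num_output_tokens - 1) / tokens_per_s


def implied_larger_size_mb(compact_size_mb: float = 440.0,
                           size_ratio: float = 7.0) -> float:
    """Approximate size of the larger model implied by the stated ratio.

    The abstract says "approximately 7 times smaller", so this is an
    estimate, not a measured value.
    """
    return compact_size_mb * size_ratio


if __name__ == "__main__":
    # A short moderation verdict of, say, 31 tokens (hypothetical length).
    print(f"latency bound: {worst_case_latency_s(31):.2f} s")
    print(f"implied larger model size: ~{implied_larger_size_mb():.0f} MB")
```

Because a safety verdict is typically short, the time-to-first-token term dominates this bound; the per-token decoding cost contributes comparatively little.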

AUTHORS

Written by

Igor Fedorov

Kate Plawiak

Lemeng Wu

Tarek Elgamal

Naveen Suda

Eric Smith

Hongyuan Zhan

Jianfeng Chi

Yuriy Hulovatyy

Kimish Patel

Zechun Liu

Yangyang Shi

Tijmen Blankevoort

Mahesh Pasupuleti

Bilge Soran

Zacharie Delpierre Coudert

Rachad Alao

Raghuraman Krishnamoorthi

Vikas Chandra

Publisher

arXiv

Research Topics

Natural Language Processing (NLP)

Core Machine Learning
