SPEECH & AUDIO

NLP

Cocktail HuBERT: Generalized Self-Supervised Pre-Training for Mixture and Single-Source Speech

March 27, 2023

Abstract

Self-supervised learning leverages unlabeled data effectively, improving label efficiency and generalization to domains without labeled data. While recent work has studied generalization to more acoustic/linguistic domains, languages, and modalities, these investigations are limited to single-source speech with one primary speaker in the recording. This paper presents Cocktail HuBERT, a self-supervised learning framework that generalizes to mixture speech using a masked pseudo source separation objective. This objective encourages the model to identify the number of sources, separate and understand the context, and infer the content of masked regions represented as discovered units. Cocktail HuBERT outperforms state-of-the-art results, with 69% lower WER on multi-speaker ASR and 31% lower DER on diarization, and is competitive on single- and multi-speaker tasks from SUPERB.

Related Publications

May 14, 2025

HUMAN & MACHINE INTELLIGENCE

SPEECH & AUDIO

Emergence of Language in the Developing Brain

Linnea Evanson, Christine Bulteau, Mathilde Chipaux, Georg Dorfmüller, Sarah Ferrand-Sorbets, Emmanuel Raffo, Sarah Rosenberg, Pierre Bourdillon, Jean Remi King

April 25, 2025

RESEARCH

NLP

ReasonIR: Training Retrievers for Reasoning Tasks

Rulin Shao, Qiao Rui, Varsha Kishore, Niklas Muennighoff, Victoria Lin, Daniela Rus, Bryan Kian Hsiang Low, Sewon Min, Scott Yih, Pang Wei Koh, Luke Zettlemoyer

April 17, 2025

HUMAN & MACHINE INTELLIGENCE

CONVERSATIONAL AI

Collaborative Reasoner: Self-improving Social Agents with Synthetic Conversations

Ansong Ni, Ruta Desai, Yang Li, Xinjie Lei, Dong Wang, Ramya Raghavendra, Gargi Ghosh, Daniel Li (FAIR), Asli Celikyilmaz

April 04, 2025

NLP

CORE MACHINE LEARNING

Multi-Token Attention

Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar
