fairseq: A Fast, Extensible Toolkit for Sequence Modeling

May 27, 2019

Abstract

fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found here: https://www.youtube.com/watch?v=OtgDdWtHvto.

AUTHORS

Michael Auli

Alexei Baevski

Angela Fan

Myle Ott

Nathan Ng

Sam Gross

Sergey Edunov

David Grangier

Publisher

NAACL system track
