Speech & Audio

Convolutional Sequence to Sequence Learning

August 6, 2017

Abstract

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to recurrent models, computations over all elements can be fully parallelized during training to better exploit the GPU hardware and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT’14 English-German and WMT’14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.
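
As a rough illustration of the gated linear units mentioned in the abstract, the sketch below applies two convolution filters at every position of a sequence and uses one output to gate the other, a * sigmoid(b). It is a toy under stated assumptions, not the paper's implementation: the function names (`sigmoid`, `glu`, `conv_glu_layer`), the scalar-valued sequence, the odd kernel width, and the zero padding are all hypothetical choices made for readability.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def glu(values: List[float]) -> List[float]:
    """Gated linear unit: split the input as [a, b] and return a * sigmoid(b)."""
    if len(values) % 2 != 0:
        raise ValueError("GLU input must have an even number of elements")
    half = len(values) // 2
    a, b = values[:half], values[half:]
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]


def conv_glu_layer(seq: List[float], w_a: List[float], w_b: List[float]) -> List[float]:
    """Toy 1-D convolution followed by a gated linear unit.

    Assumptions (illustrative only): scalar inputs, two filters of the same
    odd width, and zero padding so the output length equals the input length.
    """
    if len(w_a) != len(w_b) or len(w_a) % 2 == 0:
        raise ValueError("filters must share the same odd width")
    k = len(w_a)
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    out = []
    # Each position depends only on a fixed window of the input, never on
    # the output at a previous position, so the iterations are independent
    # and could run in parallel.
    for i in range(len(seq)):
        window = padded[i:i + k]
        a = sum(w * x for w, x in zip(w_a, window))  # candidate value
        b = sum(w * x for w, x in zip(w_b, window))  # gate pre-activation
        out.append(glu([a, b])[0])
    return out


if __name__ == "__main__":
    print(conv_glu_layer([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]))
```

With an identity filter for the value path and an all-zero gate filter, every gate equals sigmoid(0) = 0.5, so the layer simply halves its input. The number of non-linearities applied to any element depends on how many such layers are stacked, not on the sequence length, which is the property the abstract contrasts with recurrent models.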

Related Publications

March 17, 2026

Speech & Audio

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Omnilingual SONAR Team, Ioannis Tsiamas, Yen Meng, Vivek Iyer, Guillem Ramirez, Jaehyeong Jo, Alexandre Mourachko, Yu-An Chung, Artyom Kozhevnikov, Belen Alastruey, Christophe Ropers, David Dale, Holger Schwenk, João Maria Janeiro, Kevin Heffernan, Loic Barrault, Marta R. Costa-jussa, Paul-Ambroise Duquenne, Pere Lluís Huguet Cabot

November 10, 2025

Speech & Audio

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Omnilingual ASR team, Skyler Wang, Ife Adebara, Michael Auli, Kaushik Ram Sadagopan, Zheng-Xin Yong, Albert Ventayol-Boada, Alexandre Mourachko, Alexander Erben, Yu-An Chung, Arina Turkatenko, Artyom Kozhevnikov, Caley Drooff, Can Balioglu, Chierh Cheng, Christophe Ropers, Cynthia Gao, Gabriel Mejia Gonzalez, Gil Keren, Jean Maillard, Joe Chuang, Kehan Lyu, Kevin Chan, Mark Duppenthaler, Mary Williamson, Matthew Setzler, Paul-Ambroise Duquenne, Rashel Moritz, Safiyyah Saleem, Sagar Miglani, Shireen Yates, Vineel Pratap, Yen Meng

June 27, 2025

Human & Machine Intelligence

Conversational AI

Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset

Morteza Behrooz, Ning Dong, Jeff Girard, Vasu Sharma, Jan Zikes, Akinniyi Akinyemi, Alex Shcherbyna, Alexander Richard, Alice Rakotoarison, Amia Oberai, Anastasis Stathopoulos, Anna Sun, Antony D'Avirro, Arina Turkatenko, Benjamin Peloquin, Bo Wan, Brandon Han, Carleigh Wood, Chao Wang, Chen Zhang, Christophe Ropers, Christopher Klaiber, Cynthia Gao, Dejan Kovachev, Denise Hernandez, Evonne Ng, Fabian Prada, Fabio Maria Carlucci, Guangyao Ma, Hang Li, Hirofumi Inaguma, Hongyu Gong, Jason Zheng, Jeff Wang, Jie Shen, Jiemin Zhang, Jing Ma, Joe Chuang, Jon Daly, Jovan Popovic, Joy Chen, Juan Pino, Julia Buffalini, Zhiyuan Yao, Junming Chen, Kam-Woh Ng, Kathryn Alvero, Louis-Philippe Morency, Lucas Mantovani, Mark Duppenthaler, Martin Gleize, Martin Ma, Mary Williamson, Michael Zollhoefer, Moneish Kumar, Omid Poursaeed, Paden Tomasello, Pavel Litvin, Pavlo Zhyzheria, Praveen Chowdary, Qingyao Jia, Raj Janardhan, Rongjie Huang, Safiyyah Saleem, Sagar Miglani, Sahir Gomez, Sen He, Shiyang Cheng, Somya Jain, Sreyas Mohan, Srivathsan Govindarajan, Tao Xiang, Tu Anh Nguyen, Tuan Tran, Vasu Agrawal, Wei Liu, Xinyue Zhang, Xutai Ma, Yilei Li, Yilin Yang, Yordan Hristov, Zhang Chen

February 06, 2025

Speech & Audio

Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound

Andros Tjandra, Ann Lee, Apoorv Vyas, Baishan Guo, Bowen Shi, Brian Ellis, Carleigh Wood, John Hoffman, Matt Le, Nick Zacharov, Sanyuan Chen, Wei-Ning Hsu, Yi-Chiao Wu

October 10, 2016

Speech & Audio

Computer Vision

Polysemous Codes

Matthijs Douze, Hervé Jégou, Florent Perronnin

June 18, 2018

Speech & Audio

Computer Vision

Low-shot learning with large-scale diffusion

Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou

July 10, 2018

NLP

Speech & Audio

Hierarchical Text Generation and Planning for Strategic Dialogue

Denis Yarats, Mike Lewis

September 08, 2017

NLP

Speech & Audio

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra
