August 6, 2017
The prevalent approach to sequence-to-sequence learning maps an input sequence to a variable-length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to recurrent models, computations over all elements can be fully parallelized during training to better exploit GPU hardware, and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation, and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT’14 English-German and WMT’14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.
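To make the gated linear unit concrete, here is a minimal sketch of one convolutional block in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`conv1d`, `glu_block`), the symmetric zero padding, and the residual connection are choices made for this example. The convolution produces twice as many channels as the input, and the second half gates the first half through a sigmoid.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(seq, kernels, bias):
    """1-D convolution over a sequence of vectors.

    seq:     list of T vectors, each of length d_in
    kernels: list of d_out filters, each a list of k vectors of length d_in
    bias:    list of d_out scalars
    Assumes an odd kernel width k and zero-pads both ends, so the
    output has the same length T as the input.
    """
    k = len(kernels[0])
    d_in = len(seq[0])
    pad = k // 2
    zero = [0.0] * d_in
    padded = [zero] * pad + list(seq) + [zero] * pad
    out = []
    for t in range(len(seq)):
        window = padded[t:t + k]
        row = []
        for filt, b in zip(kernels, bias):
            s = b
            for j in range(k):
                for c in range(d_in):
                    s += filt[j][c] * window[j][c]
            row.append(s)
        out.append(row)
    return out


def glu_block(seq, kernels, bias):
    """One convolution followed by a gated linear unit.

    The convolution must output 2*d channels for d input channels.
    The first d channels (a) carry the signal, the last d (b) form the gate:
        out = a * sigmoid(b) + x
    The "+ x" residual term is an assumption of this sketch.
    """
    d = len(seq[0])
    conv_out = conv1d(seq, kernels, bias)
    result = []
    for x_t, y_t in zip(seq, conv_out):
        a, b = y_t[:d], y_t[d:]
        result.append([a_i * sigmoid(b_i) + x_i
                       for a_i, b_i, x_i in zip(a, b, x_t)])
    return result
```

Every output position depends only on a fixed-width window of the input, so all positions can be computed independently, and stacking a fixed number of such blocks keeps the number of non-linearities independent of sequence length. A decoder would need padding on the left only, so that no position sees future tokens; that variant is not shown here.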
