Holger Schwenk

RESEARCH ENGINEER | PARIS, FRANCE

Holger Schwenk is a research scientist at Facebook Artificial Intelligence Research, Paris. He received his PhD in computer science from the University of Paris 6 in 1996. He then spent one year at the University of Montreal working with Y. Bengio and one year at the International Computer Science Institute in Berkeley. From 1998 to 2007, Holger held an assistant professor position at the University of Paris 11/LIMSI. Before joining Facebook in 2015, he was a professor of computer science at the University of Le Mans, where he led a large group working on statistical machine translation. In 2013, Holger was named a senior member of the Institut Universitaire de France.

Holger's Publications

March 17, 2026

RESEARCH

NLP

Omnilingual MT: Machine Translation for 1,600 Languages

Omnilingual MT Team, Niyati Bafna, Ioannis Tsiamas, Mark Duppenthaler, Albert Ventayol-Boada, Alexandre Mourachko, Andrea Caciolai, Arina Turkatenko, Artyom Kozhevnikov, Belen Alastruey, Charles-Eric Saint-James, Chierh CHENG, Christophe Ropers, Cynthia Gao, David Dale, Edan Toledo, Eduardo Sánchez, Gabriel Mejia Gonzalez, Holger Schwenk, Jean Maillard, Joe Chuang, João Maria Janeiro, Kevin Heffernan, Marta R. Costa-jussa, Mary Williamson, Nate Ekberg, Paul-Ambroise Duquenne, Pere Lluís Huguet Cabot, Rashel Moritz, Shireen Yates, Surya Parimi

March 17, 2026

RESEARCH

SPEECH & AUDIO

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Omnilingual SONAR Team, Ioannis Tsiamas, Yen Meng, Vivek Iyer, Guillem Ramirez, Jaehyeong Jo, Alexandre Mourachko, Yu-An Chung, Artyom Kozhevnikov, Belen Alastruey, Christophe Ropers, David Dale, Holger Schwenk, João Maria Janeiro, Kevin Heffernan, Loic Barrault, Marta R. Costa-jussa, Paul-Ambroise Duquenne, Pere Lluís Huguet Cabot

February 27, 2026

HUMAN & MACHINE INTELLIGENCE

RESEARCH

Unified Vision–Language Modeling via Concept Space Alignment

Yifu Qiu, Holger Schwenk, Paul-Ambroise Duquenne

December 11, 2024

NLP

Large Concept Models: Language Modeling in a Sentence Representation Space

The LCM team, Guillaume Couairon, Alexandre Mourachko, Artyom Kozhevnikov, Belen Alastruey, Christophe Ropers, David Dale, Eduardo Sánchez, Hady Elsahar, Holger Schwenk, João Maria Janeiro, Kevin Heffernan, Loic Barrault, Maha Elbayad, Mariano Coria, Marta R. Costa-jussa, Paul-Ambroise Duquenne, Pierre Andrews, Robin San Roman, Safiyyah Saleem, Tuan Tran

November 30, 2023

SPEECH & AUDIO

NLP

Seamless: Multilingual Expressive and Streaming Speech Translation

Seamless Communication, Elahe Kalbassi, Xutai Ma, Abinesh Ramakrishnan, Alexandre Mourachko, Alice Rakotoarison, Amanda Kallet, Yu-An Chung, Ann Lee, Anna Sun, Artyom Kozhevnikov, Benjamin Peloquin, Bokai Yu, Brian Ellis, Can Balioglu, Carleigh Wood, Changhan Wang, Christophe Ropers, Christophe Touret, Christopher Klaiber, Corinne Wong, Cynthia Gao, Daniel Licht, David Dale, Ethan Ye, Gabriel Mejia Gonzalez, Guillaume Wenzek, Hady Elsahar, Hirofumi Inaguma, Holger Schwenk, Hongyu Gong, Ilia Kulikov, Ivan Evtimov, Jean Maillard, Jeff Wang, John Hoffman, Juan Pino, Justin Haaheim, Justine Kao, Prangthip Hansanti, Kaushik Ram Sadagopan, Kevin Heffernan, Loïc Barrault, Maha Elbayad, Mariano Coria Meglioli, Mark Duppenthaler, Marta R. Costa-jussà, Mary Williamson, Min-Jae Hwang, Ning Dong, Francisco Guzmán, Paden Tomasello, Paul-Ambroise Duquenne, Peng-Jen Chen, Pengwei Li, Pierre Andrews, Pierre Fernandez, Robin San Roman, Ruslan Mavlyutov, Safiyyah Saleem, Skyler Wang, Somya Jain, Sravya Popuri, Tuan Tran, Yilin Yang

November 29, 2023

NLP

SONAR EXPRESSIVE: Zero-shot Expressive Speech-to-Speech Translation

Holger Schwenk, Alexandre Mourachko, Kevin Heffernan, Paul-Ambroise Duquenne, Benoit Sagot (INRIA)

August 22, 2023

SPEECH & AUDIO

NLP

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Seamless Communication, Safiyyah Saleem, Abinesh Ramakrishnan, Alexandre Mourachko, Alice Rakotoarison, Amanda Kallet, Andy Chung, Ann Lee, Anna Sun, Bapi Akula, Benjamin Peloquin, Bernie Huang, Bokai Yu, Brian Ellis, Can Balioglu, Carleigh Wood, Changhan Wang, Christophe Ropers, Christopher Klaiber, Cynthia Gao, Daniel Li (FAIR), Daniel Licht, David Dale, Elahe Kalbassi, Ethan Ye, Gabriel Mejia Gonzalez, Guillaume Wenzek, Hady Elsahar, Hirofumi Inaguma, Holger Schwenk, Hongyu Gong, Igor Tufanov, Ilia Kulikov, Janice Lam, Jean Maillard, Jeff Wang (PM - AI), John Hoffman, Juan Pino, Justin Haaheim, Justine Kao, Prangthip Hansanti, Kaushik Ram Sadagopan, Kevin Heffernan, Kevin Tran, Loic Barrault, Maha Elbayad, Marta R. Costa-jussa, Mohamed Ramadan, Naji El Hachem, Ning Dong (AI), Onur Çelebi, Paco Guzmán, Paden Tomasello, Paul-Ambroise Duquenne, Peng-Jen Chen, Pengwei Li, Pierre Andrews, Ruslan Mavlyutov, Russ Howes, Skyler Wang, Somya Jain, Sravya Popuri, Tuan Tran, Vish Vogeti, Xutai Ma, Yilin Yang

August 21, 2023

NLP

SONAR: Sentence-Level Multimodal and Language-Agnostic Representations

Benoit Sagot, Holger Schwenk, Paul-Ambroise Duquenne

July 06, 2022

No Language Left Behind: Scaling Human-Centered Machine Translation

Shannon Spruit, Chau Tran, Marta Costa-jussa, Al Youngblood, Alex Mourachko, Angela Fan, Anna Sun, Bapi Akula, Christophe Ropers, Cynthia Gao, Daniel Licht, Dirk Rowe, Elahe Kalbassi, Francisco Guzmán, Gabriel Mejia Gonzalez, Guillaume Wenzek, Holger Schwenk, James Cross, Janice Lam, Jean Maillard, Jeff Wang (PM - AI), John Hoffman, Kae Hansanti, Kaushik Ram Sadagopan, Kenneth Heafield, Kevin Heffernan, Loic Barrault, Maha Elbayad, Necip Fazil Ayan, Onur Çelebi, Philipp Koehn, Pierre Andrews, Safiyyah Saleem, Semarley Jarrett, Sergey Edunov, Shruti Bhosale, Skyler Wang, Vedanuj Goswami

February 06, 2022

RESEARCH

Multimodal and Multilingual Embeddings for Large-Scale Speech Mining

Holger Schwenk, Hongyu Gong, Paul-Ambroise Duquenne

July 26, 2020

Searching the Web for Cross-lingual Parallel Data

Ahmed Hassan El-Kishky, Holger Schwenk, Philipp Koehn

May 02, 2020

MLQA: Evaluating Cross-lingual Extractive Question Answering

Patrick Lewis, Barlas Oguz, Holger Schwenk, Ruty Rinott, Sebastian Riedel

August 02, 2019

RESEARCH

SPEECH & AUDIO

Low-Resource Corpus Filtering using Multilingual Sentence Embeddings

Vishrav Chaudhary, Holger Schwenk, Paco Guzmán, Philipp Koehn, Yuqing Tang

October 29, 2018

RESEARCH

SPEECH & AUDIO

XNLI: Evaluating Cross-lingual Sentence Representations

Alexis Conneau, Guillaume Lample, Holger Schwenk, Ruty Rinott, Ves Stoyanov, Adina Williams, Sam Bowman
