Holger Schwenk

RESEARCH ENGINEER | PARIS, FRANCE

Holger Schwenk is a research scientist at Facebook Artificial Intelligence Research, Paris. He received his PhD in computer science from the University of Paris 6 in 1996. He then spent one year at the University of Montreal working with Y. Bengio and one year at the International Computer Science Institute in Berkeley. From 1998 to 2007, Holger held an assistant professor position at the University of Paris 11/LIMSI. Prior to joining Facebook in 2015, he was a professor of computer science at the University of Le Mans, where he led a large group on statistical machine translation. In 2013, Holger was named a senior member of the Institut Universitaire de France.

Holger's Publications

December 11, 2024

NLP

Large Concept Models: Language Modeling in a Sentence Representation Space

The LCM team, Loic Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussa, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk

November 30, 2023

SPEECH & AUDIO

NLP

Seamless: Multilingual Expressive and Streaming Speech Translation

Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, Yilin Yang, Ethan Ye, Ivan Evtimov, Pierre Fernandez, Cynthia Gao, Prangthip Hansanti, Elahe Kalbassi, Amanda Kallet, Artyom Kozhevnikov, Gabriel Mejia Gonzalez, Robin San Roman, Christophe Touret, Corinne Wong, Carleigh Wood, Bokai Yu, Pierre Andrews, Can Balioglu, Peng-Jen Chen, Marta R. Costa-jussà, Maha Elbayad, Hongyu Gong, Francisco Guzmán, Kevin Heffernan, Somya Jain, Justine Kao, Ann Lee, Xutai Ma, Alexandre Mourachko, Benjamin Peloquin, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Anna Sun, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang, Mary Williamson

November 29, 2023

NLP

SONAR EXPRESSIVE: Zero-shot Expressive Speech-to-Speech Translation

Paul-Ambroise Duquenne, Kevin Heffernan, Alexandre Mourachko, Holger Schwenk, Benoit Sagot (INRIA)

August 22, 2023

SPEECH & AUDIO

NLP

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Seamless Communication, Loic Barrault, Andy Chung, David Dale, Ning Dong (AI), Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Peng-Jen Chen, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Abinesh Ramakrishnan, Alexandre Mourachko, Amanda Kallet, Ann Lee, Anna Sun, Bapi Akula, Benjamin Peloquin, Bernie Huang, Bokai Yu, Brian Ellis, Can Balioglu, Carleigh Wood, Changhan Wang, Christophe Ropers, Cynthia Gao, Daniel Li (FAIR), Elahe Kalbassi, Ethan Ye, Gabriel Mejia Gonzalez, Hirofumi Inaguma, Holger Schwenk, Igor Tufanov, Ilia Kulikov, Janice Lam, Jeff Wang (PM - AI), Juan Pino, Justin Haaheim, Justine Kao, Prangthip Hansanti, Kevin Tran, Maha Elbayad, Marta R. Costa-jussa, Mohamed Ramadan, Naji El Hachem, Onur Çelebi, Paco Guzmán, Paden Tomasello, Pengwei Li, Pierre Andrews, Ruslan Mavlyutov, Russ Howes, Safiyyah Saleem, Skyler Wang, Somya Jain, Sravya Popuri, Tuan Tran, Vish Vogeti, Xutai Ma, Yilin Yang

August 21, 2023

NLP

SONAR: Sentence-Level Multimodal and Language-Agnostic Representations

Paul-Ambroise Duquenne, Holger Schwenk, Benoit Sagot

July 06, 2022

No Language Left Behind: Scaling Human-Centered Machine Translation

Marta Costa-jussa, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Kae Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alex Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Jeff Wang (PM - AI)

February 06, 2022

RESEARCH

Multimodal and Multilingual Embeddings for Large-Scale Speech Mining

Holger Schwenk, Hongyu Gong, Paul-Ambroise Duquenne

July 26, 2020

Searching the Web for Cross-lingual Parallel Data

Ahmed Hassan El-Kishky, Holger Schwenk, Philipp Koehn

May 02, 2020

MLQA: Evaluating Cross-lingual Extractive Question Answering

Patrick Lewis, Barlas Oguz, Holger Schwenk, Ruty Rinott, Sebastian Riedel

August 02, 2019

RESEARCH

SPEECH & AUDIO

Low-Resource Corpus Filtering using Multilingual Sentence Embeddings

Vishrav Chaudhary, Holger Schwenk, Paco Guzmán, Philipp Koehn, Yuqing Tang

October 29, 2018

RESEARCH

SPEECH & AUDIO

XNLI: Evaluating Cross-lingual Sentence Representations

Alexis Conneau, Guillaume Lample, Holger Schwenk, Ruty Rinott, Ves Stoyanov, Adina Williams, Sam Bowman
