March 01, 2024
Language models (LMs) have been commonly adopted to boost the performance of automatic speech recognition (ASR), particularly in domain adaptation tasks. The conventional way of training LMs treats all words in a corpus equally, resulting in suboptimal improvements in ASR performance. In this work, we introduce a novel correction-focused LM training approach that prioritizes ASR-fallible words. The word-level ASR fallibility score, representing the likelihood of ASR misrecognition, is defined and shaped as a prior word distribution to guide LM training. To enable correction-focused training with text-only corpora, large language models (LLMs) are employed as fallibility score predictors and text generators through multi-task fine-tuning. Experimental results on domain adaptation tasks demonstrate the effectiveness of the proposed method. Compared with conventional LMs, correction-focused training achieves up to 5.5% relative word error rate (WER) reduction in sufficient-text scenarios. In insufficient-text scenarios, LM training with LLM-generated text achieves up to 13% relative WER reduction, and correction-focused training obtains up to a further 6% relative WER reduction.
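The WER figures above are relative reductions, that is, the drop in WER expressed as a fraction of the baseline WER rather than a difference in absolute percentage points. A minimal sketch of that arithmetic follows; the function name and the example WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical WERs (in percent), chosen only to illustrate the formula.
    baseline = 10.0   # assumed WER with a conventional LM
    improved = 9.45   # assumed WER with a correction-focused LM
    reduction = relative_wer_reduction(baseline, improved)
    print(f"relative WER reduction: {reduction:.1f}%")
```

Under these assumed numbers, an absolute drop of 0.55 points on a 10% baseline corresponds to a relative reduction of about 5.5%.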
