August 23, 2019
Following the WMT 2018 Shared Task on Parallel Corpus Filtering (Koehn et al., 2018), we posed the challenge of assigning sentencelevel quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low resource condition of Nepali– English and Sinhala–English. Eleven participants from companies, national research labs, and universities participated in this task.
Publisher
WMT ACL
July 23, 2024
Llama team
July 23, 2024
June 25, 2024
Min-Jae Hwang, Ilia Kulikov, Benjamin Peloquin, Hongyu Gong, Peng-Jen Chen, Ann Lee
June 25, 2024
June 05, 2024
Robin San Romin, Pierre Fernandez, Hady Elsahar, Alexandre Deffosez, Teddy Furon, Tuan Tran
June 05, 2024
May 24, 2024
May 24, 2024
Product experiences
Foundational models
Product experiences
Latest news
Foundational models