August 23, 2019
Following the WMT 2018 Shared Task on Parallel Corpus Filtering (Koehn et al., 2018), we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low-resource condition of Nepali–English and Sinhala–English. Eleven participants from companies, national research labs, and universities participated in this task.
Publisher
WMT ACL