Ann Lee

RESEARCH SCIENTIST | NEW YORK CITY, UNITED STATES

Ann is a Research Engineer at Facebook AI Research (FAIR), focusing on speech recognition. She earned a Ph.D. at MIT's Spoken Language Systems group, where she worked on computer-assisted pronunciation training and mispronunciation detection in non-native speech.

Ann's Publications

December 16, 2025

SPEECH & AUDIO

COMPUTER VISION

SAM Audio: Segment Anything in Audio

Yi-Chiao Wu, Julius Richter, Andros Tjandra, Ann Lee, Apoorv Vyas, Bowen Shi, Christoph Feichtenhofer, Helin Wang, John Hoffman, Luya Gao, Matt Le, Piotr Dollar, Sanyuan Chen, Wei-Ning Hsu

December 16, 2025

SPEECH & AUDIO

COMPUTER VISION

Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning

Heng-Jui Chang, Cheng-Fu Yang, Julius Richter, Ann Lee, Apoorv Vyas, Bernie Huang, Christoph Feichtenhofer, Luya Gao, Matt Le, Piotr Dollar, Sanyuan Chen, Wei-Ning Hsu

February 07, 2025

RESEARCH

SPEECH & AUDIO

Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound

Andros Tjandra, Ann Lee, Apoorv Vyas, Baishan Guo, Bowen Shi, Brian Ellis, Carleigh Wood, John Hoffman, Matt Le, Nick Zacharov, Sanyuan Chen, Wei-Ning Hsu, Yi-Chiao Wu

June 25, 2024

SPEECH & AUDIO

NLP

Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation

Min-Jae Hwang, Ann Lee, Benjamin Peloquin, Hongyu Gong, Ilia Kulikov, Peng-Jen Chen

November 30, 2023

SPEECH & AUDIO

NLP

Seamless: Multilingual Expressive and Streaming Speech Translation

Seamless Communication, Elahe Kalbassi, Xutai Ma, Abinesh Ramakrishnan, Alexandre Mourachko, Alice Rakotoarison, Amanda Kallet, Yu-An Chung, Ann Lee, Anna Sun, Artyom Kozhevnikov, Benjamin Peloquin, Bokai Yu, Brian Ellis, Can Balioglu, Carleigh Wood, Changhan Wang, Christophe Ropers, Christophe Touret, Christopher Klaiber, Corinne Wong, Cynthia Gao, Daniel Licht, David Dale, Ethan Ye, Gabriel Mejia Gonzalez, Guillaume Wenzek, Hady Elsahar, Hirofumi Inaguma, Holger Schwenk, Hongyu Gong, Ilia Kulikov, Ivan Evtimov, Jean Maillard, Jeff Wang, John Hoffman, Juan Pino, Justin Haaheim, Justine Kao, Prangthip Hansanti, Kaushik Ram Sadagopan, Kevin Heffernan, Loïc Barrault, Maha Elbayad, Mariano Coria Meglioli, Mark Duppenthaler, Marta R. Costa-jussà, Mary Williamson, Min-Jae Hwang, Ning Dong, Francisco Guzmán, Paden Tomasello, Paul-Ambroise Duquenne, Peng-Jen Chen, Pengwei Li, Pierre Andrews, Pierre Fernandez, Robin San Roman, Ruslan Mavlyutov, Safiyyah Saleem, Skyler Wang, Somya Jain, Sravya Popuri, Tuan Tran, Yilin Yang

August 22, 2023

SPEECH & AUDIO

NLP

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Seamless Communication, Safiyyah Saleem, Abinesh Ramakrishnan, Alexandre Mourachko, Alice Rakotoarison, Amanda Kallet, Andy Chung, Ann Lee, Anna Sun, Bapi Akula, Benjamin Peloquin, Bernie Huang, Bokai Yu, Brian Ellis, Can Balioglu, Carleigh Wood, Changhan Wang, Christophe Ropers, Christopher Klaiber, Cynthia Gao, Daniel Li, Daniel Licht, David Dale, Elahe Kalbassi, Ethan Ye, Gabriel Mejia Gonzalez, Guillaume Wenzek, Hady Elsahar, Hirofumi Inaguma, Holger Schwenk, Hongyu Gong, Igor Tufanov, Ilia Kulikov, Janice Lam, Jean Maillard, Jeff Wang, John Hoffman, Juan Pino, Justin Haaheim, Justine Kao, Prangthip Hansanti, Kaushik Ram Sadagopan, Kevin Heffernan, Kevin Tran, Loïc Barrault, Maha Elbayad, Marta R. Costa-jussà, Mohamed Ramadan, Naji El Hachem, Ning Dong, Onur Çelebi, Paco Guzmán, Paden Tomasello, Paul-Ambroise Duquenne, Peng-Jen Chen, Pengwei Li, Pierre Andrews, Ruslan Mavlyutov, Russ Howes, Skyler Wang, Somya Jain, Sravya Popuri, Tuan Tran, Vish Vogeti, Xutai Ma, Yilin Yang

June 20, 2022

Flashlight: Enabling Innovation in Tools for Machine Learning

Jacob Kahn, Ann Lee, Benoit Steiner, Edouard Grave, Gabriel Synnaeve, Gilad Avidov, Paden Tomasello, Qiantong Xu, Ronan Collobert, Vineel Pratap, Vitaliy Liptchinsky, Awni Hannun, Jeff Cai, Tatiana Likhomanenko

April 30, 2020

RESEARCH

NLP

Self-Training for End-to-End Speech Recognition

Jacob Kahn, Ann Lee, Awni Hannun

September 13, 2019

RESEARCH

SPEECH & AUDIO

Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions

Awni Hannun, Ann Lee, Qiantong Xu, Ronan Collobert
