May 1, 2019
Current fully-supervised video datasets consist of only a few hundred thousand videos and fewer than a thousand domain-specific labels. This hinders progress towards advanced video architectures. This paper presents an in-depth study of using large volumes of web videos to pre-train video models for the task of action recognition. Our primary empirical finding is that pre-training at a very large scale (over 65 million videos), despite relying on noisy social-media videos and hashtags, substantially improves the state of the art on three challenging public action recognition datasets. Further, we examine three questions in the construction of weakly-supervised video action datasets. First, given that actions involve interactions with objects, how should one construct a verb-object pre-training label space to benefit transfer learning the most? Second, frame-based models perform quite well on action recognition; is pre-training for good image features sufficient, or is pre-training for spatio-temporal features valuable for optimal transfer learning? Finally, actions are generally less well-localized in long videos than in short videos; since action labels are provided at the video level, how should one choose video clips for best performance, given a fixed budget of the number or minutes of videos?
February 06, 2025
Andros Tjandra, Yi-Chiao Wu, Baishan Guo, John Hoffman, Brian Ellis, Apoorv Vyas, Bowen Shi, Sanyuan Chen, Matt Le, Nick Zacharov, Carleigh Wood, Ann Lee, Wei-Ning Hsu
November 19, 2020
Angela Fan, Aleksandra Piktus, Antoine Bordes, Fabio Petroni, Guillaume Wenzek, Marzieh Saeidi, Sebastian Riedel, Andreas Vlachos
November 09, 2020
Angela Fan
October 26, 2020
Xian Li, Asa Cooper Stickland, Xiang Kong, Yuqing Tang
December 11, 2019
Eliya Nachmani, Lior Wolf
April 30, 2018
Yaniv Taigman, Lior Wolf, Adam Polyak, Eliya Nachmani
July 11, 2018
Eliya Nachmani, Adam Polyak, Yaniv Taigman, Lior Wolf
May 05, 2019
Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman