CORE MACHINE LEARNING

Revisiting Feature Prediction for Learning Visual Representations from Video

February 15, 2024

Abstract

This paper explores feature prediction as a stand-alone objective for unsupervised learning from video and introduces V-JEPA, a collection of vision models trained solely using a feature prediction objective, without the use of pretrained image encoders, text, negative examples, reconstruction, or other sources of supervision. The models are trained on 2 million videos collected from public datasets and are evaluated on downstream image and video tasks. Our results show that learning by predicting video features leads to versatile visual representations that perform well on both motion and appearance-based tasks, without adaption of the model’s parameters; e.g., using a frozen backbone, our largest model, a ViT-H/16 trained only on videos, obtains 81.9% on Kinetics-400, 72.2% on Something-Something-v2, and 77.9% on ImageNet1K.

Download the Paper

AUTHORS

Written by

Adrien Bardes

Quentin Garrido

Xinlei Chen

Michael Rabbat

Yann LeCun

Mido Assran

Nicolas Ballas

Jean Ponce

Publisher

arxiv

Research Topics

Core Machine Learning

Related Publications

July 08, 2024

THEORY

CORE MACHINE LEARNING

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes

Antonio Orvieto, Lin Xiao

July 08, 2024

June 17, 2024

HUMAN & MACHINE INTELLIGENCE

COMPUTER VISION

D-Flow: Differentiating through Flows for Controlled Generation

Heli Ben-Hamu, Omri Puny, Itai Gat, Brian Karrer, Uriel Singer, Yaron Lipman

June 17, 2024

June 17, 2024

COMPUTER VISION

CORE MACHINE LEARNING

Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models

Neta Shaul, Uriel Singer, Ricky Chen, Matt Le, Ali Thabet, Albert Pumarola, Yaron Lipman

June 17, 2024

June 14, 2024

CORE MACHINE LEARNING

Differentially Private Representation Learning via Image Captioning

Tom Sander, Yaodong Yu, Maziar Sanjabi, Alain Durmus, Yi Ma, Kamalika Chaudhuri, Chuan Guo

June 14, 2024

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.