Resources

Datasets

Large-scale datasets and benchmarks for training, evaluating, and testing models to measure and advance AI progress.

Featured Dataset

SA-V Dataset

SA-V is a dataset designed for training general-purpose object segmentation models from open world videos. The dataset was introduced in our paper “Segment Anything 2”.

Datasets

FACET Dataset

FACET is a comprehensive benchmark dataset for evaluating the robustness and algorithmic fairness of AI and machine learning vision models across protected groups.

EgoTV Dataset

A benchmark and dataset for the systematic investigation of vision-language models on compositional, causal (e.g., effects of actions), and temporal (e.g., action ordering) reasoning in egocentric settings.

MMCSG Dataset

The MMCSG (Multi-Modal Conversations in Smart Glasses) dataset comprises two-sided conversations recorded using Aria glasses, featuring multi-modal data such as multi-channel audio, video, accelerometer, and gyroscope measurements.

Speech Fairness Dataset

For evaluating the fairness of speech recognition models across a diverse set of demographic groups.

Casual Conversations V2

For evaluating computer vision, audio, and speech models for accuracy across a diverse set of ages, genders, languages/dialects, geographies, disabilities, and more.

Casual Conversations

For evaluating computer vision and audio models for accuracy across a diverse set of ages, genders, apparent skin tones, and ambient lighting conditions.

Common Objects in 3D (CO3D)

For learning category-specific 3D reconstruction and new-view synthesis using multi-view images of common object categories.

Segment Anything

Designed for training general-purpose object segmentation models from open world images.

DISC21 Dataset

Helps researchers evaluate the accuracy of their image copy detection models.

EgoObjects Dataset

A large-scale egocentric dataset designed to advance object detection and fine-grained object understanding in first-person video.

FLoRes Benchmarking Dataset

Used for machine translation between English and low-resource languages.

Ego4D

Ego4D is a collaborative project seeking to advance the fundamental AI research needed for multimodal machine perception in first-person video understanding.