Datasets

Large-scale datasets and benchmarks for training, evaluating, and testing models to measure and advance AI progress.

Featured Dataset

FACET Dataset

FACET is a comprehensive benchmark dataset for evaluating the robustness and algorithmic fairness of AI and machine-learning vision models across protected groups.

Datasets

Speech Fairness Dataset

Segment Anything

Designed for training general-purpose object segmentation models from open-world images.

Casual Conversations V2

For evaluating computer vision, audio, and speech models for accuracy across a diverse set of ages, genders, languages/dialects, geographies, disabilities, and more.

Casual Conversations

For evaluating computer vision and audio models for accuracy across a diverse set of ages, genders, apparent skin tones, and ambient lighting conditions.

Common Objects in 3D (CO3D)

For learning category-specific 3D reconstruction and new-view synthesis using multi-view images of common object categories.

Deepfake Detection Challenge

Measures progress on deepfake detection technology.

DISC21 Dataset

Helps researchers evaluate the accuracy of their image copy detection models.

EgoObjects Dataset

A project that seeks to advance the fundamental AI research needed for multimodal machine perception for first-person video understanding.

FLoRes Benchmarking Dataset

Used for machine translation between English and low-resource languages.

Ego4D

Ego4D is a collaborative project that seeks to advance the fundamental AI research needed for multimodal machine perception for first-person video understanding.