AUGUST 31, 2023

FACET Dataset

FACET is a comprehensive benchmark dataset designed for measuring or evaluating the robustness and algorithmic fairness of AI and machine-learning vision models for protected groups.

The dataset was introduced in our paper “FACET: Fairness in Computer Vision Evaluation Benchmark”.


FACET consists of 32K diverse, high-resolution, privacy-protecting images labeled across 13 person attributes and 52 person classes.

We partnered with expert annotators to label person-related attributes (e.g., perceived gender presentation, perceived skin tone, hairstyle) and person-related classes (e.g., basketball player, doctor) for the 32K images. The FACET dataset is intended to be used for computer vision research for the purposes permitted under our Data License.

The images are licensed from a large photo company. The annotations were collected by expert annotators. Please refer to the paper for more details on how annotations were collected.


Key Application

Computer Vision, Fairness and Robustness

Intended Use Cases
  • Research purposes only

  • Measure or evaluate the robustness and algorithmic fairness of AI and machine-learning vision models for protected groups.
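
One way to carry out such a measurement is to compare a model's recall for a single person class across the groups of one perceived attribute. The sketch below is a minimal illustration under assumed inputs: the record fields (`group`, `detected`), the helper names, and the toy data are hypothetical and are not part of the FACET release. Refer to the paper for the metrics actually used there.

```python
from collections import defaultdict


def recall_by_group(records):
    """Return {group: recall}, where recall = detected people / all people in the group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        totals[rec["group"]] += 1
        if rec["detected"]:
            hits[rec["group"]] += 1
    return {group: hits[group] / totals[group] for group in totals}


def max_recall_gap(recalls):
    """Largest difference in recall between any two groups."""
    return max(recalls.values()) - min(recalls.values())


# Toy records (hypothetical): one entry per annotated person of a given class.
# "group" stands for a value of a perceived attribute; "detected" says whether
# the model under evaluation found that person.
records = [
    {"group": "A", "detected": True},
    {"group": "A", "detected": True},
    {"group": "A", "detected": False},
    {"group": "A", "detected": True},
    {"group": "B", "detected": True},
    {"group": "B", "detected": False},
    {"group": "B", "detected": False},
    {"group": "B", "detected": True},
]

recalls = recall_by_group(records)
gap = max_recall_gap(recalls)
print(recalls, gap)
```

A gap near zero suggests the model performs similarly across the groups for that class; a large gap flags a disparity worth investigating. Groups with few examples produce noisy recall estimates, so the per-group counts should be reported alongside the gap.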

Primary Data Type


Data Function

Measuring or Evaluating Fairness, model evaluation

Dataset Characteristics

Total number of images: 32K

Total number of subjects: 50K

Average image resolution: 1500×2000 pixels


Size: 13 person attributes and 52 person classes

Evaluation Annotations

  • Bounding boxes around each person : 50K

  • 52 person-related classes : 50K

  • Person, clothing, hair labels for SA-1B masks : 69K

Protected Groups

  • Perceived skin tone : 50K

  • Perceived age group : 50K

  • Perceived gender presentation : 50K

Additional person attributes

  • Hair: color, type, facial hair : 50K

  • Accessories: headscarf, facemask, hat : 50K

  • Other: tattoo : 50K

Miscellaneous attributes

  • Lighting condition, level of occlusion : 50K
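
To make the annotation layout above concrete, the following sketch models one annotated person as a record and tallies how many people fall into each value of an attribute. Every field name, attribute key, and attribute value here is an illustrative assumption; the actual file format and label vocabulary are defined by the dataset release, not by this example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PersonAnnotation:
    """One annotated person (hypothetical schema)."""
    image_id: str
    bbox: tuple  # assumed layout: (x, y, width, height) in pixels
    person_class: str  # one of the 52 person-related classes
    attributes: dict = field(default_factory=dict)  # attribute name -> label


def count_by_attribute(people, name):
    """Count people per value of one attribute; missing labels count as 'unknown'."""
    return Counter(p.attributes.get(name, "unknown") for p in people)


# Toy annotations with placeholder attribute values.
people = [
    PersonAnnotation("img_1", (10, 20, 100, 200), "doctor",
                     {"perceived_age_group": "group_1"}),
    PersonAnnotation("img_1", (150, 30, 90, 180), "doctor",
                     {"perceived_age_group": "group_2"}),
    PersonAnnotation("img_2", (5, 5, 60, 120), "basketball player",
                     {"perceived_age_group": "group_1"}),
    PersonAnnotation("img_3", (0, 0, 80, 160), "basketball player"),
]

counts = count_by_attribute(people, "perceived_age_group")
print(counts)
```

Tallies like this are a useful first check before any fairness evaluation, since they show how many examples back each per-group estimate.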

Nature Of Content

The images are a subset of SA-1B. The annotations are human labels for pre-defined categories.


Data License

Limited; see full license language for use

Summary of license permissions

The rights to the licensed images and annotations are granted for evaluation purposes only, meaning for the purposes of measuring or evaluating the robustness and algorithmic fairness of AI and machine-learning vision models, and solely on a non-commercial and research basis.

Access Cost

Open access

Data Collection

Data sources

The images are a subset of SA-1B. The annotations were collected by expert annotators from a third-party vendor.

Data selection

Images were selected based on their content.

Sampling Methods

Images were sampled from SA-1B based on their content.

Geographic distribution

Exact geographic distribution unknown.

Approximate geographic distribution:

Labeling Methods

Human Labels. More details in the FACET paper and Data Card.

Label types

Human labels: pre-defined text labels

Labeling procedure - Human

We collected annotations from expert human annotators. See the CrowdWorkSheets in the FACET paper for more information about the annotation collection process.

Validation Methods

Human validated

Validator description(s)

Human validated

Validation tasks

Human validators filter images for annotation

Human validators verify labels

Validation policy summary

Labels were verified by human validators

Validators flagged objectionable content

Please email to report any issues.