FACET is a comprehensive benchmark dataset designed for measuring or evaluating the robustness and algorithmic fairness of AI and machine-learning vision models for protected groups.
The dataset was introduced in our paper “FACET: Fairness in Computer Vision Evaluation Benchmark”.
FACET consists of 32K diverse, high-resolution, privacy-protecting images labeled across 13 person attributes and 52 person classes.
We partnered with expert annotators to label person-related attributes (e.g., perceived gender presentation, perceived skin tone, hairstyle) and person-related classes (e.g., basketball player, doctor) for the 32K images. The FACET dataset is intended to be used for computer vision research for the purposes permitted under our Data License.
The images are licensed from a large photo company. The annotations were collected by expert annotators. Please refer to the paper for more details on how annotations were collected.
Computer Vision, Fairness and Robustness
Research purposes only
Measure or evaluate the robustness and algorithmic fairness of AI and machine-learning vision models for protected groups.
Measuring or Evaluating Fairness, model evaluation
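As a rough illustration of the intended use stated above, the following minimal sketch computes per-group recall and the largest recall gap between groups. It is not the official FACET evaluation code or file format: the record fields (`group`, `is_positive`, `predicted_positive`), the function names, and the toy records are all hypothetical, and recall gap is only one of several possible disparity measures.

```python
from collections import defaultdict


def recall_by_group(records, group_key):
    """Return {group: recall}, counting only records whose ground truth is positive."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        if not rec["is_positive"]:
            continue  # recall ignores ground-truth negatives
        group = rec[group_key]
        totals[group] += 1
        hits[group] += int(rec["predicted_positive"])
    return {group: hits[group] / totals[group] for group in totals}


def max_recall_gap(recalls):
    """Largest difference in recall between any two groups."""
    values = list(recalls.values())
    return max(values) - min(values) if values else 0.0


# Hypothetical toy records; real annotations and model outputs would replace these.
records = [
    {"group": "A", "is_positive": True, "predicted_positive": True},
    {"group": "A", "is_positive": True, "predicted_positive": True},
    {"group": "A", "is_positive": True, "predicted_positive": False},
    {"group": "A", "is_positive": True, "predicted_positive": True},
    {"group": "B", "is_positive": True, "predicted_positive": True},
    {"group": "B", "is_positive": True, "predicted_positive": False},
    {"group": "B", "is_positive": False, "predicted_positive": True},
]

recalls = recall_by_group(records, "group")
gap = max_recall_gap(recalls)
print(recalls, gap)
```

In practice the `group` value would come from one of the annotated attributes (for example, perceived age group), and the predictions would come from the vision model under evaluation.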
Total number of images: 32K
Total number of subjects: 50K
Average image resolution: 1500×2000 pixels
Size: 13 person attributes and 52 person classes
Bounding boxes around each person: 50K
52 person-related classes: 50K
Person, clothing, hair labels for SA-1B masks: 69K
Perceived skin tone: 50K
Perceived age group: 50K
Perceived gender presentation: 50K
Additional person attributes
Hair: color, type, facial hair: 50K
Accessories: headscarf, facemask, hat: 50K
Other: tattoo: 50K
Lighting condition, level of occlusion: 50K
The images are a subset of SA-1B. The annotations are human labels for predefined categories.
Limited; see full license language for use
The rights to the licensed images and annotations are granted for evaluation purposes only, meaning for the purposes of measuring or evaluating the robustness and algorithmic fairness of AI and machine-learning vision models, and solely on a non-commercial and research basis.
The images are a subset of SA-1B. The annotations were collected by expert annotators from a third-party vendor.
Images were selected based on their content.
Images were sampled from SA-1B based on their content.
Exact geographic distribution unknown.
Approximate geographic distribution:
Human labels: predefined text labels
Labeling procedure - Human
We collected annotations from expert human annotators. See the CrowdWorkSheets in the FACET paper for more information about the annotation collection process.
Human validators filter images for annotation
Human validators verify labels
Labels were verified by human validators
Validators flagged objectionable content
Please email email@example.com to report any issues.