Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open world images. The dataset was introduced in our paper “Segment Anything”.
SA-1B consists of 11M diverse, high-resolution, privacy-protecting images and 1.1B high-quality segmentation masks that were collected with our data engine. It is intended for computer vision research, for the purposes permitted under our Data License.
The images are licensed from a large photo company. All 1.1B masks were produced using our data engine and were generated fully automatically by the Segment Anything Model (SAM). Please refer to the paper for more details on the mask generation process.
Computer Vision, Segmentation
Research purposes only
Train and evaluate generic object segmentation models
Allow access to a privacy-protecting and copyright-friendly large-scale image dataset
Images, Mask annotations
Training, testing
Total number of images: 11M
Total number of masks: 1.1B
Average masks per image: 100
Average image resolution: 1500×2250 pixels
NOTE: There are no class labels for the images or mask annotations.
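As a quick consistency check on the figures above, the short sketch below recomputes the average masks per image and the pixel count of an average-sized image. It assumes the rounded totals quoted here (11M images, 1.1B masks, 1500×2250 pixels), so the results are approximate; the constant names are illustrative and not part of the dataset.

```python
# Rounded totals quoted in this dataset card (assumed, not exact counts).
TOTAL_IMAGES = 11_000_000
TOTAL_MASKS = 1_100_000_000
AVG_RESOLUTION = (1500, 2250)  # pixels; the card does not say which side is width

# Average number of masks per image implied by the two totals.
avg_masks_per_image = TOTAL_MASKS / TOTAL_IMAGES

# Pixel count of an average-sized image, also expressed in megapixels.
avg_pixels = AVG_RESOLUTION[0] * AVG_RESOLUTION[1]
avg_megapixels = avg_pixels / 1_000_000

print(f"Average masks per image: {avg_masks_per_image:.0f}")
print(f"Average pixels per image: {avg_pixels:,} ({avg_megapixels:.3f} MP)")
```

The quotient agrees with the stated average of 100 masks per image.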
Labels
Class agnostic mask annotations
The underlying images are licensed from a large photo company. The images vary in subject matter. Common themes include locations, objects, and scenes. Masks range from large-scale objects such as buildings to fine-grained details such as door handles.
Faces and license plates de-identified
Limited; see full license language for use
Open access
Data sources
Images licensed from a photo company.
Masks generated by the Segment Anything Model (SAM).
Data selection
Images were selected based on their content. The images are photos taken with a camera, i.e., not artwork.
Unsampled
Geographic distribution
Automatically generated masks (more details in the Segment Anything paper)
Label types
Masks are provided in the COCO run-length encoding (RLE) annotation format
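To illustrate how COCO run-length encoding represents a binary mask, here is a minimal sketch of a decoder written in plain Python. It assumes the uncompressed variant of the format, in which `counts` is a list of integers giving alternating runs of 0s and 1s (starting with 0s) in column-major order and `size` is `[height, width]`. If the released annotation files store `counts` as a compressed string, a COCO decoder such as `pycocotools.mask.decode` would be needed instead. The function names `decode_uncompressed_rle` and `mask_area` are hypothetical helpers, not part of any released tooling.

```python
from typing import Dict, List


def decode_uncompressed_rle(rle: Dict) -> List[List[int]]:
    """Decode an uncompressed COCO RLE dict into a height x width grid of 0/1."""
    height, width = rle["size"]  # COCO stores size as [height, width]
    counts = rle["counts"]
    if isinstance(counts, (str, bytes)):
        # Compressed counts are out of scope for this sketch.
        raise TypeError("compressed RLE counts need a COCO decoder such as pycocotools")
    if sum(counts) != height * width:
        raise ValueError("run lengths do not cover the full image")

    # Expand the runs: they alternate between 0 and 1, starting with 0.
    flat: List[int] = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value

    # Runs are laid out column by column, so pixel (row, col) sits at col * height + row.
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


def mask_area(mask: List[List[int]]) -> int:
    """Count the foreground pixels in a decoded mask."""
    return sum(sum(row) for row in mask)


# Toy example: a 3 x 4 mask, not taken from the dataset.
example = {"size": [3, 4], "counts": [2, 3, 4, 1, 2]}
decoded = decode_uncompressed_rle(example)
for row in decoded:
    print(row)
print("area:", mask_area(decoded))
```

Because the masks are class agnostic, a decoded mask carries only the pixel region and no category label.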
Labeling procedure - Automatic
The final mask annotations we are releasing were generated automatically. To train the model used for automatic annotation, we first collected mask annotations from expert human annotators using an interactive model-in-the-loop process. Please refer to our paper for more details.
A random sample of mask annotations was reviewed and validated by human annotators.
Please email segment-anything@meta.com or report any issues via the feedback form on our website, segment-anything.com.