April 13, 2023
From a young age, people express themselves and their creativity through drawing. We created an AI research demo that makes it easy to bring artwork to life through animation, and we are now releasing the animation code along with a novel dataset of nearly 180,000 annotated amateur drawings to help other AI researchers and creators innovate further. To our knowledge, this is the first annotated dataset to feature this kind of artwork.
Drawing is a near-universal way for people to capture a character, scene, or idea quickly. But while the content or meaning of a drawing is often clear to other human observers, an abstract or non-realistic appearance can make a drawing incomprehensible to AI models trained on images of real-life objects. To teach AI to recognize all the different ways someone might draw a humanlike figure would require a large dataset of sketches from budding artists. With the new dataset we are sharing today (described in detail in this research paper), researchers and practitioners can build tools to more easily and accurately analyze the contents of amateur drawings. And this can unlock new digital-physical hybrid experiences, such as new forms of storytelling and greater accessibility in art.
When we released our Animated Drawings Demo in late 2021, we invited people to opt in to contribute to a dataset of in-the-wild amateur drawings. The browser-based demo allowed people to upload images, verify or fix a few annotation predictions, and receive a short animation of the humanlike character in their drawing. More than 3.2 million people from around the world visited the site, including many who posted about their creations on social media. In total, 6.7 million images were uploaded to the demo. The drawings were created, photographed, and shared with Meta by participants in a de-identified manner. Human reviewers then filtered a subset of images that people had chosen to share with our research team.
Prior to releasing the Amateur Drawings Dataset, we performed several levels of filtration to ensure a high level of quality and implemented privacy safeguards, which are described in detail in our research paper.
While our demo allows for only a limited set of movements, many users of the Animated Drawings Demo provided feedback requesting more features, such as multiple characters, additional actions, smiling, blinking, and gazing cues. The GIF with dancing figures (see above) is an example of expanding upon the open source code and dataset for other creative and educational purposes. With these resources, other researchers can add to our methods of analyzing and augmenting amateur drawings to expand upon the original demo features.
The range of figure drawings is as wide as any person’s imagination. How do you train a model to perform well in the presence of such variation? One way would be to train new models using annotated drawings. However, such drawings are difficult to find in the numbers needed to train a neural network. Another approach would be to create the drawings synthetically. This is problematic as well. Generative methods require a large set of sample data to learn from, and style transfer methods (e.g., creating a “colored pencil” rendering of a photograph) may not capture all the nuanced ways in which a drawing differs from a photo. In addition, creating data synthetically may not capture all the relevant sources of nuisance variation actually seen in in-the-wild photographs of amateur drawings, such as paper creases, erased lines, light glare, and shadows.
We structured the task of generating an animation from a single drawing of a figure as a series of subtasks: human figure detection, segmentation, pose estimation, and animation.
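To make the decomposition concrete, the four subtasks can be chained as a simple pipeline. The sketch below is only an illustration of that structure: the stage functions are stand-ins, not the demo's actual models, and they return fixed dummy values where a trained model's predictions would go.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class FigureAnnotation:
    """Intermediate results accumulated across the pipeline stages."""
    bounding_box: Tuple[int, int, int, int] = (0, 0, 0, 0)  # x, y, w, h
    mask: List[List[int]] = field(default_factory=list)      # binary mask rows
    joints: Dict[str, Tuple[int, int]] = field(default_factory=dict)

def detect_figure(image) -> Tuple[int, int, int, int]:
    # Placeholder for the human figure detector; returns a fixed box here.
    return (10, 10, 80, 120)

def segment_figure(image, box) -> List[List[int]]:
    # Placeholder for segmentation: a box-sized mask of all ones.
    x, y, w, h = box
    return [[1] * w for _ in range(h)]

def estimate_pose(image, box) -> Dict[str, Tuple[int, int]]:
    # Placeholder for pose estimation: a tiny fixed skeleton.
    x, y, w, h = box
    return {"head": (x + w // 2, y + h // 10),
            "hip":  (x + w // 2, y + h // 2)}

def animate(annotation: FigureAnnotation, motion: str) -> List[str]:
    # Placeholder for retargeting a motion clip onto the skeleton.
    return [f"{motion}:frame{i}" for i in range(3)]

def run_pipeline(image, motion: str = "wave"):
    ann = FigureAnnotation()
    ann.bounding_box = detect_figure(image)
    ann.mask = segment_figure(image, ann.bounding_box)
    ann.joints = estimate_pose(image, ann.bounding_box)
    return ann, animate(ann, motion)
```

Each stage only consumes the outputs of earlier stages, which is also what lets the demo pause between stages so users can correct the predictions before animation.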
After someone uses our demo to upload a drawing, they have the option to adjust the detected bounding box, segmentation mask, and joint locations, and choose an action to animate.
Our system incorporates repurposed computer vision models trained on photographs of real-world objects. Because the domain of drawings, including that of children, is significantly different in appearance, we fine-tune the models using the Amateur Drawings Dataset.
With this dataset and animation code, we believe that the domain of amateur drawings can inspire a new generation of creators with its expressive and accessible possibilities. We hope these resources will be an asset to other researchers interested in exploring potential applications for their work.
For those in the AI community targeting any tool or algorithm that uses pen-and-paper drawings, this dataset is distinctive for its size and in-the-wild nature: it reflects real-world conditions (e.g., blurriness, hard shadows, crinkled surfaces, and background elements) that aren’t present in digital drawings and high-resolution scans. In addition to the images, the dataset includes annotations of bounding boxes, segmentation masks, and joint locations — features that could provide more ways for models to identify or animate drawn figures.
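To illustrate how those per-image annotations fit together, here is a small loader for one record. The field names and JSON layout below are hypothetical, chosen for illustration only; consult the released dataset's documentation for the actual schema.

```python
import json

def load_annotation(record_json: str) -> dict:
    """Parse one annotation record with a bounding box, mask reference,
    and named joint locations. The schema is illustrative, not the
    dataset's real format."""
    record = json.loads(record_json)
    x, y, w, h = record["bounding_box"]
    if w <= 0 or h <= 0:
        raise ValueError("bounding box must have positive width and height")
    # Joint annotations map a joint name to an (x, y) pixel location.
    joints = {name: tuple(xy) for name, xy in record["joints"].items()}
    return {
        "image_file": record["image_file"],
        "bounding_box": (x, y, w, h),
        "mask_file": record["mask_file"],
        "joints": joints,
    }

# Example record with made-up file names and coordinates.
example = json.dumps({
    "image_file": "drawing_0001.png",
    "bounding_box": [12, 30, 200, 310],
    "mask_file": "drawing_0001_mask.png",
    "joints": {"head": [110, 60], "left_hand": [20, 180]},
})
ann = load_annotation(example)
```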
Here’s how we built the dataset. As part of the demo, people had the option to let us retain their uploaded image and annotations to be included in our ongoing research. As researchers, we respect the right of individuals to be cautious about sharing their data, and we wanted people to be able to animate their drawings either way. The data collection process was also designed with safety in mind. In doing so, we aimed to reduce the potential for misuse of the data as much as possible.
We also filtered the submitted images to ensure that they showed amateur drawings and met our standards for collecting research data responsibly. We performed this refinement in two steps. First, we used a self-supervised clustering approach to identify and filter out-of-domain images, such as photographs of real people. Second, a contracted agency manually reviewed the remaining images to ensure that they met our standards. Reviewers were instructed to check that images were freehand drawings on paper, with at least one full-body humanlike figure. They also checked to make sure images did not contain characters that were protected intellectual property or any private or vulgar content. Because the reviewers were primarily English speakers, images that contained non-English words were excluded on the basis that they might contain inappropriate content.
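The paper describes the self-supervised clustering step in detail; as a schematic analogue only, the first filtering stage can be thought of as keeping images whose feature embeddings lie close to clusters labeled as in-domain. The snippet below sketches that idea with cosine similarity to cluster centroids, using made-up toy embeddings rather than real image features.

```python
import numpy as np

def filter_in_domain(embeddings: np.ndarray,
                     centroids: np.ndarray,
                     threshold: float = 0.8) -> np.ndarray:
    """Keep images whose embedding is close to any in-domain centroid.

    embeddings: (N, D) image feature vectors
    centroids:  (K, D) centroids of clusters labeled as amateur drawings
    Returns a boolean mask of length N (True = keep for human review).
    """
    # L2-normalize so the dot product equals cosine similarity.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cen = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sims = emb @ cen.T  # (N, K) cosine similarities
    return sims.max(axis=1) >= threshold

# Toy example: two "drawing-like" vectors and one outlier (e.g., a photo).
embeddings = np.array([[1.0, 0.1], [0.9, 0.2], [-1.0, 0.0]])
centroids = np.array([[1.0, 0.0]])
keep = filter_in_domain(embeddings, centroids)
```

In this two-stage design, the automated filter only narrows the pool; the final in/out decision for every retained image still rests with human reviewers.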
In keeping with our approach to open science, we are sharing the animation pipeline code and this dataset in the hope that they will be of interest to other practitioners – both AI researchers and members of the broader research community.
Drawing is a natural and expressive modality that is accessible to most of the world’s population. We hope our work will make it easier for other researchers to explore tools and techniques specifically tailored to using AI to complement human creativity.
We’d like to thank the FAIR Interfaces team for their assistance in creating the original demo.