GRAPHICS

COMPUTER VISION

Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation

August 10, 2023

Abstract

Text-guided human motion generation has drawn significant interest because of its applications in animation and robotics. Recently, applying diffusion models to motion generation has improved the quality of generated motions. However, existing approaches are limited by their reliance on relatively small-scale motion capture data, leading to poor performance on more diverse, in-the-wild prompts. In this paper, we introduce Make-An-Animation, a text-conditioned human motion generation model that learns more diverse poses and prompts from large-scale image-text datasets, enabling significant improvement in performance over prior work. Make-An-Animation is trained in two stages. First, we train on a curated large-scale dataset of (text, static pseudo-pose) pairs extracted from image-text datasets. Second, we fine-tune on motion capture data, adding layers to model the temporal dimension. Unlike prior diffusion models for motion generation, Make-An-Animation uses a U-Net architecture similar to recent text-to-video generation models. Human evaluation of motion realism and alignment with input text shows that our model reaches state-of-the-art performance on text-to-motion generation.
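
The two-stage recipe described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. The names `stage1_model`, `stage2_model`, and `make_temporal_layer` are hypothetical, the per-frame "model" is a toy stand-in, and the residual, zero-initialized temporal layer is an assumption borrowed from common practice in text-to-video models, not something the abstract states. The sketch only shows how a temporal layer added in the second stage can start out as an identity map, so that fine-tuning begins from the behavior of the model pretrained on static poses.

```python
from typing import Callable, List

Pose = List[float]    # one frame: a flat vector of pose parameters
Motion = List[Pose]   # a sequence of frames


def make_temporal_layer(kernel: List[float]) -> Callable[[Motion], Motion]:
    """Residual 1D convolution over the time axis (hypothetical layer).

    Output frame t = input frame t + sum_k kernel[k] * input frame (t + k - radius),
    with zero padding at the sequence boundaries. An all-zero kernel makes the
    layer an identity map (assumed initialization, not taken from the paper).
    """
    radius = len(kernel) // 2

    def layer(motion: Motion) -> Motion:
        out = []
        for t, frame in enumerate(motion):
            mixed = list(frame)  # residual path keeps the per-frame result
            for k, weight in enumerate(kernel):
                src = t + k - radius
                if 0 <= src < len(motion):
                    for d in range(len(frame)):
                        mixed[d] += weight * motion[src][d]
            out.append(mixed)
        return out

    return layer


def stage1_model(frame: Pose) -> Pose:
    """Toy stand-in for per-frame layers pretrained on (text, pseudo-pose) pairs."""
    return [2.0 * x for x in frame]


def stage2_model(motion: Motion, temporal: Callable[[Motion], Motion]) -> Motion:
    """Apply the pretrained per-frame layers, then the newly added temporal layer."""
    return temporal([stage1_model(frame) for frame in motion])


motion = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
identity_temporal = make_temporal_layer([0.0, 0.0, 0.0])     # start of fine-tuning
smoothing_temporal = make_temporal_layer([0.25, 0.0, 0.25])  # after some training
```

With the all-zero kernel, `stage2_model` returns exactly the per-frame outputs of `stage1_model`, and a static pose is handled as a motion of length one. Once the kernel is non-zero, each output frame also depends on its neighbors in time.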

AUTHORS

Written by

Samaneh Azadi

Akbar Shah

Devi Parikh

Sonal Gupta

Thomas Hayes

Publisher

ICCV

Research Topics

Graphics

Computer Vision
