Learn about the research behind the most advanced media foundation AI models and see how Meta is partnering with creators and directors to enable immersive storytelling.
Movie Gen sets a new standard for immersive AI content
Our latest research breakthroughs demonstrate how you can use simple text inputs to produce custom videos and sounds, edit existing videos or transform your personal image into a unique video.
Text input summary: A girl is running across a beach and holding a kite. She's wearing jean shorts and a yellow t-shirt. The sun is shining down.
Text input summary: A woman is sitting on the grass of a pumpkin patch. She is wearing a scarf and holding a cup. The background is filled with rows of pumpkins.
Text input: Thunder cracks loudly, with an orchestral music track.
Text input: Transform the lantern into a bubble that soars into the air.
Generate videos from text
Movie Gen uses text input to create long high-definition videos at different aspect ratios—the first of its kind in the industry.
Text input summary: A sloth with pink sunglasses lies on a donut float in a pool. The sloth is holding a tropical drink. The world is tropical. The sunlight casts a shadow.
Text input summary: The camera is behind a man. The man is shirtless, wearing a green cloth around his waist. He is barefoot. With a fiery object in each hand, he creates wide circular motions. A calm sea is in the background. The atmosphere is mesmerizing, with the fire dance.
Text input summary: A fluffy koala bear surfs. It has a grey and white coat and a round nose. The surfboard is yellow. The koala bear is holding onto the surfboard with its paws. The koala bear’s facial expression is focused. The sun is shining.
Text input summary: A ghost in a white bedsheet faces a mirror. The ghost's reflection can be seen in the mirror. The ghost is in a dusty attic, filled with old beams, cloth-covered furniture. The attic is reflected in the mirror. The light is cool and natural. The ghost dances in front of the mirror.
Text input summary: A red-faced monkey with white fur is bathing in a natural hot spring. The monkey is playing in the water with a miniature sail ship in front of it, made of wood with a white sail and a small rudder. The hot spring is surrounded by lush greenery, with rocks and trees.
Edit video with text
Movie Gen transforms existing videos with text inputs, enabling precise video editing for styles, transitions, fine-grained edits and more.
Produce personalized videos
When you upload a photo of yourself and give simple text inputs, Movie Gen can create personalized videos that preserve human identity and motion.
Text input summary: A man is doing a scientific experiment in a lab with rainbow wallpaper. The man has a serious expression and is wearing glasses. He is wearing a white lab coat with a pen in the pocket. The man pours liquid into a glass beaker and a cloud of white smoke blooms.
Text input summary: A woman paints a canvas on an easel, in a wood-paneled room. The woman is wearing a white shirt. She has a calm expression as she concentrates on her work. A baby bear cub stands at her feet. The lighting is cool.
Text input summary: Make a cute selfie video of a man and his dog. The man is wearing a black shirt. The dog is a beagle puppy. The background is a backyard patio, filled with trees. The man has a big smile on his face, as he tries to take the perfect selfie with his dog. The lighting is warm.
Text input summary: A man sits in the desert, wearing a wide-brimmed hat, a brown coat, and a scarf. The man holds a glass of amber-colored tea. The camera pans from the desert scenery to the person. The lighting is warm, with the sun casting a gentle glow on the scene.
Text input summary: A cowgirl wearing denim pants is on a white horse in an old western town. A leather belt cinches at her waist. The horse is majestic, with its coat gleaming in the sunlight. The Rocky Mountains are in the background.
Text input summary: A woman DJ spins records on a rooftop in LA. She is wearing a pink jacket and giant headphones. There is a cheetah next to the woman. The background is a cityscape.
Create sound effects and soundtracks
Movie Gen allows you to use text inputs to create and extend sound effects, background music or entire soundtracks that reflect the tone, rhythm and style of your video.
Text input: Rain pours against the cliff and the person, with music playing in the background.
Text input: Rustling leaves and snapping twigs, with an orchestral music track.
Text input: ATV engine roars and accelerates, with guitar music.
Text input: Wheels spinning, and a slamming sound as the skateboard lands on concrete.
Text input: A beautiful orchestral piece that evokes a sense of wonder.
Text input: Whistling sounds, followed by a sharp explosion and loud crackling.
CREATIVE INDUSTRY FEEDBACK PROGRAM
Partnering with award-winning storytellers
Blumhouse, an award-winning production company, selected filmmakers to create videos with Movie Gen before its public debut. Watch this video to learn about their experience and what opportunities our AI media foundation models will offer to the creative community.
Immersive storytelling with Movie Gen
Blumhouse’s filmmaker partners were asked to try our suite of AI media tools to make something they found interesting or useful. Check out this video from director Aneesh Chaganty titled "i h8 ai."
@paigepiskin
Text input summary #1: A close up of a girl's hands holding a small fluffy kitten-face tarantula.
Text input summary #2: Change the dog into a grey baby dragon with grey wings and orange eyes.
@ka5sh
Text input summary #1: A green cartoon alien is waving with his hand and wears pink clown shoes.
Text input summary #2: Turn person into green alien with a red bucket hat.
@girls
Text input summary #1: A smiling girl walks down a path of autumn trees with a basket in her hand.
Text input summary #2: Two women are sipping coffee, as Halloween decorations hang on the wall.
@memezar
Text input summary: A cute baby hippo and a muscular gorilla in a boxing match. The hippo wears a red uniform and the gorilla wears a blue one. The match has a cheering crowd in the arena.
@ravivora
Text input #1: Add thick fog to the foreground.
Text input summary #2: A young woman with long hair swims up to the surface, as jellyfish surround her.
Read our latest research paper to learn how we’ve set new industry benchmarks on media generation with AI.