META FAIR RESEARCH
We are introducing a family of audiovisual behavioral motion models, trained on the Seamless Interaction Dataset and compatible with both 2D and 3D renderings.
RESEARCH OVERVIEW
Advancing AI research on modeling face-to-face dynamics, including expressive gestures, active listening, turn-taking, and visual synchrony.
RESEARCH CAPABILITY
Our generative motion models, trained on dual-sided conversations, can generate synchronous reactions, body gestures, and facial expressions, as seen in the examples below.
Research model output: Our motion model generates facial expressions and body gestures that match the flow of conversation. Watch as the AI-generated individual on the right uses hand gestures in sync with their words, such as when saying "chill".
Research model output: Observe how the AI-generated individual on the right raises their hands and looks up to emphasize an expressive point.
Research model output: The AI-generated individual on the left side actively listens, nodding and maintaining eye contact while backchanneling.
Research model output: Our motion model captures the dynamic interplay of gestures and facial expressions that unfold throughout a conversation.
Our dyadic models react to visual inputs and offer controllability over facial expressiveness.
Controllability
Research model output: Here are two versions of the same avatar; the individual on the right exhibits greater expressiveness than the one on the left.
Research model output: Notice how the more expressive avatar smiles more broadly and nods its head more actively.
Visual input
Research model output: The responder reacts to the initiator's playful wink.
Research model output: The responder reacts to the initiator's expression of surprise.
RENDERING COMPATIBILITY
The outputs of our dyadic motion models are compatible with 2D and 3D renderings.
Visual rendering in 3D
Visual rendering in 2D photorealistic style
Visual rendering in 3D
Visual rendering in 3D
Visual rendering in 2D photorealistic style
DATASET
Seamless Interaction Dataset
The Seamless Interaction Dataset comprises over 4,000 hours of full-body, in-person, human face-to-face interaction videos. All our dyadic motion models were trained using this dataset.
4,000+ Human participants
4,000+ participants, featuring naturalistic conversations between familiar pairs and professional actors.
65k+ Interactions
65,000+ individual interactions, ranging from casual to intense moments.
4,000+ Hours
4,000+ hours of dyadic conversations, highlighting the breadth of conversational dynamics.
5000+ Annotated samples
5,000+ detailed annotations capturing self-described internal emotional states and visual behaviors.
1300+ Unique prompts
1,300+ unique interaction scenarios based on established psychological theory.
4K Video recordings
Videos recorded in 4K resolution.
We explore dyadic motion modeling and its potential to transform the way we interact with AI systems, enabling more nuanced, expressive, and human-like interactions.