Open Source
Introducing Nymeria, a dataset for improving human motion prediction for AR and VR devices
December 20, 2024

At ECCV 2024, Reality Labs Research publicly released the Nymeria dataset. The dataset provides egocentric human motion in the wild at an unprecedented scale, capturing a broad spectrum of people engaging in everyday activities across varied locations. Today, we’re shining a light on this work and its potential implications for future wearables like VR/MR headsets, AI and AR glasses, and smart watches. The Nymeria dataset is available for download at projectaria.com/datasets/nymeria.

Nymeria: A human motion dataset grounded in everyday life

There’s a magical experience the first time you use a virtual or mixed reality headset. With six degrees of freedom, you can move freely through immersive environments, while motion-tracked controllers or hand tracking let you interact with and manipulate digital objects. However, the magic can be interrupted when your avatar doesn’t match your physical movements.

As wearable technology like AI glasses and smart watches becomes more popular, new opportunities emerge to predict human body movement more accurately—which could result in tangible benefits for end user experiences. For instance, athletes could use this technology to track their workouts over time, people could better monitor their posture, and workers could identify and correct ergonomic issues.

Predicting the position of the human body from egocentric sensors (like those found in VR and MR devices) remains a technical challenge. That’s because human movement is complex, body types vary, and the current generation of devices is limited in its ability to fully capture the wearer’s body. While advances in sensors and analytical techniques are promising for improving human body prediction, a significant hurdle remains: the lack of comprehensive research datasets.

This motivated Reality Labs Research to develop and release the Nymeria dataset—a step toward bridging that gap and accelerating research in egocentric human motion understanding, with 300 hours of multimodal egocentric daily motion captured in natural settings.

Building the largest multimodal egocentric human motion dataset

Unlike previously existing datasets for human motion modeling, the Nymeria dataset captures in-the-wild human motion with multiple multimodal egocentric devices: Project Aria glasses and miniAria wristbands. This constellation of multimodal sensors approximates the types of signals that future wearable devices, like AI glasses and smart watches, might utilize. In-the-wild motion capture empowers researchers to build next-generation technology to assist daily human activities.
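To make the idea of a time-aligned, multi-device recording concrete, here is a minimal sketch of how such data could be represented in Python. The class, field names, sensor set, and alignment helper are illustrative assumptions for this post, not the Nymeria dataset's actual schema or tooling.

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

# Illustrative container for one time-aligned sample from a multi-device
# recording (glasses plus two wristbands). Field names and shapes are
# assumptions for this sketch, not the Nymeria schema.
@dataclass
class EgocentricSample:
    timestamp_ns: int                      # shared device-time timestamp
    head_pose: np.ndarray                  # 4x4 SE(3) pose of the glasses
    head_imu: np.ndarray                   # (6,) accel + gyro from the glasses
    left_wrist_imu: Optional[np.ndarray]   # (6,) accel + gyro, if available
    right_wrist_imu: Optional[np.ndarray]  # (6,) accel + gyro, if available
    rgb_frame: Optional[np.ndarray]        # HxWx3 image, captured at a lower rate

def nearest_index(timestamps: np.ndarray, t: int) -> int:
    """Index of the measurement closest in time to t (naive alignment)."""
    return int(np.argmin(np.abs(timestamps - t)))
```

In practice the streams arrive at very different rates (IMU much faster than video), so a real loader would resample or interpolate signals onto a common timeline rather than simply picking the nearest timestamp as this sketch does.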

Representing the rich diversity of everyday life

To facilitate human motion modeling that works across a broad range of people and locations, volunteer research participants were recruited from diverse backgrounds and demographics. Each participant was instructed to perform a set of 20 scenarios, such as cooking dinner, playing sports, or hanging out with friends, in different indoor and outdoor environments. With predefined but unscripted scenarios, researchers can understand how different people perform the same activities, helping ensure that future methods for human motion understanding are accessible and available to everyone.

Enriching body motion with language to accelerate physical-world AI assistants

The Nymeria dataset is designed to bridge the gap between motion and natural language. The dataset includes in-context descriptions of human motion written by human annotators. By enriching the data with coarse-to-fine, multi-level narrations, researchers can model human motion, actions, and activities at different granularities and in context, explore advanced techniques with powerful LLMs like Llama, and build more user-friendly solutions.
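As a rough illustration of what coarse-to-fine narration can look like when attached to a motion sequence, the sketch below defines a small hypothetical structure with three levels of description. The level names, fields, and example text are assumptions made for this sketch and do not reproduce the dataset's actual annotation format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Narration:
    start_s: float   # segment start, in seconds
    end_s: float     # segment end, in seconds
    text: str        # free-form description written by a human annotator

@dataclass
class NarratedSequence:
    # Coarse-to-fine levels: a whole-activity summary, mid-level actions,
    # and fine-grained motion descriptions. Levels and names are illustrative.
    activity_summary: Narration
    actions: List[Narration] = field(default_factory=list)
    motion_descriptions: List[Narration] = field(default_factory=list)

example = NarratedSequence(
    activity_summary=Narration(0.0, 180.0, "The person prepares dinner in the kitchen."),
    actions=[Narration(12.0, 35.0, "Chops vegetables at the counter.")],
    motion_descriptions=[Narration(12.0, 14.5, "Reaches forward with the right hand to pick up a knife.")],
)
```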

While text-based AI assistants have already been shown to be valuable, a significant gap still exists in their ability to understand user context and respond appropriately. The Nymeria dataset represents a crucial step towards addressing this challenge, as it provides researchers with a rich source of data to explore the technical, privacy, and societal implications of developing such systems in a realistic and responsible manner.

Empowering research

As a case study, Reality Labs Research used the Nymeria dataset to develop novel ML models for egocentric motion understanding. Egocentric body motion provides rich context about the wearer, which can help future personalized AI assistants make contextually relevant suggestions as you go through your day. However, the camera arrays on today’s smart glasses are biased toward capturing the user’s field of view and are not positioned to easily capture the wearer’s own body, so estimating full-body motion from them is ill-posed in many scenarios. Leveraging the scale of the Nymeria dataset, researchers at Reality Labs developed HMD2—a method to track the wearer’s full-body motion from a single pair of Project Aria glasses. With a data-driven approach, they model the ambiguous motion states with probabilistic inference, collapsing the distribution whenever self-observations become available. The Nymeria dataset also facilitated EgoLM—a unified multimodal learning framework that models body motion and activity with natural language, where raw sensor measurements from smart glasses drive multiple tasks, from body tracking and motion synthesis to context understanding.
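To give a flavor of probabilistic inference that collapses once self-observations arrive, without claiming to reproduce HMD2, here is a minimal Gaussian-update sketch: a broad, head-motion-driven prior over a wrist position is narrowed whenever that wrist happens to be seen by the headset's cameras. All quantities, noise values, and the linear model are made up for the example.

```python
import numpy as np

def predict_prior(head_velocity: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical prior over a wrist position given only head motion.
    Returns a mean and a diagonal covariance; values are illustrative."""
    mean = 0.4 * head_velocity            # crude proxy: the wrist loosely follows the head
    cov = np.diag([0.05, 0.05, 0.05])     # broad uncertainty: body pose is ambiguous
    return mean, cov

def fuse_self_observation(mean, cov, obs, obs_cov):
    """Standard Gaussian (Kalman-style) update: the distribution collapses
    toward the observation when the wrist is actually seen by the cameras."""
    gain = cov @ np.linalg.inv(cov + obs_cov)
    new_mean = mean + gain @ (obs - mean)
    new_cov = (np.eye(3) - gain) @ cov
    return new_mean, new_cov

# Prior from head motion alone (ambiguous), then collapse with a self-observation.
mean, cov = predict_prior(np.array([0.1, 0.0, 0.02]))
wrist_seen_at = np.array([0.25, -0.1, 0.9])       # wrist detected in the camera view
mean, cov = fuse_self_observation(mean, cov, wrist_seen_at, np.diag([0.01, 0.01, 0.01]))
print(mean, np.diag(cov))                          # tighter estimate after the observation
```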

We believe the Nymeria dataset provides unique research opportunities in building next-generation AR/VR and contextual AI technology. By releasing the dataset for research, we hope to enable and inspire researchers to develop AI models with a strong ethical foundation, ultimately unlocking the full potential of AI systems to benefit society as a whole.

Learn more about Project Aria
Download the Nymeria dataset
