EgoMimic: Georgia Tech PhD student uses Project Aria Research Glasses to help train humanoid robots

February 19, 2025

Today, we’re highlighting new research from Georgia Tech that helps train robots to perform basic everyday tasks using egocentric recordings from wearers of Meta’s Project Aria research glasses.

Imagine having help completing everyday tasks in your home such as doing the laundry, washing dishes, and making repairs. We already use tools to help with these tasks, like washing machines, dishwashers, and electric drills. But what if you could have an even more powerful and flexible tool in the form of a humanoid robot that could learn from you and accelerate any number of physical projects on your to-do list?

Even with the hardware in hand, teaching a robot everyday tasks has so far depended on a slow and clunky data collection method called robot teleoperation. Using the Project Aria Research Kit, Professor Danfei Xu and the Robotic Learning and Reasoning Lab at Georgia Tech use the egocentric sensors on Aria glasses to create what they call “human data” for tasks that they want a humanoid robot to replicate. This human data dramatically reduces the amount of robot teleoperation data needed to train a robot’s policy, a breakthrough that could someday make humanoid robots capable of learning any number of tasks a human could demonstrate.

Kareer teleoperates the robot to capture co-training data for EgoMimic. Teleoperation can be difficult to scale and require significant human effort.

“Traditionally, collecting data for robotics means creating demonstration data,” says Simar Kareer, a PhD student in Georgia Tech’s School of Interactive Computing. “You operate the robot’s joints with a controller to move it and achieve the task you want, and you do this hundreds of times while recording sensor data, then train your models. This is slow and difficult. The only way to break that cycle is to detach the data collection from the robot itself.”

Today, robot policy models are trained, at high cost, on large amounts of targeted demonstration data specific to each narrow task. Kareer hypothesizes that passively collected data from many researchers, like the data captured by Aria glasses, could instead supply training data for a much broader set of tasks and lead to more generally useful robots in the future.

Inspired by Project Aria and Ego-Exo4D, which includes a massive egocentric dataset of over 3K hours of video recordings of daily-life activities, Kareer developed EgoMimic, a new algorithmic framework that uses both human data and robot data for humanoid robot development.

“When I looked at Ego4D, I saw a dataset that’s the same as all the large robot datasets we’re trying to collect, except it’s with humans,” Kareer explains. “You just wear a pair of glasses, and you go do things. It doesn’t need to come from the robot. It should come from something more scalable and passively generated, which is us.” In Kareer’s research, Aria glasses were used to create human data for co-training the EgoMimic framework.
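As a rough illustration of what co-training on two data sources can look like, the sketch below draws each training batch partly from human recordings and partly from robot teleoperation demonstrations. It is a minimal sketch, not EgoMimic's actual implementation: the function name `mixed_batch`, the `human_fraction` parameter, and the sample labels are all hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mixed_batch(
    human: Sequence[T],
    robot: Sequence[T],
    batch_size: int,
    human_fraction: float,
    rng: random.Random,
) -> List[T]:
    """Sample one batch that mixes human and robot demonstrations.

    human_fraction is a hypothetical knob: the share of the batch that
    comes from human recordings. The rest comes from robot data.
    """
    if not 0.0 <= human_fraction <= 1.0:
        raise ValueError("human_fraction must be between 0 and 1")
    n_human = round(batch_size * human_fraction)
    n_robot = batch_size - n_human
    # Sample with replacement so small datasets can still fill a batch.
    batch = [rng.choice(human) for _ in range(n_human)]
    batch += [rng.choice(robot) for _ in range(n_robot)]
    rng.shuffle(batch)  # avoid a fixed human-then-robot ordering
    return batch
```

For example, `mixed_batch(human, robot, batch_size=8, human_fraction=0.75, rng=random.Random(0))` returns eight samples, six drawn from `human` and two from `robot`, in shuffled order.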

Kareer creates co-training human data by recording with Aria glasses while folding a t-shirt.

Aria glasses aren’t just used for human data collection in Georgia Tech’s research. They’re also an integral component of the robot’s real-time operation setup. Mounted on the lab’s humanoid robot platform like a pair of eyes, the glasses serve as an integrated sensor package that lets the robot perceive its environment in real time. The Aria Client SDK streams Aria’s sensor data directly into the robot’s policy, which runs on an attached PC and in turn controls the robot’s actuation. Using Aria glasses for both data collection and the real-time perception pipeline minimizes the domain gap between the human demonstrator and the robot, paving the way for scaled human data generation for future robotics task training.
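The perceive, decide, act loop described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: it does not call the Aria Client SDK, and `Frame`, `fake_sensor_stream`, `toy_policy`, and `run_control_loop` are hypothetical stand-ins for the sensor stream, the learned policy, and the loop that sends commands to the robot.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class Frame:
    """One timestep of sensor data (hypothetical stand-in for streamed glasses data)."""
    timestamp_s: float
    pixels: List[float]  # placeholder for a real camera image


def fake_sensor_stream(num_frames: int, rate_hz: float = 10.0) -> Iterator[Frame]:
    """Hypothetical stream; a real setup would receive frames from the glasses."""
    for i in range(num_frames):
        yield Frame(timestamp_s=i / rate_hz, pixels=[float(i)] * 4)


def toy_policy(frame: Frame) -> List[float]:
    """Placeholder policy: maps an observation to a two-value action."""
    mean = sum(frame.pixels) / len(frame.pixels)
    return [mean, -mean]


def run_control_loop(
    stream: Iterable[Frame],
    policy: Callable[[Frame], List[float]],
    send_action: Callable[[List[float]], None],
) -> int:
    """Feed each incoming frame to the policy and forward its action to the robot."""
    steps = 0
    for frame in stream:
        action = policy(frame)  # perception -> decision
        send_action(action)     # decision -> actuation
        steps += 1
    return steps
```

Calling `run_control_loop(fake_sensor_stream(3), toy_policy, sent.append)` with an empty list `sent` processes three frames and records one action per frame. In a real system, `send_action` would be whatever interface drives the robot's actuators.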

Aria glasses mounted to the top of the robot provide the system with sensor data that allows the robot to perceive and interact with the space.

Thanks to EgoMimic, Kareer achieved a 400% increase in his robot’s performance across various tasks versus previous methods, using just 90 minutes of Aria recordings. The robot was also able to successfully perform these tasks in previously unseen environments.

In the future, humanoid robots could be trained at scale using egocentric data in order to perform a variety of tasks in the same way humans do.

“We look at Aria as an investment in the research community,” says James Fort, a Reality Labs Research Product Manager at Meta. “The more that the egocentric research community standardizes, the more researchers will be able to collaborate. It’s really through scaling with the community like this that we can start to solve bigger problems around how things are going to work in the future.”

Kareer will present his paper on EgoMimic at the 2025 IEEE International Conference on Robotics and Automation (ICRA) in Atlanta.

