November 10, 2023
First-person video highlights a camera-wearer's activities in the context of their persistent environment. However, current video understanding approaches reason over visual features from short video clips that are detached from the underlying physical space and capture only what is immediately visible. To facilitate human-centric environment understanding, we present an approach that links egocentric video and the environment by learning representations that are predictive of the camera-wearer's (potentially unseen) local surroundings. We train such models using videos from agents in simulated 3D environments where the environment is fully observable, and test them on human-captured real-world videos from unseen environments. On two human-centric video tasks, we show that models equipped with our environment-aware features consistently outperform their counterparts with traditional clip features. Moreover, despite being trained exclusively on simulated videos, our approach successfully handles real-world videos from HouseTours and Ego4D, and achieves state-of-the-art results on the Ego4D NLQ challenge. Project page: https://vision.cs.utexas.edu/projects/ego-env/
Publisher
NeurIPS
Research Topics
December 12, 2024
Melissa Hall, Oscar MaƱas, Reyhane Askari, Mark Ibrahim, Candace Ross, Pietro Astolfi, Tariq Berrada Ifriqi, Marton Havasi, Yohann Benchetrit, Karen Ullrich, Carolina Braga, Abhishek Charnalia, Maeve Ryan, Mike Rabbat, Michal Drozdzal, Jakob Verbeek, Adriana Romero Soriano
December 12, 2024
December 11, 2024
Pierre Fernandez, Hady Elsahar, Zeki Yalniz, Alexandre Mourachko
December 11, 2024
December 11, 2024
Hu Xu, Bernie Huang, Ellen Tan, Ching-Feng Yeh, Jacob Kahn, Christine Jou, Gargi Ghosh, Omer Levy, Luke Zettlemoyer, Scott Yih, Philippe Brunet, Kim Hazelwood, Ramya Raghavendra, Daniel Li (FAIR), Saining Xie, Christoph Feichtenhofer
December 11, 2024
December 11, 2024
Narine Kokhlikyan, Bargav Jayaraman, Florian Bordes, Chuan Guo, Kamalika Chaudhuri
December 11, 2024
Foundational models
Latest news
Foundational models