June 18, 2018
We present a new AI task – Embodied Question Answering (EmbodiedQA) – where an agent is spawned at a random location in a 3D environment and asked a question (‘What color is the car?’). In order to answer, the agent must first intelligently navigate to explore the environment, gather necessary visual information through first-person (egocentric) vision, and then answer the question (‘orange’). EmbodiedQA requires a range of AI skills – language understanding, visual recognition, active perception, goal driven navigation, commonsense reasoning, long-term memory, and grounding language into actions. In this work, we develop a dataset of questions and answers in House3D environments, evaluation metrics, and a hierarchical model trained with imitation and reinforcement learning.
April 17, 2025
Ansong Ni, Ruta Desai, Yang Li, Xinjie Lei, Dong Wang, Ramya Raghavendra, Gargi Ghosh, Daniel Li (FAIR), Asli Celikyilmaz
April 17, 2025
April 16, 2025
Paul McVay, Sergio Arnaud, Ada Martin, Arjun Majumdar, Krishna Murthy Jatavallabhula, Phillip Thomas, Ruslan Partsey, Daniel Dugas, Abha Gejji, Alexander Sax, Vincent-Pierre Berges, Mikael Henaff, Ayush Jain, Ang Cao, Ishita Prasad, Mrinal Kalakrishnan, Mike Rabbat, Nicolas Ballas, Mido Assran, Oleksandr Maksymets, Aravind Rajeswaran, Franziska Meier
April 16, 2025
April 14, 2025
Yeongmin Kim, Sotiris Anagnostidis, Yuming Du, Edgar Schoenfeld, Jonas Kohler, Markos Georgopoulos, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu
April 14, 2025
March 24, 2025
Wassim (Wes) Bouaziz, Nicolas Usunier, El Mahdi El Mhamdi
March 24, 2025
April 08, 2021
Caner Hazirbas, Joanna Bitton, Brian Dolhansky, Jacqueline Pan, Albert Gordo, Cristian Canton Ferrer
April 08, 2021
April 30, 2018
Tomer Galanti, Lior Wolf, Sagie Benaim
April 30, 2018
April 30, 2018
Yedid Hoshen, Lior Wolf
April 30, 2018
December 11, 2019
Eliya Nachmani, Lior Wolf
December 11, 2019
Foundational models
Our approach
Latest news
Foundational models