October 31, 2024
We present a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR) designed to study human-robot coordination in household activities. PARTNR tasks exhibit characteristics of everyday tasks, such as spatial, temporal, and heterogeneous agent capability constraints. We employ a semi-automated task generation pipeline using Large Language Models (LLMs), incorporating simulation in the loop for grounding and verification. PARTNR stands as the largest benchmark of its kind, comprising 100,000 natural language tasks, spanning 60 houses and 5,819 unique objects. We analyze state-of-the-art (SoTA) LLMs on PARTNR tasks, across the axes of planning, perception, and skill execution. The analysis reveals significant limitations in SoTA models, such as poor coordination and failures in task tracking and recovery from errors. When LLMs are paired with real humans, they require 1.5x as many steps as two humans collaborating and 1.1x as many steps as a single human, underscoring the potential for improvement in these models. We further show that fine-tuning smaller LLMs with planning data can achieve performance on par with models 9 times larger, while being 8.6x faster at inference. Overall, PARTNR highlights significant challenges facing collaborative embodied agents and aims to drive research in this direction.
Written by
Matthew Chang
Gunjan Chhablani
Alexander William Clegg
Mikael Dallaire Cote
Michal Hlavac
Vladimir Karashchuk
Jacob Krantz
Roozbeh Mottaghi
Priyam Parashar
Siddharth Patki
Ishita Prasad
Xavi Puig
Ram Ramrakhya
Daniel Tran
Joanne Truong
John Turner
Eric Undersander
Jimmy Yang
Publisher
ArXiv
Research Topics
Robotics