Open Source
Advancing embodied AI through progress in touch perception, dexterity, and human-robot interaction
October 31, 2024
11 minute read

Takeaways

  • Meta FAIR is publicly releasing several new research artifacts that advance robotics and support our goal of reaching advanced machine intelligence (AMI).
  • The work we’re sharing today includes advancements in touch perception, dexterity, and human-robot interaction, all critical ingredients on the path towards achieving AMI.
  • We’re also announcing strategic partnerships with GelSight Inc and Wonik Robotics to develop and commercialize tactile sensing innovations that enable easy access for the research community and help foster an open ecosystem for AI.


Understanding and interacting with the physical world, a crucial capability for accomplishing everyday tasks, comes naturally to humans but remains a struggle for today’s AI systems. Our Fundamental AI Research (FAIR) team is working with the robotics community to create embodied AI agents that can perceive and interact with their surroundings, coexist safely with humans, and provide assistance in both physical and virtual realms. We believe this is a critical step on the path toward advanced machine intelligence (AMI).

Today, we’re publicly releasing several new research artifacts that advance touch perception, robot dexterity, and human-robot interaction. Touch is the first and most crucial modality for humans to physically interact with the world. To enable AI to perceive what’s inaccessible through vision, we’re releasing Meta Sparsh, the first general-purpose touch representation that works across many sensors and many tasks; Meta Digit 360, a breakthrough tactile fingertip with human-level multimodal sensing capabilities; and Meta Digit Plexus, a standardized hardware-software platform to integrate various fingertip and skin tactile sensors onto a single robot hand. We believe these advancements have the potential to positively impact fields such as healthcare and manufacturing by enabling machines to perform complex dexterous tasks.

We’re also partnering with industry leaders GelSight Inc and Wonik Robotics to develop and commercialize these tactile sensing innovations. GelSight Inc will manufacture and distribute Digit 360, which will be available for purchase next year, and members of the research community can apply through the Digit 360 call for proposals to gain early access. Through our partnership with Wonik Robotics, we aim to create a new advanced dexterous robot hand with fully integrated tactile sensing, built on Meta Digit Plexus. Wonik Robotics will manufacture and distribute the next generation of the Allegro Hand, which is slated to launch next year. Researchers can fill out an interest form to stay updated on this release.

For robots to be truly useful, they must go beyond physical tasks and reason about social interactions. That's why we're introducing the PARTNR benchmark, a standardized framework for evaluating planning and reasoning in human-robot collaboration. PARTNR enables reproducible, large-scale assessments of embodied models, such as LLM-based planners, across diverse collaborative scenarios, incorporating physical-world constraints like time and space. With PARTNR, we aim to drive advancements in human-robot interaction and collaborative intelligence, transforming AI models from “agents” to “partners.”

Meta Sparsh: A new approach to exploring physical intelligence

We’re publicly releasing Sparsh, the first general-purpose encoder for vision-based tactile sensing. The name Sparsh, derived from the Sanskrit word for touch or contact sensory experience, aptly describes how digitized tactile signals can be processed by AI models to enable touch perception.

Vision-based tactile sensors come in various forms, differing in aspects like shape, lighting, and gel markings. Existing approaches rely on task- and sensor-specific handcrafted models, which are hard to scale because labeled real-world data, such as force and slip measurements, is prohibitively expensive to collect. In contrast, Sparsh works across many types of vision-based tactile sensors and many tasks by leveraging advances in self-supervised learning (SSL), avoiding the need for labels. It’s a family of models pre-trained on a large dataset of over 460,000 tactile images.
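To make the self-supervised idea concrete, here is a toy sketch of a masked-reconstruction objective of the kind used for pre-training on unlabeled tactile images. It is illustrative only, not the released Sparsh training code; the patch size, model dimensions, and masking ratio are placeholder choices.

```python
# Toy sketch: learn tactile representations by reconstructing masked patches
# of tactile images, so no force or slip labels are needed. Illustrative only.
import torch
import torch.nn as nn

patch, dim = 16, 128
embed = nn.Linear(3 * patch * patch, dim)        # patch embedding: 768 -> 128
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
decoder = nn.Linear(dim, 3 * patch * patch)      # predict raw pixels of patches

def patchify(imgs: torch.Tensor) -> torch.Tensor:
    """Split (B, 3, H, W) images into flattened non-overlapping patches."""
    b = imgs.shape[0]
    p = imgs.unfold(2, patch, patch).unfold(3, patch, patch)
    return p.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, 3 * patch * patch)

imgs = torch.randn(4, 3, 224, 224)               # batch of unlabeled tactile images
tokens = patchify(imgs)                          # (4, 196, 768)
mask = torch.rand(tokens.shape[:2]) < 0.75       # hide 75% of the patches
visible = tokens.clone()
visible[mask] = 0.0                              # crude masking, for illustration only
pred = decoder(encoder(embed(visible)))          # reconstruct all patch pixels
loss = ((pred[mask] - tokens[mask]) ** 2).mean() # penalize only the masked patches
loss.backward()
```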



For standardized evaluations across touch models, we introduce a novel benchmark consisting of six touch-centric tasks ranging from comprehending tactile properties to enabling physical perception and dexterous planning. We find that Sparsh outperforms task- and sensor-specific models by an average of over 95% on this benchmark. By enabling pre-trained backbones for tactile sensing, we aim to empower the community to build on and scale these models towards innovative applications in robotics, AI, and beyond.
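As a rough illustration of how such an evaluation is typically set up, the sketch below freezes a pre-trained touch backbone and trains a small probe for a single downstream task. The `load_sparsh_encoder` helper, checkpoint name, and task dimensions are hypothetical stand-ins rather than the released API.

```python
# Hedged sketch: evaluate a frozen, pre-trained touch backbone with a small
# task-specific probe (e.g., slip vs. no-slip classification).
import torch
import torch.nn as nn

def load_sparsh_encoder(checkpoint: str) -> nn.Module:
    # Placeholder standing in for loading a released Sparsh backbone;
    # swap in the real pre-trained encoder from the released code.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 768))

encoder = load_sparsh_encoder("sparsh_backbone.pt")  # hypothetical checkpoint
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False                          # keep the backbone frozen

probe = nn.Linear(768, 2)                            # e.g., slip vs. no-slip
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)

# Dummy batch standing in for labeled tactile images from one benchmark task.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
with torch.no_grad():
    features = encoder(images)                       # frozen touch embeddings
loss = nn.functional.cross_entropy(probe(features), labels)
loss.backward()
optimizer.step()
```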

Read the paper

Download the code

Download the models and dataset

Meta Digit 360: An artificial fingertip with human-level tactile sensing

We’re excited to introduce Digit 360, an artificial finger-shaped tactile sensor that delivers rich and detailed tactile data by digitizing touch with human-level precision. Equipped with over 18 sensing features, Digit 360 will enable advancements in touch perception research and allow researchers to either combine its various sensing technologies or isolate individual signals for in-depth analysis of each modality. Over time, we hope researchers will use this device to develop AI that can better understand and model the physical world, including the physicality of objects, human-object interaction, and contact physics. Digit 360 significantly surpasses previous sensors, detecting minute changes in spatial detail and capturing forces as small as 1 millinewton.


Our advanced finger-shaped multimodal tactile sensor Digit 360 (right) side-by-side with our previous generation tactile sensor Digit (left).

To achieve this, we developed an optical system designed specifically for touch perception, with a wide field of view and over 8 million taxels for capturing omnidirectional deformations across the fingertip surface. Because each touch interaction with the environment has a unique profile produced by the mechanical, geometrical, and chemical properties of a surface, we also equipped the sensor with multiple sensing modalities that perceive vibrations, sense heat, and even detect odors. By leveraging such multimodal signals, Digit 360 will help scientists advance research into AI that can learn about the world in richer detail. With an on-device AI accelerator, Digit 360 can process information locally and react quickly to stimuli such as the flex of a tennis ball or the poke of a needle, acting as a peripheral nervous system on a robot, inspired by the reflex arc in humans and animals.
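The snippet below is a purely illustrative sketch of what such a reflex-style local loop might look like, reacting to a force spike before any host-side processing. The `Digit360` class, its fields, and the threshold are hypothetical placeholders, not the device’s actual interface.

```python
# Illustrative reflex-style loop: react locally to a sharp contact event,
# analogous to a reflex arc. All classes and values are hypothetical.
from dataclasses import dataclass
import random
import time

@dataclass
class TactileFrame:
    normal_force_mN: float   # estimated normal force in millinewtons
    vibration_rms: float     # high-frequency vibration energy
    temperature_C: float     # surface temperature near the contact

class Digit360:
    """Hypothetical stand-in that yields simulated multimodal readings."""
    def read(self) -> TactileFrame:
        return TactileFrame(
            normal_force_mN=random.uniform(0.0, 50.0),
            vibration_rms=random.uniform(0.0, 1.0),
            temperature_C=random.uniform(20.0, 40.0),
        )

def reflex_loop(sensor: Digit360, force_threshold_mN: float = 30.0, steps: int = 100):
    for _ in range(steps):
        frame = sensor.read()
        # A sudden force spike (e.g., the poke of a needle) triggers an
        # immediate local response rather than waiting on the host computer.
        if frame.normal_force_mN > force_threshold_mN:
            print(f"Reflex: retract finger (force={frame.normal_force_mN:.1f} mN)")
        time.sleep(0.001)  # the real device would run this far faster, on-device

reflex_loop(Digit360())
```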

Beyond advancing robot dexterity, this breakthrough sensor has significant potential applications ranging from medicine and prosthetics to virtual reality and telepresence. Its tactile-specific optical lens sees imprints all around the artificial fingertip, capturing finer detail about the surface in contact with an object. For virtual worlds, Digit 360 can help ground virtual interactions in more realistic representations of object properties that go beyond visual appearance. We’re publicly releasing all code and designs and look forward to seeing the community iterate on this work.

Read the paper

Visit the website

Download the code and design

Meta Digit Plexus: A standardized platform for tactile sensing hands

The human hand is remarkably good at relaying touch information to the brain from across the skin, from fingertips to palm. That information guides how the hand’s muscles are actuated when making decisions, for instance how to type on a keyboard or how to handle an object that’s too hot. Achieving embodied AI requires similar coordination between tactile sensing and motor actuation on a robot hand.



We present Meta Digit Plexus, a standardized platform that provides a hardware-software solution for integrating tactile sensors on a single robot hand. The platform connects vision-based and skin-based tactile sensors such as Digit, Digit 360, and ReSkin across the fingertips, fingers, and palm to control boards that encode all the data and send it to a host computer. The platform’s software and hardware components allow for seamless data collection, control, and analysis over a single cable.
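As a rough sketch of what such a standardized platform enables on the host side, the example below polls several tactile sensors and logs time-aligned readings into a single record. The sensor classes and field names are hypothetical placeholders, not the Digit Plexus API.

```python
# Minimal sketch of synchronized, host-side data collection from fingertip
# and palm tactile sensors on one hand. All interfaces are hypothetical.
import time
from typing import Dict, List

class TactileSensor:
    """Hypothetical stand-in for a fingertip or palm tactile sensor."""
    def __init__(self, name: str):
        self.name = name
    def read(self) -> float:
        return 0.0  # e.g., a scalar pressure value; real sensors return richer data

def collect_episode(sensors: Dict[str, TactileSensor],
                    duration_s: float = 1.0,
                    rate_hz: float = 100.0) -> List[dict]:
    """Poll all sensors at a fixed rate and log time-stamped readings."""
    records, period = [], 1.0 / rate_hz
    t_end = time.time() + duration_s
    while time.time() < t_end:
        records.append({
            "t": time.time(),
            **{name: s.read() for name, s in sensors.items()},
        })
        time.sleep(period)
    return records

hand = {
    "index_tip": TactileSensor("digit360_index"),
    "thumb_tip": TactileSensor("digit360_thumb"),
    "palm": TactileSensor("reskin_palm"),
}
episode = collect_episode(hand)
print(f"Collected {len(episode)} synchronized samples")
```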

Building such a standardized platform from scratch enables us to push the state of the art in AI and robot dexterity research. Today, we’re sharing the code and design for Meta Digit Plexus to help lower the barriers to entry for the community to pursue touch perception and dexterity research.

Download the code and design

GelSight Inc and Wonik Robotics: Partners in pioneering the future of robotics

We believe collaboration across the industry is the best way to advance robotics for the greater good. We’re partnering with leaders in the industry, GelSight Inc and Wonik Robotics, to develop and provide access to robots equipped with the advancements we’re sharing today.

GelSight Inc will manufacture and distribute Digit 360, targeting wide availability next year. This will help foster a community-driven approach to robotics research. Members of the research community can apply through the Digit 360 call for proposals to gain early access.

“Partnering with Meta on Digit 360 came from an immediate agreement on vision,” says Youssef Benmokhtar, Chief Executive Officer of GelSight Inc. “We want to encourage researchers and developers to embrace this technology in their research and make tactile sensing ubiquitous.”

We’re also collaborating with Wonik Robotics, a South Korean robotics company, to develop the Allegro Hand, a fully integrated robotic hand with tactile sensors. Building on the Meta Digit Plexus platform, the next generation of Allegro Hand is poised to help advance robotics research by making it easier for researchers to conduct experiments. Wonik Robotics will manufacture and distribute the Allegro Hand, which will be made available next year. Community members who want to stay up to date on the release are encouraged to fill out an interest form.

“Wonik Robotics and Meta FAIR aim to introduce robotic hands to global companies, research institutes, and universities so they can continue developing robotic hand technology that is safe and helpful to humankind,” says Dr. Yonmook Park, Executive Director and the Head of Future Technology Headquarters at Wonik Robotics.

PARTNR: A new benchmark for human-robot collaboration

As we move closer to a future with intelligent robots and advanced AI models capable of performing everyday household chores, it’s important to consider their interaction with humans. That’s why we’re releasing a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR), designed to study human-robot collaboration in household activities. Training and testing social embodied agents on physical hardware with actual human partners is hard to scale and may pose safety concerns. We address this by developing PARTNR on top of Habitat 3.0, a high-speed, realistic simulator that supports both robots and humanoid avatars and allows for human-robot collaboration in home-like environments, with the future goal of testing in physical-world scenarios.

PARTNR stands as the largest benchmark of its kind, comprising 100,000 natural language tasks, spanning 60 houses and over 5,800 unique objects. The benchmark is designed to evaluate the performance of large language and vision models (LLMs/VLMs) in collaborating with humans through a human-in-the-loop tool. It comes with several state-of-the-art LLM baselines and enables systematic analysis across the axes of planning, perception, and skill execution. Our results show that state-of-the-art LLM-based planners struggle with coordination, task tracking, and failure recovery.
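To illustrate the overall evaluation loop, here is a schematic sketch in which an LLM-based planner reads the task instruction and the current observation, proposes the robot’s next action, and the episode is scored on completion. The environment and planner interfaces are hypothetical simplifications, not the PARTNR or Habitat 3.0 APIs.

```python
# Schematic sketch of evaluating an LLM-based planner on simulated
# collaborative household tasks. All interfaces are hypothetical.
from typing import List

class CollaborationEnv:
    """Hypothetical stand-in for a simulated human-robot household episode."""
    def __init__(self, instruction: str):
        self.instruction = instruction
        self.steps = 0
    def observe(self) -> str:
        return f"step {self.steps}: kitchen, human partner holding a plate"
    def step(self, robot_action: str) -> bool:
        self.steps += 1
        return self.steps >= 3  # pretend the task finishes after a few steps

def llm_plan(instruction: str, observation: str) -> str:
    # Placeholder for a call to an LLM-based planner; a real baseline would
    # prompt a language model with the task, history, and observation.
    return "navigate_to(dishwasher)"

def run_episode(instruction: str, max_steps: int = 20) -> bool:
    env = CollaborationEnv(instruction)
    for _ in range(max_steps):
        action = llm_plan(instruction, env.observe())
        if env.step(action):
            return True  # task completed within the step budget
    return False

tasks: List[str] = ["Help the human clear the table and load the dishwasher."]
success_rate = sum(run_episode(t) for t in tasks) / len(tasks)
print(f"success rate: {success_rate:.2f}")
```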

The journey of transforming AI models from agents to adept partners is ongoing. By providing a standardized benchmark and dataset, PARTNR aims to propel responsible research and innovation in the field of human-robot collaboration. We hope it enables research into robots that can not only operate in isolation, but also around people, making them more efficient, reliable, and adaptable to each person’s preferences.

Read the paper

Visit the website

Download the code

Looking to the future

Expanding capabilities in touch perception and robotics will be a game changer for the open source community, opening up new possibilities in medical research, supply chains, manufacturing, energy, and more. We remain committed to publicly releasing models, datasets, and software, and we believe that sharing hardware platforms will foster new generations of robotics AI research. Through our partnerships with GelSight Inc and Wonik Robotics, we’re excited to get this hardware into researchers’ hands so they can iterate on the technology and explore exciting new use cases. Iterating together with the community will bring us all closer to a future where AI and robotics serve the greater good.



