November 1, 2021
Touch is important to the way people experience the world. We typically think of touch as a way to convey warmth and care, but it’s also a key sensing modality for perceiving the world around us. Touch provides us with information that’s not discernible through any other sense, such as an object’s temperature, texture, and weight, and sometimes even its state. We want AI to be able to learn from and interact with the world through touch, as humans do. But for that to happen, we need to foster an ecosystem for tactile sensing in robotics. Today, we’re outlining our progress in developing hardware, simulators, libraries, benchmarks, and data sets, which form the pillars of the touch-sensing ecosystem necessary for building AI systems that can understand and interact through touch.
Together, these pillars will allow us to train robots and AI to draw critical information from touch, and to integrate it with information from other sensing modalities to accomplish tasks of greater complexity and superior functionality.
Tactile sensing is an emerging field in robotics that aims to understand and replicate human-level touch in the physical world with the goal of making robots more efficient in interacting with the world around us. Advancements in tactile sensing will lead to AI that can learn from and use touch on its own as well as in conjunction with other sensing modalities such as vision and audio, much like people do. Additionally, advancing the sense of touch for robots will enable them to be more capable, as well as gentler and safer.
To enable AI to use and learn from tactile data, we first need sensors that can collect that data and enable its processing. Ideally, touch-sensing hardware should model many of the attributes of the human finger. For one, the right sensors for robot fingertips should be relatively compact. This requires advanced miniaturization techniques, which make such sensors quite costly to produce and often put them beyond the reach of a majority of the academic research community. The sensors should also be able to withstand wear and tear from repeated contact with surfaces. Finally, touch sensors need to be of high resolution, with the ability to measure rich information about the object being touched, such as surface features, contact forces, and other object properties discernible through contact.
To deliver an easy-to-build, reliable, low-cost, compact, high-resolution tactile sensor designed for robotic in-hand manipulation, we released the full open source design of DIGIT in 2020. Compared with currently available commercial tactile sensors, DIGIT is significantly cheaper to manufacture and provides hundreds of thousands more contact points, which makes it more useful and accessible to research teams around the world.
In partnership with Meta AI, GelSight, an MIT spin-off with unique digital tactile-sensing technology and products, will now commercially manufacture DIGIT to make it easier for researchers to carry out touch-sensing research. Imagine if computer vision (CV) researchers had to either manufacture their own cameras or buy expensive professional cameras. Progress in the field would depend on researchers having the funds to procure expensive cameras or the ability to manufacture their own. It could also mean a lack of standardization, which would make research across institutions and teams difficult to replicate. With commercially available cameras, CV researchers do not have to worry about procuring hardware for their research and can more easily build on the work of others. In the same manner, commercially available DIGITs could open up the field of touch sensing to even more researchers and lead to more rapid advancement.
In addition to DIGIT, Meta AI researchers, in collaboration with Carnegie Mellon University, have also developed ReSkin, an open source touch-sensing “skin” that has a small form factor and can help robots and other machines learn high-frequency tactile sensing over larger surfaces.
We developed and open-sourced TACTO, a simulator for high-resolution vision-based tactile sensors, to provide a faster experimentation platform and to support ML research even in the absence of hardware. Simulators play an important role in prototyping, debugging, and benchmarking new advances in robotics because they allow us to test and validate hypotheses without the need to perform experiments that would be time-consuming in the real world. Beyond faster experiments, simulation matters even more for touch sensing because the right hardware can be difficult to obtain and because repeated contact wears down sensor surfaces.
TACTO can render realistic high-resolution touch readings at hundreds of frames per second, and can be easily configured to simulate different vision-based tactile sensors, including DIGIT and OmniTact. TACTO enables researchers to simulate vision-based tactile sensors with different form factors that can be mounted on different robots. “TACTO and DIGIT have democratized access to vision-based tactile sensing by providing low-cost reference implementations which enabled me to quickly prototype multimodal robot manipulation policies,” says Oier Mees, a doctoral research assistant in autonomous intelligent systems at the Institute of Computer Science at the University of Freiburg, in Germany.
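To give a rough intuition for what simulating a vision-based tactile sensor involves, here is a minimal sketch in plain Python. It is not TACTO's API or rendering pipeline. It assumes a toy setup in which a sphere is pressed into a flat gel, computes how deeply each pixel of the gel is indented, and shades the deformed surface with a single directional light. The function names, the light direction, and all dimensions are hypothetical and chosen only for illustration.

```python
import math


def indentation_depth_map(width, height, center, radius, press_depth):
    """Per-pixel gel indentation for a sphere pressed into a flat gel (toy model)."""
    cx, cy = center
    depth = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            if d2 < radius ** 2:
                # Height of the sphere surface above the sphere's lowest point.
                surface = radius - math.sqrt(radius ** 2 - d2)
                depth[y][x] = max(0.0, press_depth - surface)
    return depth


def shade(depth, light=(0.5, 0.0, 1.0)):
    """Lambertian shading of the deformed gel under one directional light."""
    h, w = len(depth), len(depth[0])
    light_len = math.sqrt(sum(c * c for c in light))
    lx, ly, lz = (c / light_len for c in light)
    image = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Finite differences, using one-sided differences at the border.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / (x1 - x0)
            dzdy = (depth[y1][x] - depth[y0][x]) / (y1 - y0)
            # Surface normal of the height field z = depth(x, y).
            nx, ny, nz = -dzdx, -dzdy, 1.0
            n_len = math.sqrt(nx * nx + ny * ny + nz * nz)
            image[y][x] = max(0.0, (nx * lx + ny * ly + nz * lz) / n_len)
    return image


# Toy usage: a sphere of radius 10 pixels pressed 3 pixels into a 32x32 gel.
depth = indentation_depth_map(32, 32, center=(16, 16), radius=10, press_depth=3)
image = shade(depth)
```

Untouched regions of the gel come out uniformly lit, while the rim of the contact area is brighter on one side and darker on the other, which is the kind of shading cue a camera-based sensor picks up. A real simulator would also need to model the sensor's optics, lighting, and gel far more faithfully.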
Touch sensors like DIGIT deliver high-dimensional and rich touch-sensing data that is difficult to process with traditional analytical approaches. Machine learning (ML) can significantly simplify the design and implementation of models that translate raw sensor readings into high-level properties (e.g., whether an object is slipping or what material it is made of). But training a model to process touch data is extremely challenging without a background in ML. To increase the reuse of code and reduce deployment time, we created a library of ML models and functionality for touch sensing called PyTouch.
Researchers can train and deploy models across different sensors using PyTouch. It currently provides basic functionalities such as detecting touch and slip and estimating object pose. Ultimately, PyTouch will be integrated with both real-world sensors and our tactile-sensing simulator to enable fast validation of models as well as Sim2Real capabilities, that is, the ability to transfer concepts learned in simulation to real-world applications.
PyTouch will also enable the wider robotics community to use advanced ML models dedicated to touch sensing “as a service,” where researchers can simply connect their DIGIT, download a pretrained model, and use this as a building block in their robotic application. For example, to build a controller that grasps objects, researchers could detect whether the controller’s fingers are in contact by downloading a module from PyTouch.
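As a rough illustration of the grasping example, the sketch below shows how a contact check could plug into a simple closing loop. It does not use PyTouch's API. In place of a pretrained model, it assumes a much simpler stand-in: comparing each frame against a no-contact reference frame and applying a threshold. The function names, the threshold value, and the fake sensor are all hypothetical.

```python
def mean_abs_diff(frame, reference):
    """Average absolute per-pixel difference between two equally sized frames."""
    total, count = 0.0, 0
    for frame_row, ref_row in zip(frame, reference):
        for value, ref_value in zip(frame_row, ref_row):
            total += abs(value - ref_value)
            count += 1
    return total / count


def is_touching(frame, reference, threshold=0.05):
    """Threshold-based stand-in for a learned touch-detection model."""
    return mean_abs_diff(frame, reference) > threshold


def close_until_contact(read_frame, reference, max_steps=50, threshold=0.05):
    """Poll the sensor once per closing step; return the first step with contact."""
    for step in range(max_steps):
        if is_touching(read_frame(step), reference, threshold):
            return step
    return None  # No contact detected within max_steps.


# Hypothetical stand-in for a real sensor: a 4x4 frame that stays blank
# until the finger reaches the object at step 3.
REFERENCE = [[0.0] * 4 for _ in range(4)]


def fake_read_frame(step):
    level = 0.2 if step >= 3 else 0.0
    return [[level] * 4 for _ in range(4)]


contact_step = close_until_contact(fake_read_frame, REFERENCE)
print(contact_step)  # prints 3
```

The controller only depends on a function that answers "is the finger in contact?", so the threshold rule here could in principle be swapped for a downloaded pretrained model without changing the loop.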
PyTouch enables a learning-based approach to building applications that will encourage the increased use of touch-processing features, since researchers no longer have to build capabilities via hard coding or re-creating tasks for each experiment. Libraries of pretrained models, such as OpenCV and Detectron2, have enabled researchers and developers in modalities such as computer vision to leverage state-of-the-art AI in their work without having to create and train models from scratch for each application. With PyTouch, our goal is to similarly empower a broader research community to leverage touch in their applications.
The availability of tactile sensors and simulators has now paved the way for meaningful metrics and benchmarks at multiple levels. At the hardware level, there are various benchmarks and data sets that can now be used to evaluate design choices in sensors. At the perceptual level, benchmarks can be used to compare different ML models for efficacy in different touch-sensing use cases. And at the robot control level, it is now possible to benchmark the benefits of touch in active control tasks such as in-hand manipulation, both in simulation and in the real world. Despite our progress in enabling systematic measurements, we still need to carefully investigate these different levels and the interplay between them as we work toward defining and releasing metrics and benchmarks that can guide the wider community toward more measurable progress.
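At the perceptual level, a benchmark can be as simple as running several models over the same labeled data and reporting shared metrics. The sketch below assumes a toy slip-detection task in which each sample is a single scalar signal with a boolean label. The two threshold "models" and the six samples are made up purely for illustration and are not drawn from any released benchmark or data set.

```python
def evaluate(predict, samples):
    """Compute accuracy, precision, and recall for a binary detector."""
    tp = fp = tn = fn = 0
    for signal, label in samples:
        predicted = predict(signal)
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labeled samples: (signal strength, slip actually occurred).
SAMPLES = [
    (0.1, False), (0.2, False), (0.4, True),
    (0.5, False), (0.7, True), (0.9, True),
]

# Two hypothetical detectors that differ only in their decision threshold.
MODELS = {
    "loose": lambda signal: signal > 0.3,
    "strict": lambda signal: signal > 0.6,
}

results = {name: evaluate(model, SAMPLES) for name, model in MODELS.items()}
for name, metrics in results.items():
    print(name, metrics)
```

On this toy data both detectors reach the same accuracy, but the loose one catches every slip at the cost of a false alarm while the strict one never raises a false alarm but misses a slip. This is one reason a benchmark usually needs more than a single number.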
We believe that advancing touch sensing will help AI and robotics researchers build stronger AI and more capable robots, so we continue to foster this ecosystem along the four pillars outlined above. Previously, we open-sourced DIGIT and TACTO and released PyTouch to enable and strengthen the scientific communities interested in sensor design, robotics, ML, neuroscience, and other aspects of touch sensing.
We are also currently studying Sim2Real transfer for training PyTouch models in simulation and deploying them on real sensors as a way to quickly collect data sets and train models. Collecting a large-scale data set can take minutes in simulation, whereas collecting data with a real sensor requires time and a person to physically probe objects. Finally, we plan to explore Real2Sim methods to better tune the simulator from real-world data.
Despite the progress we’ve made so far, we are only just beginning to explore touch as a sensing modality and to provide robots with touch capabilities that will allow them to better interact with the environment around them. We need more hardware improvements (including other sensor modalities that can detect the temperature of touched objects, for instance), a better understanding of which touch features are used for specific tasks, a deeper understanding of the right ML computational structures for processing touch, standardized hardware and software, widely accepted benchmarks, and convincing demonstrations of previously unrealizable tasks made possible by touch sensing.
Improvements in touch sensing can help us advance AI and will enable researchers to build robots with enhanced functionalities and capabilities. They can also unlock possibilities in AR/VR, as well as lead to innovations in industrial, medical, and agricultural robotics. We’re working toward a future where every single robot may come equipped with touch-sensing capabilities.