We are innovating in the open,
for a smarter, more connected world

Segment Anything 3
With SAM 3, you can use text and visual prompts to precisely detect, segment and track any object in any image or video.

DINOv3
DINOv3 scales self-supervised learning (SSL) for images to produce our strongest universal vision backbones, enabling breakthrough performance across diverse domains.
V-JEPA 2
The first world model trained on video that achieves state-of-the-art visual understanding and prediction.
Video Joint Embedding Predictive Architecture 2 (V-JEPA 2) is a self-supervised foundation world model
Seamless Interaction
Advancing AI research on modeling face-to-face dynamics, including expressive gestures, active listening, turn-taking and visual synchrony.
Audiovisual motion models compatible with 2D and 3D renderings, trained on the Seamless Interaction Dataset
Segment Anything 2
SAM 2 is a segmentation model that enables fast, precise selection of any object in any video or image.
Advanced capabilities in object detection, segmentation and tracking
More from Meta's FAIR Team
Meta Motivo
Movie Gen
Audiobox
Seamless Communication
AI Chemistry
Try experimental demos
How Meta is applying cutting-edge AI research to real-world interactions


Create video cutouts and effects with a few clicks
For researchers and developers
Meta FAIR is advancing research and delivering breakthroughs in a variety of areas.
01.
Communication & Language
We advance AI capabilities in expressive communication, social interaction and use of language. Through foundational research in natural language processing and multimodal AI, we develop systems that enable more natural, meaningful interactions between humans and machines.
02.
Embodiment & Actions
We advance the fundamental capabilities needed for AI to understand and act within the physical and digital world. Through our research, we hope to unlock a wide variety of future agents that help humans do more throughout all aspects of their lives. These range from robots that can move around and interact with objects to help accomplish household tasks, to wearable glasses that understand the real and digital world and support people throughout their daily tasks.
03.
Alignment
Our research focuses on aligning models and decisions with human intent and societal interests through a deeper fundamental understanding of AI models and enhanced steerability and efficiency. This pillar is at the forefront of research on AI for science and AI for society.
04.
Core Learning & Reasoning
We conduct fundamental research in pre-training methods and new architectural paradigms that enable foundation models to learn and reason with agility and efficiency across novel downstream challenges. Our work expands the frontier of approaches such as world models, non-autoregressive architectures, and memory-augmented models to unlock new capabilities in adaptive intelligence.
05.
Coding
We develop code world models as foundation models for code and agents, and advance methods for reinforcement learning with execution feedback. We research how to build much more efficient architectures for code world models, latent space reasoning, and grounded reasoning and planning with world models. We develop various agents, e.g. AI research agents to help our own research, and feed our agents’ needs upstream into our foundation models.
06.
Perception
The north star goal of our Perception research teams is to enable general AI systems to perceive the visual world to inform action, communication, and generation. To achieve this goal, we're developing next-generation perception models capable of understanding images and videos not as pixels, but as a capture of visual entities such as people, objects, and activities, along with their spatial and temporal relationships.