AI research by Meta
A significant step towards removing language barriers through expressive, fast and high-quality AI translation
A family of AI research models that enable more natural and authentic communication across languages
The Seamless Communication models
SeamlessExpressive
A model that aims to preserve expression and intricacies of speech across languages.
SeamlessStreaming
A model that can deliver speech and text translations with around two seconds of latency.
SeamlessM4T v2
A foundational multilingual and multitask model that allows people to communicate effortlessly through speech and text.
Seamless
A model that merges capabilities from SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 into one.
Preserving prosody
SeamlessExpressive
Translations should capture the nuances of human expression. While existing translation tools are skilled at capturing the content of a conversation, they typically rely on monotone, robotic text-to-speech systems for their output. SeamlessExpressive aims to preserve the intricacies of speech, such as pauses and speech rate, in addition to vocal style and emotional tone.
Audio example, English input (whisper): "Please keep the volume down. We just put the baby to sleep." Spanish output: non-expressive and expressive versions.
Audio example, English input (sad): "Please, don't leave. I hate being here alone." French output: non-expressive and expressive versions.
Near real-time translation
SeamlessStreaming
SeamlessStreaming is the first massively multilingual model that delivers translations with around two seconds of latency and nearly the same accuracy as an offline model. Built upon SeamlessM4T v2, SeamlessStreaming supports automatic speech recognition and speech-to-text translation for nearly 100 input and output languages, in addition to speech-to-speech translation for nearly 100 input languages and 36 output languages.
Foundational model for universal translation
SeamlessM4T v2
In August 2023, we introduced the first version of SeamlessM4T, a foundational multilingual and multitask model that delivered state-of-the-art results for translation and transcription across speech and text. Building on this work, our improved model, SeamlessM4T v2, serves as the foundation for our new SeamlessExpressive and SeamlessStreaming models. It features a new architecture with a non-autoregressive text-to-unit decoder that delivers improved consistency between text and speech output.
More model details
Learn more about the research behind Seamless Communication
Try the SeamlessExpressive demo
Try the SeamlessExpressive demo to hear how you sound in a different language while maintaining elements of your expression and tone.
Our approach to research
Open innovation
We believe in the power of collaboration and open research to break down communication barriers. To enable our fellow researchers to build upon this work, we’re publicly releasing the full suite of Seamless Communication models, along with metadata, data and tools.
Safety and responsibility
We’re dedicated to promoting a safe and responsible AI ecosystem. We have taken a number of steps to improve the safety of our Seamless Communication models, including significantly reducing the impact of hallucinated toxicity in translations and implementing a custom watermarking approach for audio outputs from our expressive models.