The Llama Ecosystem: Past, Present, and Future
September 27, 2023

It’s been roughly seven months since we released Llama 1 and only a few months since Llama 2 was introduced, followed by the release of Code Llama. In short, the response from the community has been staggering. We’ve seen a lot of momentum and innovation, with more than 30 million downloads of Llama-based models through Hugging Face and over 10 million of these in the last 30 days alone. Much like PyTorch, Llama has evolved into a platform for the world to build on, and we couldn’t be more excited.

Impact to Date

Several remarkable developments highlight the growth of the Llama community:

  • Cloud usage: Major platforms such as AWS, Google Cloud, and Microsoft Azure have embraced Llama models, and Llama 2’s presence in the cloud is expanding. Today we announced AWS as our first managed API partner for Llama 2: organizations of all sizes can now access Llama 2 models on Amazon Bedrock without having to manage the underlying infrastructure, a step change in accessibility. Demand has followed, with Google Cloud and AWS together seeing more than 3,500 enterprise project starts based on Llama 2 models.
  • Innovators: Innovators and startups are making Llama the foundation of their generative AI products. Tens of thousands of startups are using or evaluating Llama 2, including Anyscale, Replicate, Snowflake, LangSmith, Scale AI, and many others. Companies like DoorDash are using it to experiment at scale ahead of releasing new LLM-powered features.
  • Crowd-sourced optimization: The open source community has embraced our models. To date, the community has fine-tuned and released over 7,000 derivatives on Hugging Face. On average, these fine-tuned variants improve performance on common benchmarks by nearly 10%, with gains of up to 46% on datasets such as TruthfulQA.
  • Developer community: There are now over 7,000 projects on GitHub built on or mentioning Llama. New tools, deployment libraries, methods for model evaluation, and even “tiny” versions of Llama are being developed to bring Llama to edge devices and mobile platforms. Additionally, the community has expanded Llama to support larger context windows, added support for additional languages, and so much more.
  • Hardware support: The hardware community has fully embraced Llama as a key model architecture. Major hardware vendors, including AMD, Intel, Nvidia, and Google, have boosted the performance of Llama 2 through hardware and software optimizations.

The ecosystem is vibrant with participants at every layer of the stack, from server and mobile hardware to cloud platforms, startups, and enterprises.

With the most recent Code Llama release, these models became available on many of these platforms within hours, creating an incredible level of velocity for the community.

It Began as a Fast-Moving Research Project...

Over the last few years, large language models (LLMs) — natural language processing (NLP) systems with billions of parameters — have demonstrated new capabilities such as generating creative text, proving mathematical theorems, predicting protein structures, answering reading comprehension questions, and more. These projects represent clear examples of the significant potential benefits AI can offer to billions of people at scale.

The original project, LLaMA (or Llama 1, as we’ve denoted it most recently), was developed within FAIR by a team mainly focused on formal mathematics. In parallel, that team saw the power of LLMs: a relatively small model, trained with the right scaling laws on highly curated data, could be a powerful foundation for new applications in research. Hence the first generation of Llama was born, and it has since sparked innovation across academia and the world. In fact, within a matter of days, researchers at various academic institutions had fine-tuned much-improved versions of Llama 1 that could follow instructions or handle additional tasks, and from there the community began to innovate in many directions.

But we wanted to make the technology available more broadly. This is where Llama 2 came in.

Why Did We Release Our Models?

As our history shows, we believe deeply in the power of the open source community. We believe that state-of-the-art AI technology is safer and better aligned when it’s open and accessible to everyone.

Additionally, in fast-moving areas with high uncertainty, it pays to build bridges and leverage the innovation that inevitably arises. This was true for PyTorch, where breakthroughs like Stable Diffusion, GPT-3, and GPT-4 continually disrupted the world of AI, and it’s true for Llama as well. For us at Meta, the value flows back along three axes:

Research: New techniques, performance optimizations, tools, and evaluation methods, including work on safety, give Meta leverage to incorporate learnings from the research community more quickly. Many of these communities are also nascent, and collaborating in the open makes it much easier to make progress;

Enterprise and commercialization: The more enterprises and startups build on our technology, the more we can learn about use cases, safe model deployment, and potential opportunities; and

Developer ecosystem: LLMs have fundamentally changed AI development, and new tools and approaches for manipulating, managing, and evaluating models emerge daily. Sharing a lingua franca with the community lets us adopt these technologies quickly, accelerating our internal stack.

But this isn’t new for Meta. Just as with PyTorch and dozens of other publicly released or open source projects, this philosophy is deeply ingrained in our company’s DNA.

The Path Forward

One thing is for certain: The generative AI space moves rapidly, and we’re all learning together about the capabilities and applications of this technology. Meta remains committed to an open approach for today’s AI. Here are a few of the areas of focus for us as we continue on this journey together:

Multimodal: Just as the world isn’t made up entirely of text, AI can embrace new modalities to enable even more immersive generative experiences;

Safety and responsibility: Generative AI has revitalized the world of responsible AI. We will place even greater emphasis on safety and responsibility, developing new tools, building partnerships, and utilizing Llama as a vehicle for our community to continue to learn about how to build safely and responsibly; and

A focus on community: Much like PyTorch, we see this as a community of developers who have a voice, and we want to give them agency and a vehicle to further their innovation. We aim to provide new ways for the community to showcase work, contribute, and tell their stories.

Want to Learn More About the Llama Family?

During the Meta Connect keynote, we talked a lot about our Llama models and the future of open access. From our sessions to hands-on workshops, we’re excited to share our latest developments with you.

Here are some ways you can dive deeper and learn more:

  1. Download the model and interact with Llama 2.
  2. Attend the Connect Sessions including our workshops on building with Llama models.
  3. Visit ai.meta.com/llama to read the paper, review our responsible use guide and acceptable use policy, and learn more about the partners that help support the Llama ecosystem.
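For readers who follow step 1 and download a Llama 2 chat checkpoint, here is a minimal sketch (our illustration, not from this post) of the chat prompt format the Llama 2 chat models expect, using the `[INST]` and `<<SYS>>` markers from Meta’s reference code. The function name and example strings below are our own.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user turn in the Llama 2 chat template.

    The chat-tuned Llama 2 models were trained to see instructions
    delimited by [INST] ... [/INST], with an optional system message
    enclosed in <<SYS>> ... <</SYS>> at the start of the first turn.
    """
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


# Build a single-turn prompt; the assembled string is what you would
# tokenize and feed to a Llama 2 chat model for generation.
prompt = build_llama2_prompt(
    "You are a helpful assistant.",
    "What is the Llama ecosystem?",
)
print(prompt)
```

Whichever serving stack you use (Amazon Bedrock, Hugging Face, or a local runtime), following this template matters: the chat models were fine-tuned on it, and free-form prompts tend to produce noticeably worse responses.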

Written by:
Joe Spisak
Product Director
Sergey Edunov
Engineering Director
