Large Language Model

How Llama is helping Saama deliver new possibilities in personalized medicine and data-driven care

January 14, 2025

OpenBioLLM, a series of fine-tuned Llama models from life sciences company Saama, streamlines tasks that can accelerate clinical trials and potentially create new possibilities in personalized medicine. These models generate clinical trial protocols, clinical study reports, and other necessary documents to accelerate protocol generation and data analysis, bringing potentially life-saving treatments to patients sooner. They also improve diagnostic accuracy and treatment planning through efficient information processing, offering doctors and patients data-backed insights to help them make decisions about care.

“As an open source model, OpenBioLLM is accessible to researchers and healthcare providers worldwide, and the real-world impact has been significant,” says Malaikannan Sankarasubbu, Chief Technology & AI Officer at Saama.

The two Saama models, OpenBioLLM-8B and OpenBioLLM-70B, leverage Llama 3’s architecture to expedite the extraction of insights from clinical trial documents, data analysis, clinical protocol generation, and medical knowledge graph reasoning.

OpenBioLLM has been widely adopted in clinical development across biomedical and healthcare applications, facilitating research and analysis, data management, and operational efficiency. The models can aid drug discovery and support genomics analysis. Other researchers are building on them for their own work, including in a recent paper presented at an Association for Computational Linguistics conference.

“The tangible impacts showcase how AI, specifically Llama-based models, can revolutionize healthcare and life sciences, potentially improving patient outcomes and saving lives,” Sankarasubbu says. “Our commitment to open source development has allowed us to share advancements with the broader scientific community, fostering collaboration and innovation in biomedical AI. These models are paving the way for highly personalized medical care.”

Building with Llama in complex use cases

Saama's use of Llama has evolved significantly, expanding to complex use cases such as protocol generation, medical knowledge graph reasoning, and clinical trial document analysis. The team developed specialized models for different medical subjects and significantly improved performance across biomedical tasks when it scaled its models to 8B and 70B parameters with the release of Llama 2 and 3. The company is currently exploring multimodal applications, integrating Llama-based models with medical imaging and genomics data.

To ensure privacy and compliance in the highly regulated healthcare environment, Saama developed advanced de-identification techniques and secure data handling protocols to adhere to healthcare regulations like HIPAA. Saama’s in-house AI researchers address any challenges that arise to ensure OpenBioLLM maintains its status as a state-of-the-art biomedical language model. The team employed rigorous testing protocols and collaborated with medical professionals to validate model outputs and mitigate biases.
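The article does not describe how Saama's de-identification works, so the following is only a minimal sketch of the general idea: replacing obvious identifiers in free text with placeholder tags before the text reaches a model. The patterns, the placeholder labels, and the function name deidentify are all illustrative assumptions, and a handful of regular expressions is nowhere near sufficient for HIPAA compliance on its own.

```python
import re

# Hypothetical patterns for illustration only; this is NOT Saama's method.
# Real de-identification covers many more identifier types (names,
# addresses, device IDs, and so on) and usually combines rules with models.
PATTERNS = {
    "[MRN]": re.compile(r"\bMRN[:\s]*\d+\b"),
    "[DATE]": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "[PHONE]": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "[EMAIL]": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def deidentify(text: str) -> str:
    """Replace each matched identifier with its placeholder tag."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text


note = "Patient MRN: 123456 seen on 03/14/2024, call 555-123-4567 or jane@example.com."
print(deidentify(note))
```

Keeping the placeholder type (date, phone, and so on) rather than deleting the span preserves sentence structure, which tends to matter when the scrubbed text is later used for training or analysis.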

When the team implemented Llama for biomedical applications, it built on its experience with MedMCQA, a dataset designed to address real-world medical entrance exam questions. The two-stage fine-tuning process involved several key steps, including curating a high-quality medical instruction dataset and creating a Direct Preference Optimization (DPO) dataset using medical expert evaluations. As a fine-tuning framework, the team adapted the Hugging Face Transformers library and TRL module for specific biomedical use cases.
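To make the DPO step more concrete, here is a rough sketch of how expert evaluations might be turned into preference pairs, along with the standard DPO loss for a single pair. The article does not describe the team's data schema or hyperparameters, so the scoring scale, the min_gap threshold, the beta value, and both function names are assumptions made for illustration. The "prompt", "chosen", and "rejected" keys follow the preference format that TRL's DPOTrainer accepts.

```python
import math
from itertools import combinations


def build_preference_pairs(prompt, rated_responses, min_gap=1):
    """Turn expert-rated responses to one prompt into DPO preference pairs.

    rated_responses is a list of (response_text, expert_score) tuples.
    The scoring scale and min_gap are hypothetical, not from the article.
    """
    pairs = []
    for (resp_a, score_a), (resp_b, score_b) in combinations(rated_responses, 2):
        if abs(score_a - score_b) < min_gap:
            continue  # skip pairs the experts rated about equally
        if score_a > score_b:
            chosen, rejected = resp_a, resp_b
        else:
            chosen, rejected = resp_b, resp_a
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one pair, given summed sequence log-probabilities.

    loss = -log(sigmoid(beta * margin)), where margin is how much more the
    policy prefers the chosen response than the frozen reference model does.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


pairs = build_preference_pairs(
    "What is the first-line treatment for condition X?",  # placeholder prompt
    [("response A", 5), ("response B", 3), ("response C", 3)],
)
print(len(pairs), "preference pairs")
```

In practice a list of such records would be loaded into a dataset and passed to a DPO trainer after the supervised instruction-tuning stage. The loss function is shown only to illustrate what that trainer optimizes.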

“A comprehensive approach to fine-tuning enabled the team to create models that excel in biomedical tasks and outperform larger, proprietary models on specific benchmarks,” says Sankarasubbu. “The team used Llama 3 as the base model for both 8B and 70B parameter versions.”

The open source path to success

Saama opened its AI research lab in 2017, enabling collaborative innovation with talented developers and researchers worldwide. Sankarasubbu credits open source as fundamental to Saama’s success.

“Open source is poised to revolutionize biomedical AI, fostering a more inclusive, innovative ecosystem and democratizing access to advanced healthcare technologies,” Sankarasubbu says.

Connections with universities have helped make Saama’s open source projects and papers both more practical and more impactful. The company’s approach to knowledge sharing includes publishing research papers at top-tier conferences and open sourcing many GitHub projects, enabling it to contribute to and benefit from the global knowledge pool. Major AI players have used its open source contributions, including datasets and benchmarks.

“The positive response and appreciation we've received have reinforced our commitment to an open research culture,” Sankarasubbu says. “Collaborative approaches lead to faster advancements, and aligning open source projects with medical guidelines ensures responsible innovation in healthcare AI.”

As the Llama ecosystem evolves, Saama anticipates expanding its use of Llama, with regular model upgrades based on each new iteration, including Llama 3.1 and future releases.

