In medicine, access to the right information at the right time in the right place can determine everything—from deciding to take preventative measures to seeking care when needed, receiving a timely diagnosis, and taking medications as prescribed.
Meditron, a suite of open-source large multimodal foundation models tailored to the medical field and designed to assist with clinical decision-making and diagnosis, was built on Meta Llama 2 and trained on carefully curated, high-quality medical data sources with continual input from clinicians and experts in humanitarian response.
Researchers at EPFL’s School of Computer and Communication Sciences and Yale School of Medicine teamed up on the project, working closely with humanitarian organizations like the International Committee of the Red Cross (ICRC). Meditron has been downloaded over 30,000 times within its first months of release, filling an important gap in innovation in low-resource medical settings. And following last week’s release of Meta Llama 3, the team fine-tuned the new 8B model within 24 hours to deliver Llama-3[8B]-MeditronV1.0, which outperforms all state-of-the-art open models within its parameter class on standard benchmarks such as MedQA and MedMCQA.
“Foundation models have become modern-day intellectual and cultural assets,” says Yale professor Mary-Anne Hartley, who is co-leading the project. “When applied to the medical domain, they have the potential to provide life-saving advice and guidance. Yet the lowest-resource settings have the most to gain and remain the least represented.”
LLMs like Llama can compress complex information into an accessible conversational interface. Meditron adapted Llama 2 to ensure that the information provided better aligns with evidence-based care, contextually aware recommendations, and professional standards. The Meditron suite has the potential to serve crucial needs in a variety of settings, from emergency scenarios that require a fast and accurate medical response to underserved areas where healthcare workers need help diagnosing and treating patients.
The hope, says Hartley, is that releasing it fully open-access and open-source—from data to weights, with clear getting-started documentation—can empower innovation in resource-constrained settings to better ensure representation and create equitable access to medical knowledge.
“Low-resource settings should not be forced to ‘reinvent the wheel’ in order to have their populations and needs represented in this critical technology,” Hartley says.
When funding is an issue, start small and focus on quality
Funding can be a major challenge for anyone, but particularly for groups working in humanitarian and low-resource settings. Hartley says the team chose not to commercialize in order to maintain the neutrality required for impartial validation.
To keep costs down, experiments started on the smaller Llama 2 7B to narrow down the optimal pretraining data mixtures and parameters before scaling up to 70B. That conservative approach is also why the team released both 7B and 70B Meditron models. While Meditron 7B is less performant, it is still very useful for modeling experimental scale-up, Hartley notes.
The multimodal implementation has followed a similar path. Meditron 7B integrates image interpretation, and while it is extremely promising (outperforming the 562B Med-PaLM M on medical image interpretation), a 70B version would perform even better and deserves investment, Hartley says.
This focus on quality over quantity also meant the team spent most of its time carefully curating medically validated textual documents representing evidence-based guidelines in high- and low-resource settings. Continued pretraining, which updates all of the model's parameters rather than adjusting only a subset as in fine-tuning, minimized the risk of contamination and bias from the general text corpus on which Llama was trained, Hartley says. It also maximized the model's retention of medical knowledge.
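To make the distinction concrete, here is a minimal toy sketch in Python of updating every parameter versus updating only a subset. It is an illustration under stated assumptions, not the team's training code: the parameter names, values, gradients, and the plain gradient-descent step are all hypothetical, and a real run would use a distributed trainer rather than Python lists.

```python
# Toy illustration only: parameter names, values, gradients, and the
# update rule are hypothetical and not taken from the Meditron codebase.
from typing import Dict, List, Optional, Set

Params = Dict[str, List[float]]


def sgd_step(params: Params, grads: Params, lr: float,
             trainable: Optional[Set[str]] = None) -> Params:
    """Return updated parameters; tensors outside `trainable` stay frozen.

    trainable=None means every tensor is updated, as in continued pretraining.
    """
    updated: Params = {}
    for name, values in params.items():
        if trainable is None or name in trainable:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen: copied unchanged
    return updated


def changed_tensors(before: Params, after: Params) -> List[str]:
    """Names of the tensors whose values differ between two snapshots."""
    return sorted(name for name in before if before[name] != after[name])


params: Params = {
    "embedding": [0.1, 0.2],
    "attention": [0.3, 0.4],
    "mlp": [0.5, 0.6],
    "output_head": [0.7, 0.8],
}
grads: Params = {name: [1.0, 1.0] for name in params}

# Continued pretraining: every tensor receives the update.
full = sgd_step(params, grads, lr=0.01)
# Subset-only tuning: only the (hypothetical) output head is trained.
subset = sgd_step(params, grads, lr=0.01, trainable={"output_head"})

print(changed_tensors(params, full))    # all four tensors
print(changed_tensors(params, subset))  # only 'output_head'
```

In the subset case, whatever the frozen tensors learned from the general corpus is carried over untouched, which is the behavior the full-parameter approach described above is meant to avoid.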
Because continued pretraining on a multi-GPU, multi-node cluster is very technically challenging, the team integrated the Llama architecture into a high-performing distributed trainer, Megatron-LM. Recognizing that this is an issue many others could also face, they made sure to open-source the adapted version of Megatron.
Putting Meditron to the test with open validation and evaluation
Hartley says that by far the most exciting real-world result of the Meditron work is the massive scale of interest among medical professionals and humanitarian organizations across the world in participating in the Meditron MOOVE (Massive Online Open Validation and Evaluation).
Doctors from around the world, especially in low-resource settings, are asking Meditron challenging questions and critically evaluating its answers so the team can adapt it accordingly.
Meditron is currently the best-performing open-source LLM for medicine according to the leading benchmarks in the field, such as question-answering of biomedical exams, Hartley says. The team opted for a MOOVE to make the community aware that these benchmarks do not fully represent the real-world clinical practice of medicine or the challenges in low-resource settings and humanitarian response.
“That these time-constrained professionals are volunteering their time in our open-source community to independently validate Meditron is a recognition of its value,” Hartley says. “We are in a unique position to take all this feedback and incorporate it in a new model. We hope funders will recognize the social and commercial value of investing in our academic open-source initiative.”
Open-source technology has a time-tested history of empowering innovation and, critically, making it equitably accessible, Hartley says. “We are constantly hearing from researchers in low-resource settings about how Meditron has enabled their research,” she explains. “While open source is not new, the scale and cost of the contribution are. We need to be more audacious in seeking neutral philanthropic support for efforts like these.”