Meta's new large language model excels at board-style radiology prompts

Hannah Murphy | August 13, 2024 | Health Imaging | Imaging Informatics

Meta Llama 3—an open-source large language model—may soon be giving other LLMs a run for their money in the medical field, according to new data published in the journal Radiology.

The LLM recently performed on par with larger proprietary models, such as GPT-4, on a set of board-style radiology questions. Though proprietary models have shown great promise within radiology, they require that data be sent outside of hospital settings, which raises privacy concerns. That, combined with their sometimes inconsistent performance after updates, limits users’ trust in the models’ reliability.

Llama 3’s performance highlights the potential of open-source models to address some of these limitations, the authors of the new paper suggest.

“The development of open-source LLMs offers a solution that allows for local operation within hospitals, improving privacy and stability when the LLM system is not maintained and manually updated by staff,” corresponding author Lisa C. Adams, from the Department of Diagnostic and Interventional Radiology at Technical University Munich, and colleagues note.

Proprietary models have typically outperformed open-source LLMs, but Meta AI’s latest model has 70 billion parameters, positioning it well to compete with more established models. To test this theory, experts prompted Meta Llama 3 and several other LLMs, including OpenAI’s GPT-4 Turbo and GPT-3.5 Turbo, Anthropic’s Claude 3 Opus and Google DeepMind’s Gemini Ultra, to answer 50 questions from the American College of Radiology’s 2022 in-training test, in addition to 85 new board-style questions not previously used to train other LLMs.

The models’ performances varied widely. On the 50 ACR test questions, Llama 3 performed best among the open-source models, with 74% accuracy. Its performance was in line with that of GPT-4 Turbo and Claude 3 Opus, both of which achieved 78% accuracy.

However, Llama 3 outperformed these models on the additional board-style questions, answering 68 of 85 correctly. In comparison, GPT-3.5 Turbo achieved 61 correct answers.
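The ACR results above are given as percentages, while the board-style results are given as raw counts. A minimal Python sketch, using only the counts reported in this article, puts the two on the same scale. The helper name `accuracy_percent` and the one-decimal rounding are illustrative assumptions, not part of the study.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total, 1)


# Counts reported in the article for the 85 new board-style questions.
# The dictionary layout is a hypothetical convenience, not the study's data format.
board_style_results = {
    "Llama 3": (68, 85),
    "GPT-3.5 Turbo": (61, 85),
}

for model, (correct, total) in board_style_results.items():
    print(f"{model}: {correct}/{total} = {accuracy_percent(correct, total)}%")
```

By this calculation, 68 of 85 works out to 80.0% and 61 of 85 to roughly 71.8%.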

“This demonstrates the growing capabilities of open-source LLMs, which offer privacy, customization, and reliability comparable to that of their proprietary counterparts, but with far fewer parameters, potentially lowering operating costs when using optimization techniques such as quantization,” the group suggests, adding that planned expansions of Llama 3 in the near future are encouraging.

“The growing maturity and competitiveness of open-source models make them promising candidates for future research and application in radiology.”

The study abstract is available here.

Hannah Murphy

In addition to her background in journalism, Hannah also has patient-facing experience in clinical settings, having spent more than 12 years working as a registered rad tech. She began covering the medical imaging industry for Innovate Healthcare in 2021.
