Meta's new large language model excels at board-style radiology prompts

Meta Llama 3, an open-source large language model, may soon be giving other LLMs a run for their money in the medical field, according to new data published in the journal Radiology.

The LLM recently performed on par with larger proprietary models, like GPT-4, on a set of board-style radiology questions. Though proprietary models have shown great promise within radiology, they require that data be sent outside of hospital settings, which raises privacy concerns. That, combined with their sometimes inconsistent performance after updates, limits users’ trust in the models’ reliability.

Llama 3’s performance highlights the potential for open-source models to address some of these limitations, authors of the new paper suggest.

“The development of open-source LLMs offers a solution that allows for local operation within hospitals, improving privacy and stability when the LLM system is not maintained and manually updated by staff,” corresponding author Lisa C. Adams, from the Department of Diagnostic and Interventional Radiology at Technical University Munich, and colleagues note. 

Proprietary models have typically outperformed open-source LLMs, but Meta AI’s latest model has 70 billion parameters, positioning it well to compete with more established models. To test this theory, experts prompted Meta Llama 3 and several other LLMs (OpenAI’s GPT-4 Turbo and GPT-3.5 Turbo, Anthropic’s Claude 3 Opus and Google DeepMind’s Gemini Ultra) to answer 50 questions from the American College of Radiology’s 2022 in-training test, in addition to 85 new board-style questions not previously used to train other LLMs.

The models’ performances varied widely. For the 50 ACR test questions, Llama 3 performed the best out of all the open-source models, with 74% accuracy. Its performance was in line with both GPT-4 Turbo and Claude 3 Opus, both of which achieved 78% accuracy. 

However, Llama 3 outperformed these models on the additional board-style questions, answering 68 out of 85 correctly. In comparison, GPT-3.5 achieved 61 correct answers. 
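To put the counts above on the same percentage scale as the ACR results, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper name `accuracy_percent` and the dictionary layout are assumptions made for this example, not something taken from the study, and the only inputs are the two counts reported in this article.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correctly answered questions as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts reported in the article for the 85 new board-style questions.
BOARD_STYLE_TOTAL = 85
board_style_correct = {"Llama 3": 68, "GPT-3.5": 61}

# Convert each raw count to a percentage, rounded to one decimal place.
board_style_accuracy = {
    model: round(accuracy_percent(correct, BOARD_STYLE_TOTAL), 1)
    for model, correct in board_style_correct.items()
}

for model, pct in board_style_accuracy.items():
    print(f"{model}: {pct}% of {BOARD_STYLE_TOTAL} questions")
```

By this calculation, 68 of 85 works out to 80% and 61 of 85 to roughly 72%.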

“This demonstrates the growing capabilities of open-source LLMs, which offer privacy, customization, and reliability comparable to that of their proprietary counterparts, but with far fewer parameters, potentially lowering operating costs when using optimization techniques such as quantization,” the group suggests, adding that planned expansions of Llama 3 in the near future are encouraging. 

“The growing maturity and competitiveness of open-source models make them promising candidates for future research and application in radiology.” 

The study abstract is available here

Hannah Murphy

In addition to her background in journalism, Hannah also has patient-facing experience in clinical settings, having spent more than 12 years working as a registered rad tech. She joined Innovate Healthcare in 2021 and has since put her unique expertise to use in her editorial role with Health Imaging.

