Providing chatbots with guideline context significantly improves their imaging recs

With proper prompts and appropriate context, chatbots can play a valuable role in imaging referrals. 

Using specialized information on the American College of Radiology’s imaging guidelines, experts were recently able to adapt OpenAI’s GPT-4 so that it provided exam recommendations in line with those delivered by human medical professionals. This big step toward greater reliability was achieved, in part, through zero-shot learning, authors of a new paper describing the chatbot’s enhancements explained.

“Zero-shot learning refers to the ability of a model to make accurate predictions on tasks it has not been explicitly trained on by leveraging generalized knowledge and integrated textual information during prompting. This approach is also known as retrieval-augmented generation,” corresponding author Alexander Rau, with the Department of Diagnostic and Interventional Radiology at the University of Freiburg in Germany, and co-authors noted. “This appropriateness criteria context-aware GPT (accGPT) surpassed generic chatbots and general radiologists in applying the ACR appropriateness criteria to clinical referral notes.” 

For the study, the team refined GPT-3.5-Turbo by incorporating specialized knowledge of the ACR guidelines. Researchers then upgraded the chatbot to GPT-4 and developed an enhanced prompting strategy to test the LLM’s ability to apply ACR appropriateness guidelines to prompts related to imaging referrals based on clinical notes. 
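The general idea of supplying guideline context at prompt time can be illustrated with a minimal sketch. This is not the authors' implementation: the snippet store, the keyword-overlap retrieval and the function names `retrieve` and `build_prompt` are all assumptions made for illustration, and the guideline text is placeholder wording rather than actual ACR content. No model is called; the sketch only shows how a referral note might be paired with retrieved context before being sent to an LLM.

```python
import re

# Hypothetical placeholder snippets keyed by topic; NOT actual ACR guideline text.
GUIDELINE_SNIPPETS = {
    "headache": "PLACEHOLDER guideline text about imaging for headache.",
    "low back pain": "PLACEHOLDER guideline text about imaging for low back pain.",
    "chest pain": "PLACEHOLDER guideline text about imaging for chest pain.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(note, snippets, top_k=1):
    """Rank snippets by how many topic words appear in the referral note.

    Simple keyword overlap stands in for whatever retrieval method a real
    system would use; topics with no overlap are dropped.
    """
    note_tokens = tokenize(note)
    scored = [
        (len(tokenize(topic) & note_tokens), topic, text)
        for topic, text in snippets.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(topic, text) for score, topic, text in scored[:top_k] if score > 0]


def build_prompt(note, snippets, top_k=1):
    """Assemble a prompt that places retrieved context ahead of the note."""
    context = retrieve(note, snippets, top_k)
    context_block = "\n".join(f"[{topic}] {text}" for topic, text in context)
    if not context_block:
        context_block = "(no matching guideline found)"
    return (
        "Use only the guideline context below to recommend imaging, "
        "or state that no imaging is needed.\n\n"
        f"Guideline context:\n{context_block}\n\n"
        f"Referral note:\n{note}\n"
    )


prompt = build_prompt(
    "54-year-old with chronic low back pain, no red flags.",
    GUIDELINE_SNIPPETS,
)
print(prompt)
```

In this sketch the model's weights are never changed; the only thing that varies between a generic and a context-aware request is the text placed in the prompt.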

The context-aware chatbot outperformed the generic versions of GPT-3.5-Turbo and GPT-4 in providing “usually or may be appropriate” recommendations based on ACR’s guidelines. Its performance was also superior to that of human radiologists, the group noted.  

What's more, it surpassed GPT-3.5-Turbo and general radiologists in providing “usually appropriate” recommendations, and appropriately identified cases when no imaging was needed—something that requires “a profound understanding of clinical contexts and guidelines,” and that was a difficult task for the other chatbots. 

Its recommendations were consistently accurate, indicating potential for future use in imaging referral guidance, the group suggested. 

“Higher consistency is crucial for clinical decision-making as it ensures reliability and reduces variability in diagnostic recommendations. Future research might investigate the performance of other LLM in this regard, as initial results in radiological decision support are promising,” the authors wrote. 

The group suggested that, alongside the links and references the chatbot provided with its answers, the contextual adjustments could improve trust in its outputs. This also could allow for more individualized diagnostic workups and greater reliability in chatbots' recommendations, the team indicated. 


Hannah Murphy

In addition to her background in journalism, Hannah also has patient-facing experience in clinical settings, having spent more than 12 years working as a registered rad tech. She joined Innovate Healthcare in 2021 and has since put her unique expertise to use in her editorial role with Health Imaging.
