Providing chatbots with guideline context significantly improves their imaging recs

With proper prompts and appropriate context, chatbots can play a valuable role in imaging referrals. 

Using specialized information on the American College of Radiology’s imaging guidelines, experts were recently able to fine-tune OpenAI’s GPT-4 so that it provided exam recommendations in line with those delivered by human medical professionals. This big step toward greater reliability was achieved, in part, through zero-shot learning, the authors of a new paper describing the chatbot’s enhancements explained. 

“Zero-shot learning refers to the ability of a model to make accurate predictions on tasks it has not been explicitly trained on by leveraging generalized knowledge and integrated textual information during prompting. This approach is also known as retrieval-augmented generation,” corresponding author Alexander Rau, with the Department of Diagnostic and Interventional Radiology at the University of Freiburg in Germany, and co-authors noted. “This appropriateness criteria context-aware GPT (accGPT) surpassed generic chatbots and general radiologists in applying the ACR appropriateness criteria to clinical referral notes.” 

For the study, the team refined GPT-3.5-Turbo by incorporating specialized knowledge of the ACR guidelines. Researchers then upgraded the chatbot to GPT-4 and developed an enhanced prompting strategy to test the LLM’s ability to apply ACR appropriateness guidelines to prompts related to imaging referrals based on clinical notes. 
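
The general idea of supplying guideline context at prompt time can be illustrated with a minimal sketch. This is not the authors' pipeline, which the article does not describe in detail. The snippet store, the token-overlap retrieval and every function name below are assumptions made for illustration, and the guideline text is placeholder wording rather than actual ACR content. The example stops at building the prompt and does not call any model.

```python
import re
from collections import Counter

# Hypothetical placeholder snippets keyed by topic; NOT actual ACR guideline text.
GUIDELINE_SNIPPETS = {
    "headache": "Placeholder guidance text for headache referrals.",
    "low back pain": "Placeholder guidance text for low back pain referrals.",
    "chest pain": "Placeholder guidance text for chest pain referrals.",
}


def tokenize(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def retrieve(note, snippets, k=1):
    """Return the k snippets whose topic shares the most tokens with the note."""
    note_tokens = tokenize(note)

    def overlap(item):
        topic, _body = item
        # Multiset intersection counts the tokens shared by note and topic.
        return sum((note_tokens & tokenize(topic)).values())

    ranked = sorted(snippets.items(), key=overlap, reverse=True)
    return ranked[:k]


def build_prompt(note, snippets, k=1):
    """Assemble a prompt that places the retrieved context before the referral note."""
    context = "\n".join(
        f"[{topic}] {body}" for topic, body in retrieve(note, snippets, k)
    )
    return (
        "Use only the guideline context below to recommend imaging, "
        "or state that no imaging is needed.\n\n"
        f"Guideline context:\n{context}\n\n"
        f"Referral note:\n{note}\n"
    )


# Hypothetical referral note, invented for this example.
note = "45-year-old with acute low back pain after lifting, no red flags."
prompt = build_prompt(note, GUIDELINE_SNIPPETS)
print(prompt)
```

A real system would likely replace the token-overlap ranking with a stronger retrieval method and send the resulting prompt to a language model. The sketch only shows how retrieved guideline text and a clinical note can be combined without retraining the model.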

The context-aware chatbot outperformed the generic versions of GPT-3.5-Turbo and GPT-4 in providing “usually or may be appropriate” recommendations based on ACR’s guidelines. Its performance was also superior to that of human radiologists, the group noted.  

What's more, it surpassed GPT-3.5-Turbo and general radiologists in providing “usually appropriate” recommendations, and appropriately identified cases when no imaging was needed—something that requires “a profound understanding of clinical contexts and guidelines,” and that was a difficult task for the other chatbots. 

Its recommendations were consistently accurate, indicating potential for future use in imaging referral guidance, the group suggested. 

“Higher consistency is crucial for clinical decision-making as it ensures reliability and reduces variability in diagnostic recommendations. Future research might investigate the performance of other LLM in this regard, as initial results in radiological decision support are promising,” the authors wrote. 

The group suggested that, alongside the links and references the chatbot provided with its answers, the contextual adjustments could improve trust in its outputs. This also could allow for more individualized diagnostic workups and greater reliability in chatbots' recommendations, the team indicated. 


Hannah Murphy

In addition to her background in journalism, Hannah also has patient-facing experience in clinical settings, having spent more than 12 years working as a registered rad tech. She began covering the medical imaging industry for Innovate Healthcare in 2021.
