ChatGPT's radiology board success has experts rethinking resident education

Following ChatGPT’s strong performance on a mock radiology board exam, experts are calling for radiology training programs to rethink how they are educating residents. 

In an editorial published in Radiology alongside ChatGPT’s test results, authors Ana P. Lourenco, Grayson L. Baird and Priscilla J. Slanetz, from the department of diagnostic imaging at the Warren Alpert Medical School of Brown University and the department of radiology at Boston University Medical Center, suggested that while the chatbot’s scores do reflect its impressive strengths, its weaknesses could present a unique opportunity for educators [1]. 

The authors pointed to the chatbot’s performance on the mock board: it did well on questions requiring low-order thinking and worse on questions intended to prompt more intense cognitive processing, such as those involving description of imaging findings, calculation and classification, and application of concepts [2].

“Basically, large language models scoured the internet for words and did well on low-hanging fruit (low taxonomy) questions,” the authors noted. “The key finding is that ChatGPT did not perform well on higher-order taxonomy material and potentially may not be able to become proficient in it.” 

Low-order learning takes less time than higher-order learning. ChatGPT is very skilled in low-order learning: it can answer multiple-choice questions about radiology quite efficiently, despite never having been trained with radiology-specific data. However, unlike humans, the chatbot is incapable of thinking critically; although it can answer a question correctly, it often cannot explain how it arrived at its conclusion, which is a critical skill for emerging radiologists.

Unfortunately, low-order learning comes first, and radiology residents must dedicate a considerable amount of time to it before they can reach proficiency in higher-order learning. This focus, the authors suggested, can reduce residents’ time spent building critical thinking skills that set them apart from AI.

“In other words, a human usually learns to handle higher taxonomy content by mastering the low-hanging fruit first,” the group explained. “The metacognitive knowledge is attained only after mastering the factual, conceptual, and procedural knowledge.” 

Because of the way radiology boards are formatted, residents are required to memorize facts and concepts that enable them to spot correct answers on multiple-choice questions, rather than being prompted to reason. The authors cautioned that this “teach to test” approach inhibits residents' ability to grow cognitively.

“What is the value of having residents take an examination that ChatGPT can successfully pass (or can surpass the performance of residents)? If we are not careful, AI may ultimately outperform residents—and not just on examinations—if we only teach to the test,” the group wrote.

Their solution? In short, they suggested that educators abandon “teach to test” methods and instead encourage students “to read and spend more time at the view box!” In practice, this means creating evaluations that demand higher-order knowledge, clinical judgment and critical problem-solving skills.

The American Board of Radiology’s (ABR) recent announcement that it will return to oral board exams is one example of how evaluation can be improved. The authors described the ABR move as a “positive” step in the right direction.

“As AI evolves, radiology education and evaluation must also evolve, adapting in ways that prepare residents over and above the capability of AI,” the authors concluded. 

The study abstract detailing ChatGPT’s test results and the accompanying editorial were both published in Radiology.

Hannah Murphy

In addition to her background in journalism, Hannah also has patient-facing experience in clinical settings, having spent more than 12 years working as a registered rad tech. She began covering the medical imaging industry for Innovate Healthcare in 2021.
