AI generates concise, accurate radiology reports that rival humans’
The GPT-4 artificial intelligence model, the backbone of the popular chatbot ChatGPT, can generate radiology reports comparable in quality to radiologists’ reports while being notably more concise.
The results of a study commissioned by the National Institutes of Health (NIH) suggest GPT-4 has the potential to contribute significantly to the standardization and optimization of radiology reporting, producing reports more concise than those typically written by a trained radiologist. The results are published in European Radiology. [1]
Researchers from the NIH aimed to compare the quality and content of radiologist-generated reports with those generated by GPT-4. For the study, 100 anonymized radiology reports were randomly selected and analyzed. Each report was then processed by GPT-4, which generated a new report of its own by drawing on its knowledge base.
Researchers assessed the similarities and differences between the radiologist-generated and AI-generated reports, focusing on aspects such as clarity, ease of understanding, structure, word count, and content similarity. The GPT-4 reports were similar in content to the radiologists’ reports, with similar word usage and sentence structure.
However, the AI-generated reports were more concise without sacrificing accuracy, and although the study did not measure the speed of report generation, GPT-4 produced its shorter reports very quickly. Sentence length in the GPT-4 reports was at times inconsistent and variable, but this was limited to anecdotal cases. In the larger picture, the AI managed to say more with less, averaging 34.53 fewer words and 174.22 fewer characters per report.
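The length comparison described above amounts to simple per-report arithmetic. The sketch below is purely illustrative (not the study’s actual code, and the sample reports are invented): it computes the word- and character-count differences between a radiologist-written report and an AI-generated counterpart, the two metrics the study averaged across 100 report pairs.

```python
def length_metrics(radiologist_report: str, ai_report: str) -> dict:
    """Return word- and character-count differences (radiologist minus AI).

    Positive values mean the AI report is shorter, matching the direction
    reported in the study (34.53 fewer words, 174.22 fewer characters
    per report on average).
    """
    return {
        "word_diff": len(radiologist_report.split()) - len(ai_report.split()),
        "char_diff": len(radiologist_report) - len(ai_report),
    }


# Hypothetical example reports for illustration only.
human = ("The lungs are clear bilaterally. No focal consolidation, "
         "pleural effusion, or pneumothorax is identified.")
ai = "Lungs clear. No consolidation, effusion, or pneumothorax."

print(length_metrics(human, ai))
```

Averaging such per-pair differences over the full set of reports yields the study’s summary figures.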
“The findings of this study suggest that GPT-4 (Generative Pre-trained Transformer 4), an advanced AI model, has the potential to significantly contribute to the standardization and optimization of radiology reporting, offering improved efficiency and communication in clinical practice,” wrote the authors led by Amir Hasani of the NIH.
While Hasani and his colleagues warn that any implementation of GPT-4 or ChatGPT in a clinical environment raises major ethical considerations, their study adds to a growing body of work suggesting that GPT-4 could be a reliable tool for standardizing radiology reports, improving efficiency and reducing the clerical burden of writing them manually.