American-made GPT-4 knows no borders, translates radiology reports into different languages
GPT-4, one of the latest models powering OpenAI’s popular chatbot ChatGPT, can ably simplify radiology reports so that patients can more easily understand them, even when English is not their first language.
OpenAI is an American artificial intelligence company and, as such, ChatGPT in its various forms has been tested mostly in English-speaking contexts. While the benefits of the LLM have been demonstrated numerous times in the United States, new research indicates its performance holds up in non-English-speaking healthcare settings as well.
Published in the European Journal of Radiology, the study highlights the potential of GPT-4 to enhance radiologists’ communication of findings in clinical environments across the pond. The research took place in the Netherlands, where laws have required healthcare institutions to give patients open access to their electronic health records since 2020. This is the first study to test GPT-4's applicability in this sort of global context.
“This could be a crucial addition to the EHR’s open patient portal because it is well known that patients tend to overestimate their true understanding of health information, resulting in incorrect understanding of reports,” Denise E. Hilling, with Erasmus MC Cancer Center at University Medical Center Rotterdam, and colleagues note. “To our knowledge this is the first study that demonstrates that EHR information can be translated by GPT-4 in another language than English, underscoring the unique contribution of the RADiANT (Radiology Artificial Intelligence Navigation and Translation) tool in bridging this gap globally.”
For their work, researchers meticulously engineered prompts that tasked GPT-4 to translate radiology reports into lay language that patients could more easily comprehend. Two abdominal radiologists reviewed the LLM’s outputs for accuracy, completeness and patient suitability, while a third rad validated the final versions. The resultant reports were presented to 12 colorectal cancer patients, who were later interviewed on their understanding of the reports’ contents.
According to the radiologists, GPT-4's simplified reports were highly accurate, earning an average score of 3.3 out of a possible 4 points. The simplified reports provided substantial benefits for patients as well: comprehension of the original reports was scored at 2.0, but climbed to 3.28 for GPT-4's simplified reports and 3.50 for its summaries. Overall, patients rated their satisfaction with the LLM reports at 8.3 out of 10, with most signaling a preference for the simplified versions.
“The simplification of complex medical language plays a crucial role in fostering greater patient literacy, which is an essential step toward empowering patients to actively engage in their care,” the authors write. “This responsible use of AI, by providing accessible and understandable information, supports informed decision-making and enhances the overall quality of healthcare communication.”
With continued validation, refinement and calibration, the group is hopeful that LLMs like GPT-4 can significantly improve both provider and patient experiences across healthcare systems.
The study abstract is available here.