'Fictitious references' and 'significant inaccuracies' could hinder ChatGPT's medical writing career
Despite wide praise from its users, ChatGPT left much to be desired when experts recently compared the chatbot’s medical writing with that of seasoned professionals.
Since its release in 2022, ChatGPT has been on a campaign to charm users in all corners of the internet, including medical specialists. In recent months, ChatGPT was published in the journal Radiology, wrote a nuclear medicine report that garnered much enthusiasm, and generated relevant and (mostly) accurate recommendations when prompted to discuss breast cancer screening guidelines.
While it has been shown to provide accurate and relevant medical information in numerous contexts, oversight remains crucial. A recent example of why this is the case was detailed in a new paper in Skeletal Radiology.
There, experts described their experience with the chatbot, sharing that it frequently provides false data from fictitious sources—something that could be problematic for untrained readers who come across ChatGPT-generated articles.
“ChatGPT is able to generate coherent research articles, which on initial review may closely resemble authentic articles published by academic researchers,” explained corresponding author Rajesh Botchu, of the Department of Musculoskeletal Radiology at the Royal Orthopedic Hospital in the United Kingdom, and colleagues.
Botchu and colleagues compared ChatGPT-generated academic radiology articles with similar human-written work that was either already published or under review. Two fellowship-trained musculoskeletal radiologists reviewed the articles and rated them on a scale from 1 to 5, with a score of 1 indicating poor quality and inaccuracies and 5 indicating excellent quality.
Of the five ChatGPT articles included in the analysis, four were “significantly inaccurate with fictitious references.” The remaining paper was well written and included good information, the authors noted, but it also cited fictitious references.
These findings add to the mounting concerns regarding the potential harm that could come from inaccurate medical information produced by ChatGPT, the authors suggested, adding that to the untrained reader, the information provided could be perceived as legitimate.
The study abstract is available here.