Radiologists deliver fewer false-positive results than advanced AI models

A new retrospective study out of Denmark may expose the limitations of artificial intelligence (AI) in assisting radiologists with interpreting chest radiographs. Researchers evaluated the diagnostic accuracy of different AI tools in detecting certain diseases of the lungs – and the results show human radiologists may have an advantage. The full analysis was published in Radiology. [1]

The study's authors compared four commercially available AI tools against a pool of 72 radiologists. They evaluated 2,040 consecutive adult chest X-rays taken at four Danish hospitals in 2020. The median age of the patient group was 72 years, and 32.8% of the chest X-rays exhibited at least one target finding.

The chest X-rays were assessed for three common findings: airspace disease (often signaling pneumonia or lung edema), pneumothorax (collapsed lung), and pleural effusion (fluid around the lungs).

The AI tools demonstrated sensitivity rates ranging from 72% to 91% for airspace disease, 63% to 90% for pneumothorax, and 62% to 95% for pleural effusion. A crucial pattern emerged in the specifics: the AI tools were highly specific and accurate on chest X-rays that were normal or showed a single finding, but that reliability dropped significantly on X-rays with four or more findings, indicating the tools struggled when multiple abnormalities appeared on a single image.

The study also found that AI sensitivity was lower for vague airspace disease and for smaller pneumothoraces and pleural effusions compared with larger findings. In short, the findings suggest that AI does not perform as well as a trained, experienced radiologist in real-life scenarios involving diverse patient scans.

“The AI tools showed moderate to high sensitivity comparable to radiologists for detecting airspace disease, pneumothorax and pleural effusion on chest X-rays,” lead researcher Louis Plesner, MD, PhD, of Herlev and Gentofte Hospital in Copenhagen, Denmark, said in a statement. “However, they produced more false-positive results (predicting disease when none was present) than the radiologists, and their performance decreased when multiple findings were present and for smaller targets.”

Plesner stressed that being able to exclude disease is crucial to making a proper diagnosis, as false positives can be costly and burdensome: “AI systems seem very good at finding disease, but they aren't as good as radiologists at identifying the absence of disease, especially when the chest X-rays are complex,” he said. “Too many false-positive diagnoses would result in unnecessary imaging, radiation exposure, and increased costs.”
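As a rough illustration of the trade-off Plesner describes – using hypothetical counts, not figures from the study – here is a minimal Python sketch of how sensitivity and specificity are computed from reader results, and how extra false positives drag specificity down even when disease detection looks comparable:

```python
# Illustrative sketch only: the counts below are hypothetical and
# NOT taken from the Radiology study; they show the arithmetic behind
# the sensitivity/specificity figures reported in this article.

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true disease cases the reader flags: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Fraction of disease-free exams correctly called normal: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical reader A (radiologist-like): few false positives.
print(f"Reader A: sensitivity={sensitivity(90, 10):.2f}, "
      f"specificity={specificity(950, 50):.2f}")

# Hypothetical reader B (AI-like): similar sensitivity, but four times
# as many false positives, so specificity falls from 0.95 to 0.80.
print(f"Reader B: sensitivity={sensitivity(92, 8):.2f}, "
      f"specificity={specificity(800, 200):.2f}")
```

In this hypothetical comparison, both readers catch roughly the same share of disease, yet reader B's extra false positives mean many more healthy patients would be sent for the unnecessary follow-up imaging Plesner warns about.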

He also highlighted how previous studies showing AI outperforming radiologists were limited: radiologists were often given no access to a patient's medical history or prior imaging studies, and were instead tasked with making a diagnosis based solely on a review of the images themselves – something AI models can do quite well.

“In everyday practice, a radiologist’s interpretation of an imaging exam is a synthesis of these three data points,” Plesner said. “We speculate that the next generation of AI tools could become significantly more powerful if capable of this synthesis as well, but no such systems exist yet.”

Chad Van Alstin, Health Imaging | Health Exec

Chad is an award-winning writer and editor with over 15 years of experience working in media. He has a decade-long professional background in healthcare, working as a writer and in public relations.
