AI 'not safe' to be implemented as a solo reader for breast cancer screening exams

Despite great progress toward the clinical implementation of artificial intelligence, new data caution against trusting the technology as a single reader in certain screening settings.

The analysis details efforts to deploy a commercially available AI algorithm as a “silent reader” in a double reading breast screening program in the United Kingdom. After using it in this capacity, experts from one of the 14 sites where the algorithm was implemented suggested it is “not currently of sufficient accuracy” to be considered for use in the NHS Breast Cancer Screening Program.

They shared their findings this week in Clinical Radiology. 

To gather data without impacting workflows, the algorithm was used as a “third reader” for nearly 10,000 breast cancer screening mammograms. The team recorded all cases flagged by AI but not by radiologists (positive discordant) and compared them to cases flagged by the human readers but not by the algorithm (negative discordant).

A total of 1,135 positive discordant cases were identified. Of those, one patient was recalled for additional imaging but was not diagnosed with cancer. The team also reviewed 139 negative discordant cases, revealing eight cancers that were flagged by radiologists but not by the AI. During its deployment, the AI did not detect any additional cancers missed by the radiologists.

The algorithm's recall rate was higher than the radiologists', likely because it was not trained to evaluate prior imaging the way human readers are. Although some algorithms have been trained to review previous mammograms and could improve on the recall rates observed in this study, the overall performance still warrants caution when considering AI as a solo reader.

“The high rate of abnormal interpretation by AI, if not monitored by human readers, would create a heavy workload for the assessment clinics, unsustainable for U.K. breast screening units,” Shama Puri, MD, a consultant radiologist with University Hospitals of Derby and Burton NHS Foundation Trust, and colleagues noted. “When monitored by human readers, it will lead to a significant increase in the number of consensus/arbitrations required to be done by the human readers. This carries the risk of unnecessary increase in the recall rates, particularly when arbitration/consensus is led by less experienced screen readers.”

The group signaled that, in its current form, the algorithm is “not safe” to use as a standalone reader, though they suggested that some refinement could improve its utility.


