Natural language processing can limit report discrepancies between AI and radiologists

Neither radiologists nor artificial intelligence algorithms have a perfect track record when it comes to diagnostic accuracy, but quality assurance measures can help bridge the gap when the two disagree.

A new paper in the Journal of the American College of Radiology details how one radiology department integrated natural language processing (NLP) software into its workflow to resolve inadvertent discordance between physicians and an AI decision support system (AI DSS). The software flagged certain CT exams when radiologists’ findings differed from those of the AI DSS or when radiologists did not engage with decision support at all. While the NLP software rarely detected discordance between radiologists and the AI DSS, it did uncover missed diagnoses on some high-acuity CT scans that would have been consequential for patients had the findings gone unidentified.

The team highlighted the importance of understanding radiologists’ uptake of AI DSS software and how it affects clinical workflows. 

“Once implemented in clinical practice, quality assurance and monitoring processes need to be embedded into the AI augmented radiology clinical workflow,” corresponding author M. Chekmeyan, with the Department of Radiology at UMass Chan Medical School, UMass Memorial Medical Center, and colleagues explained. “There is a limited body of literature comprised of studies that detail individual AI QA workflows applied in a retrospective fashion but a paucity of literature for real world guidance on what this looks like when prospectively initiated, on an institutional level.” 

For their study, the group included all high-acuity CT scans performed at their institution over a 2.5-year period. Both radiologists and the AI DSS interpreted the images for intracranial hemorrhage, cervical spine fracture and pulmonary embolus. Scans were flagged for QA when they were reported as negative by a radiologist, carried a high AI DSS probability of a positive finding, and were read without any engagement with decision support (a simple sketch of this rule follows below).
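The paper does not publish its implementation, but the flagging criteria as described amount to a conjunction of three conditions. The Python sketch below illustrates that logic only; the names (CtExam, flag_for_qa) and the 0.8 probability threshold are hypothetical assumptions, not the authors’ actual system, which applied commercial NLP software to the text of radiology reports.

```python
# A minimal, hypothetical sketch of the QA flagging rule described in the study.
# CtExam, flag_for_qa and the 0.8 threshold are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class CtExam:
    report_negative: bool   # NLP-derived: the radiologist's report calls the finding absent
    ai_probability: float   # AI DSS probability of a positive finding (0.0 to 1.0)
    dss_engaged: bool       # whether the reader engaged with the decision support output


def flag_for_qa(exam: CtExam, threshold: float = 0.8) -> bool:
    """Flag an exam for quality-assurance review when the report is negative,
    the AI DSS assigns a high probability of a positive finding, and the
    reader never engaged with the decision support result."""
    discordant = exam.report_negative and exam.ai_probability >= threshold
    return discordant and not exam.dss_engaged


# Example: a head CT read as negative while the AI DSS strongly suspected
# an intracranial hemorrhage, with no reader engagement -> flagged for QA.
print(flag_for_qa(CtExam(report_negative=True, ai_probability=0.93, dss_engaged=False)))  # True
```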

Of 111,674 scans, the workflow uncovered missed diagnoses just 0.02% of the time. A total of 12,412 CTs were prioritized as positive by the AI DSS, 0.04% of which were discordant, unengaged and flagged for QA; of those, 57% were true positives.

In 85% of discordant cases, report addenda were issued and communicated within 24 hours of flagging.

While discrepancies between radiologists and the AI DSS were rare in this study, the authors maintained that the NLP-based software’s ability to rapidly resolve discordance could limit the potential for consequential missed diagnoses on high-acuity exams.



In addition to her background in journalism, Hannah also has patient-facing experience in clinical settings, having spent more than 12 years working as a registered rad tech. She joined Innovate Healthcare in 2021 and has since put her unique expertise to use in her editorial role with Health Imaging.
