VIDEO: Validation monitoring for radiology AI to ensure accuracy

"Radiologists want to know AI will be safe and effective for their patients. In other words they want to make sure its trust worthy," Allen explains in the interview with Health Imaging.

He said there are lingering concerns that medical imaging AI algorithms may have been validated on datasets that were too small, limited to single-center evaluations using specific scanners and settings, or lacking diversity. Allen said his hospital in Alabama has a much different patient population, with a very high rate of obesity and metabolic issues, than a hospital in Utah or on the West Coast.

"It's not surprising that an algorithm performs well in a test environment," Allen explained.

He said AI vendors will tell you their products perform better than average on quality benchmarks based on their own testing. "Well, that's great, but come and see what will work at our practice in Alabama, where we have rampant hypertension and obesity," Allen said.

He explained there are no standard industry pathways to validate AI and offer assurances that it will work and be trustworthy in all settings. Patient populations vary between males and females, and by region, ethnicity and socioeconomic level, he said.

"We are recommending to the end users that they evaluate the models using their own patient data and by using an enriched dataset," Allen explained. "We want them to put in the hard cases so they can say 'OK, the AI found this one and this was a case that was hard for me.' Equally important is the real-world monitoring."

He said data drift, changes in imaging equipment, scanner settings and software updates can all have an impact on AI software. So, like the scanners themselves, or even the PACS monitors used to read studies, the AI needs QA to ensure it is working properly. This is especially important when the AI is relied on for critical findings, computer-aided detection (CAD), modifying images for things like bone removal, or performing complex quantification used to determine patient treatments or next steps.

"The AI is always going to work best the day you plug it in, and then there is going to be some performance degradation over time. So keeping track of that and understanding why it breaks is important," Allen said. 

The ACR created the ACR Assess-AI Registry and its AI-Lab website to help health systems track AI performance and find the cause of any issues if the AI does not perform as intended, Allen explained.

Like the medical imaging radiation Dose Index Registry and other registries created by the ACR, he said, these are tools to help set baselines and monitor quality over time. And not just for one site or health system, but for all of radiology, because the pooled data will help identify problems and solutions over time.

Allen is also a radiologist at the Birmingham Radiological Group and the radiology residency program director at Brookwood Baptist Health in Birmingham. He practices at Grandview Medical Center in Birmingham, Alabama.

Dave Fornell has covered healthcare for more than 17 years, with a focus on cardiology and radiology. Fornell is a five-time winner of the Jesse H. Neal Award, one of the most prestigious editorial honors in specialized journalism. The wins included best technical content, best use of social media and best COVID-19 coverage. Fornell was also a three-time Neal finalist for best range of work by a single author. He produces more than 100 editorial videos each year, most of them interviews with key opinion leaders in medicine. He also writes technical articles, covers key trends, conducts video hospital site visits, and is very involved with social media. E-mail: dfornell@innovatehealthcare.com
