AI reveals more variation in free-text than standardized radiology reports
A natural language processing and machine learning-based algorithm can evaluate variation between radiologists' reports and compare that variation between radiologists using highly structured versus free-text reporting, according to research published Oct. 9 in Current Problems in Diagnostic Radiology.
Lead author Lane F. Donnelly, MD, chief quality officer at Lucile Packard Children's Hospital Stanford in California, and colleagues used the algorithm to evaluate more than 28,000 radiology reports for verbosity, observational terms, unwarranted negative findings and repeated language. Additionally, a standardized template report for an ultrasound examination and a free-text report for a chest radiograph were analyzed and compared.
“For each metric, the mean and standard deviation for defined outlier results for all dictations (individual and group mean) was calculated. The mean number of outlier metrics per reader per study was calculated and compared between radiologists and between the two report types,” according to the researchers, adding that the radiologists were also ranked based on the number of metrics they identified in each study.
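As a rough illustration of the calculation the researchers describe, the sketch below computes a group mean and standard deviation for each metric, flags outlier values, and averages the number of outlier metrics per report for each reader. It is not the study's algorithm: the metric names, the input layout, the two-standard-deviation cutoff and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical metric names, loosely following the four areas named in the article.
METRICS = ("word_count", "observational_terms", "negative_findings", "repeated_phrases")


def outlier_counts_per_reader(reports, z_threshold=2.0):
    """Return {reader: mean number of outlier metrics per report}.

    `reports` is a list of dicts with a "reader" key plus one numeric value
    per name in METRICS. A value counts as an outlier when it lies more than
    `z_threshold` group standard deviations from the group mean. That cutoff
    is an assumption, not the study's definition.
    """
    # Group mean and standard deviation for each metric across all reports.
    stats = {}
    for m in METRICS:
        values = [r[m] for r in reports]
        stats[m] = (mean(values), pstdev(values))

    # Count how many metrics are outliers in each report, grouped by reader.
    per_reader = defaultdict(list)
    for r in reports:
        n_outliers = 0
        for m in METRICS:
            mu, sigma = stats[m]
            if sigma > 0 and abs(r[m] - mu) > z_threshold * sigma:
                n_outliers += 1
        per_reader[r["reader"]].append(n_outliers)

    return {reader: mean(counts) for reader, counts in per_reader.items()}


def rank_readers(mean_outliers):
    """Order readers from fewest to most outlier metrics per report."""
    return sorted(mean_outliers, key=mean_outliers.get)
```

Running the same functions separately on the structured ultrasound reports and the free-text chest radiograph reports would give one set of per-reader averages for each report type, which could then be compared.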
The researchers found that metric values and variability were greater in free-text reports than in structured ones. Variability in reports, the researchers added, may hinder effective communication in radiology departments and hospitals and negatively impact patient care.
“This study demonstrates that natural language processing and machine learning algorithms can be used to evaluate significant volumes of radiology reports for metrics which could be used for tasks such as quality control, teaching, and as feedback and learning materials for practicing radiologists,” Donnelly et al. wrote. “This study also demonstrates and confirms that there is high variability in radiologist dictation styles based on the parameters evaluated.”
Although the study suggested that standardized reports may improve communication compared with free-text dictation reports, the researchers noted that a high degree of variability in radiology reporting persists even when standardized templates are used.
“Future studies should evaluate whether the providing of this type of data, comparing an individual radiology provider to their peers would be helpful to altering outlier behavior and driving further standardization,” the researchers wrote.