Traditional methods continue to outperform AI in some orthopedic scenarios
AI algorithms have proven themselves beneficial in predicting outcomes based on imaging, but a new meta-analysis suggests that when it comes to hip fractures, these tech tools do not always live up to their hype.
A new paper published in JAMA on March 17 details experts’ review of 39 studies involving AI algorithms said to identify hip fractures and predict postoperative outcomes using X-rays alone. Although the analysis found the algorithms to be effective at detecting these fractures, their performance in predicting patient-specific outcomes did not earn the researchers’ endorsement over traditional statistical models [1].
“The performance of AI in diagnosing hip fractures was comparable with that of expert radiologists and surgeons,” wrote corresponding author Johnathan R. Lex, MB, ChB, of the Division of Orthopaedic Surgery at the University of Toronto, and colleagues. “However, current implementations of AI for outcome prediction do not seem to provide substantial benefit over traditional multivariable predictive statistics.”
For the meta-analysis, experts compared traditional statistical models used to predict surgical outcomes with various machine learning (ML) models applied to the same out-of-sample dataset. Mortality and length of hospital stay were the most commonly predicted outcomes in the included studies.
For fracture detection, ML models performed as well as or better than human readers, and they frequently improved readers’ performance when used as an assistive tool. However, when it came to predicting postoperative outcomes, the ML models did not offer additional benefits over traditional prediction methods: 60% of relevant studies reported no significant difference in performance between the two approaches.
The analysis included only studies that used plain radiographs, a possible limitation, the authors noted. While hip fractures are often diagnosed via X-ray, readers commonly overlook nondisplaced fractures when relying on this method alone. The group explained that most AI models are trained to maximize sensitivity and specificity; they suggested that placing more emphasis on avoiding false negatives could do more to reduce missed diagnoses on plain radiographs.
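To make the trade-off concrete, here is a minimal Python sketch of how weighting false negatives more heavily can shift a model's decision threshold. It is an illustration only, not the method used in the paper or in any study it reviewed: the scores, labels, function names, and the 5-to-1 cost ratio are all hypothetical.

```python
# Hypothetical model scores (higher = more likely fractured) and true labels
# (1 = fracture, 0 = no fracture). Illustrative values only, not study data.
SCORES = [0.95, 0.85, 0.70, 0.30, 0.60, 0.50, 0.40, 0.20, 0.10, 0.05]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called 'fracture'."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def pick_threshold(scores, labels, fn_cost=1.0, fp_cost=1.0):
    """Pick the observed score that minimizes fn_cost * FN + fp_cost * FP.

    Ties go to the lowest threshold, which favors sensitivity.
    """
    def weighted_cost(threshold):
        _, fp, _, fn = confusion_counts(scores, labels, threshold)
        return fn_cost * fn + fp_cost * fp

    return min(sorted(set(scores)), key=weighted_cost)


# Equal costs: the threshold settles at 0.70 and misses the low-scoring fracture.
balanced = pick_threshold(SCORES, LABELS)
print(balanced, sensitivity_specificity(SCORES, LABELS, balanced))

# Assumed 5x penalty on missed fractures: the threshold drops to 0.30,
# catching every fracture at the price of more false alarms.
cautious = pick_threshold(SCORES, LABELS, fn_cost=5.0)
print(cautious, sensitivity_specificity(SCORES, LABELS, cautious))
```

In this toy data, the fracture scored at 0.30 stands in for a subtle nondisplaced fracture. With equal costs it is missed (sensitivity 0.75, specificity 1.0); with the heavier penalty on misses it is caught (sensitivity 1.0, specificity 0.5). How much specificity is worth giving up is a clinical judgment that the sketch does not address.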
The potential for AI to help diagnose hip fractures in the future is “promising,” but more research with higher-quality data is needed first, the authors noted.