How does AI TI-RADS compare to other thyroid nodule risk stratification systems?
A new artificial intelligence-based system for risk stratification of thyroid nodules on ultrasound could reduce radiologists’ workloads while also sparing patients unnecessary procedures.
Thyroid nodules are common imaging findings. Many require additional exams to rule out malignancy, yet very few are confirmed to be cancer. To address this issue, multiple medical organizations have come up with their own versions of Thyroid Imaging Reporting and Data Systems, or TI-RADS.
The American College of Radiology’s (ACR) version is considered the most accurate, though interobserver variability remains a challenge when classifying nodules with the system. Recently, the ACR modified its system with the help of an artificial intelligence algorithm (AI TI-RADS) to optimize performance.
“AI TI-RADS uses the exact same feature categories and descriptions as ACR TI-RADS (composition, echogenicity, shape, margin and echogenic foci), but changed point values for some individual features, and thus may result in a different overall score and a different recommendation for a given nodule,” corresponding author Jichen Yang, from the department of electrical & computer engineering at Duke University, and colleagues explained. “Several features were assigned zero points, ultimately allowing users to distinguish between fewer features.”
An initial assessment of the system showed that it could perform similarly to human radiologists. However, that test used a small sample from a single institution, so the system still needed to be validated on larger datasets.
In this latest analysis, experts tested AI TI-RADS on 378 thyroid nodules from 320 patients. Three radiologists scored features on ultrasound images according to the AI TI-RADS lexicon, and the resulting recommendations for fine-needle aspiration (FNA) under AI TI-RADS and ACR TI-RADS were then compared.
Across all readers, AI TI-RADS yielded slightly lower sensitivity than ACR TI-RADS (0.69 vs 0.72) but higher specificity (0.40 vs 0.37). With AI TI-RADS, all three readers assigned fewer points overall, including more values of zero for multiple features.
Specifically, the features “taller than wide” and “macrocalcifications” more often received values of zero, which designates them as lower risk and could contribute to the lower sensitivity observed in the study.
“These data highlight the importance of continued scrutiny of any classification scheme and the tradeoff between sensitivity and specificity,” the group suggested. “When evaluating a disease that is often indolent, increased specificity may be favorable to avoid unnecessary FNAs. However, a precise quantification of the potential tradeoff is crucial to make an informed decision.”
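One way to make that tradeoff concrete is to turn the reported sensitivity and specificity into expected counts for a cohort. The sketch below is only illustrative: the rates are the pooled figures reported above, but the cohort split of 100 malignant and 900 benign nodules is a hypothetical assumption (the article does not give the split), and the function name `expected_outcomes` is made up for this example.

```python
# Pooled sensitivity/specificity for FNA recommendations, as reported in the article.
REPORTED = {
    "ACR TI-RADS": {"sensitivity": 0.72, "specificity": 0.37},
    "AI TI-RADS": {"sensitivity": 0.69, "specificity": 0.40},
}

# HYPOTHETICAL cohort: the article does not state the malignant/benign split.
N_MALIGNANT = 100
N_BENIGN = 900


def expected_outcomes(sensitivity, specificity, n_malignant, n_benign):
    """Expected counts if FNA is recommended at the given operating point."""
    return {
        # Malignant nodules correctly recommended for FNA.
        "malignant_flagged": sensitivity * n_malignant,
        # Malignant nodules not recommended for FNA.
        "malignant_missed": (1 - sensitivity) * n_malignant,
        # Benign nodules recommended for FNA (unnecessary procedures).
        "benign_fnas": (1 - specificity) * n_benign,
        # Benign nodules spared an FNA.
        "benign_fnas_avoided": specificity * n_benign,
    }


results = {
    name: expected_outcomes(
        rates["sensitivity"], rates["specificity"], N_MALIGNANT, N_BENIGN
    )
    for name, rates in REPORTED.items()
}

for name, counts in results.items():
    print(
        f"{name}: {counts['malignant_missed']:.0f} malignant nodules missed, "
        f"{counts['benign_fnas']:.0f} benign FNAs"
    )
```

Under this assumed split, AI TI-RADS would leave about 3 more malignant nodules without an FNA recommendation while avoiding about 27 benign FNAs. The balance shifts with the true prevalence of malignancy, which is why the split has to be known before drawing conclusions.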
The group also highlighted the simplicity of the system, noting that it eliminates the need for radiologists to look for or become well versed in analyzing certain features, which saves them time during interpretations. AI TI-RADS also proposed assigning zero points to macrocalcifications, which contradicts ACR TI-RADS.
“The implication of this suggestion is that, according to the data on which AI TI-RADS was founded, macrocalcifications do not confer as much risk of malignancy as initially thought,” the authors explained. “As with many TI-RADS features, interobserver variability can hinder consistent interpretation, and eliminating the need to assign points to macrocalcifications could help improve the system.”
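The effect of zeroing a feature can be sketched with a small point table. This is a simplified illustration, not the published scoring chart: only a handful of features are listed, the ACR point values are assumptions that should be checked against the official ACR TI-RADS chart, and the variant table changes nothing except macrocalcifications (the one change the article describes), so it is not the full AI TI-RADS table. The names `ACR_POINTS`, `ZEROED_VARIANT` and `total_points` are hypothetical.

```python
# Point tables map (category, feature) -> points.
# ASSUMPTION: a partial table; values should be verified against the ACR chart.
ACR_POINTS = {
    ("composition", "solid"): 2,
    ("echogenicity", "hypoechoic"): 2,
    ("shape", "wider-than-tall"): 0,
    ("margin", "smooth"): 0,
    ("echogenic_foci", "none"): 0,
    ("echogenic_foci", "macrocalcifications"): 1,
}

# Illustrative variant: identical, except macrocalcifications score zero.
ZEROED_VARIANT = {**ACR_POINTS, ("echogenic_foci", "macrocalcifications"): 0}


def total_points(nodule, table):
    """Sum the points for one feature per category (a simplification)."""
    return sum(table[(category, feature)] for category, feature in nodule.items())


# A hypothetical nodule described with the five shared feature categories.
nodule = {
    "composition": "solid",
    "echogenicity": "hypoechoic",
    "shape": "wider-than-tall",
    "margin": "smooth",
    "echogenic_foci": "macrocalcifications",
}

acr_total = total_points(nodule, ACR_POINTS)
variant_total = total_points(nodule, ZEROED_VARIANT)
print(f"ACR-style total: {acr_total}, zeroed-variant total: {variant_total}")
```

Because both tables share the same feature descriptions, a reader's assessment of the nodule stays the same; only the total changes, and a lower total can move a nodule into a lower risk category with a different recommendation.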
Though the authors acknowledged that the system will need further validation, they suggested that their results, along with those of other recent studies, highlight the promise of AI TI-RADS to simplify the risk stratification of thyroid nodules in the future.