AI model trained on single-source dataset rivals multi-institution version
An AI program trained on MRI scans from a single institution yielded results comparable to those of one trained on a much larger dataset pulled from multiple sources. The findings are published in Clinical Imaging. [1]
In a study out of the University of Alabama at Birmingham, researchers aimed to assess the performance of an AI model in segmenting kidney and cyst regions in MR images of patients with autosomal dominant polycystic kidney disease (ADPKD). The team used the widely popular U-Net deep-learning architecture, proven useful in segmenting images, adding custom code from open-source AI and Python databases to finalize their model.
The custom U-Net model was then trained on two datasets for comparison: one from a single institution and another comprising 756 MRI images from four institutions.
When tested on images from its own institution, the U-Net trained on the smaller single-institution dataset segmented ADPKD kidney regions with accuracy comparable to that of the model trained on the larger multi-institutional dataset. However, its accuracy varied when it was tasked with segmenting images from institutions outside its training data.
“The single institutional data has less variations in MRI scanners/coils, imaging parameters, and image pre-processing techniques, allowing the model to offset the generality of a larger model for the specificity in a smaller model in segmenting early ADPKD,” the authors, led by Emma Schmidt of the University of Alabama at Birmingham, wrote. “As long as the imaging hardware and protocol remain consistent, the model trained with single institutional data may be effective for kidney segmentation in the MRI images of ADPKD patients obtained from the same institute.”
Evaluation using the Dice Similarity Coefficient (DSC) showed that, for kidney segmentation, the DSCs for the model trained with a single institutional dataset were only 1%-2% lower than those for the model trained with a multi-institutional dataset. In cyst segmentation, however, the DSCs for the single-institution model were 2%-20% lower than those for the multi-institutional model.
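For readers unfamiliar with the metric, the DSC measures how much a predicted segmentation overlaps a reference one: twice the number of pixels the two masks share, divided by the total number of pixels marked in both, giving a score from 0 (no overlap) to 1 (identical). The short Python sketch below illustrates that standard definition only. It is not the study's code; the function name, the toy masks, and the choice to score two empty masks as 1.0 are assumptions made for illustration.

```python
from typing import Sequence

# 2D binary mask: 1 = pixel belongs to the structure (e.g. kidney), 0 = background
Mask = Sequence[Sequence[int]]


def dice_similarity(predicted: Mask, reference: Mask) -> float:
    """Return DSC = 2 * |A and B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    overlap = 0  # pixels marked in both masks
    total = 0    # pixels marked in the first mask plus pixels marked in the second
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for p, r in zip(pred_row, ref_row):
            p, r = bool(p), bool(r)
            overlap += p and r
            total += p + r
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


# Hypothetical toy masks, not data from the study.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 1],
             [0, 0, 0]]

score = dice_similarity(predicted, reference)
print(f"DSC = {score:.2f}")  # DSC = 0.80
```

In the toy example the masks share two pixels, while the prediction marks three pixels and the reference marks two, so the score is 2 × 2 / (3 + 2) = 0.80.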
Despite performing well when looking at a narrow set of images, the authors said that AI trained on a larger dataset from multiple institutions may still be preferable.
“As the severity of the volume and number of cysts increase, models trained on significantly larger or multi-institutional datasets may be necessary to ensure the highest model accuracy and segmentation results,” the authors wrote, adding that the models “may need to be continuously updated to compensate for the intra-scanner variability.”
The authors warned that the custom U-Net model trained on a single institutional dataset “may not be reliable for accurate segmentation,” given its limitations.
“To improve the accuracy in cyst segmentation, a more extensive dataset should be employed and/or different deep learning algorithms may need to be explored,” the authors concluded.