Lack of transparency in AI research limits reproducibility, renders work 'worthless'
A lack of transparency in artificial intelligence research can make the work difficult to apply in the real world, rendering it "worthless" when the results, no matter how positive, are not reproducible.
A new analysis recently shared in Academic Radiology found that a significant number of studies do not provide information pertaining to their raw data, source code or model. As a result, up to 97% of these studies do not produce systems that are fit for use in real-world clinical scenarios, according to the experts' data.
Corresponding author Burak Kocak, MD, of Basaksehir Cam and Sakura City Hospital in Turkey and colleagues explained that because code and data are so intertwined in AI research, both should be made more readily available when studies are published.
"Through this, a wider scientific community can build upon and improve the original work. Otherwise, the contribution of the researchers to the AI field will be ineffective, simply being no more than 'show-and-tell' projects," the authors suggested.
Kocak and colleagues included 194 radiology and nuclear medicine research papers on AI in their analysis. Raw data was available for around 18% of these papers, but private data was accessible in just a single paper from the entire batch. The AI studies also lacked modeling information, with only around one-tenth of them sharing their pre-modeling, modeling and post-modeling files.
The authors attributed the nearly non-existent availability of private data to the regulatory hurdles that must be overcome to address privacy concerns, acknowledging that it can be a tedious process.
“It is time-consuming and labor-intensive [if], for instance, an institutional review board approval is required. In addition, there can be adverse consequences related to patient privacy, financial investments, and ownership of the intellectual property if the data sharing is not done properly,” they explained.
There are, however, solutions that address privacy concerns when data cannot be made publicly available, such as granting access to designated independent investigators for validation purposes, the experts suggested, adding that similar approaches can be taken when sharing modeling code as well.
The authors also suggested that manuscript authors, peer reviewers and journal editors could play a role in making future AI studies reproducible by being more cognizant of transparency and of data, code and model availability when publishing research results.
The study abstract is available in Academic Radiology.