- The researchers insist that major changes are needed before the AI models can be used in clinical settings.
- The AI models were trained on too few images to reliably support patient diagnosis.
Introduction:
As the global pandemic set in, startups like DarwinAI, major companies like Nvidia, and groups like the American College of Radiology launched initiatives to detect COVID-19 using CT scans, X-rays, and other forms of medical imaging. Such technology promised to help healthcare practitioners distinguish between pneumonia and COVID-19 and to offer more options for patient diagnosis. Researchers also developed models to predict whether a person will die or require a ventilator based on a CT scan. However, the researchers behind a new systematic review insist that big changes are needed before such forms of machine learning can be used in a clinical setting.
Examination and analysis of the papers:
Researchers examined more than 2,200 papers. After eliminating duplicates and irrelevant titles, they narrowed the results to 230 papers that underwent a full-text review for quality assessment. Finally, 62 papers qualified to be part of a systematic review of published research and preprints shared on open research paper repositories like arXiv, bioRxiv, and medRxiv. Of the 62 papers included in the analysis, roughly half did not attempt external validation of training data, did not assess model sensitivity or robustness, and did not report the demographics of the people represented in the training data.
“Frankenstein” datasets, ones assembled from duplicate images obtained from other datasets, also proved to be a common problem. Only one in five COVID-19 diagnosis or prognosis models shared their code so that others could reproduce the results claimed in the literature.
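Byte-identical duplicates of the kind found in these “Frankenstein” sets can be flagged before training with a simple content hash. The sketch below is a minimal, hypothetical illustration of that idea (the directory names `dataset_a` and `dataset_b` and the `.png` extension are assumptions for the example, not details from the review):

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Return a SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def find_cross_dataset_duplicates(dataset_dirs):
    """Map each content digest to every file that shares it across datasets."""
    seen = {}
    for root in map(Path, dataset_dirs):
        for image_path in root.rglob("*.png"):
            seen.setdefault(file_digest(image_path), []).append(image_path)
    # Keep only digests seen more than once, i.e. duplicated images.
    return {h: paths for h, paths in seen.items() if len(paths) > 1}

if __name__ == "__main__":
    duplicates = find_cross_dataset_duplicates(["dataset_a", "dataset_b"])
    for digest, paths in duplicates.items():
        print(digest[:12], "->", [str(p) for p in paths])
```

Note that a content hash only catches exact copies; images that were re-encoded or resized between datasets would need perceptual hashing or similar fuzzy matching to detect.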
“In their current reported form, none of the machine learning models included in this review are likely candidates for clinical translation for the diagnosis/prognosis of COVID-19,” the paper reads.
Developing machine learning models for COVID-19:
“Despite the huge efforts of researchers to develop machine learning models for COVID-19 diagnosis, we found methodological flaws and many biases throughout the literature, leading to highly optimistic reported performance.”
The research was published last week as part of the March issue of Nature Machine Intelligence by researchers from the University of Cambridge and the University of Manchester. Among the common issues they found with machine learning models developed using medical imaging data were virtually no assessment for bias and training on too few images. Publicly available datasets also suffered from low-quality image formats and were not large enough to train reliable AI models. The researchers used the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and the radiomics quality score to assess the datasets and the models.
Statements from AI researchers and healthcare professionals:
“The urgency of the pandemic led to many studies using datasets that contain obvious biases or are not representative of the target population. Before evaluating a model, authors must report the demographic statistics for their datasets, including age and sex distributions,” the paper reads.
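Reporting such statistics is straightforward once patient-level metadata is available. The sketch below, using hypothetical column names and made-up values rather than anything from the reviewed studies, shows one way to tabulate age and sex distributions with pandas:

```python
import pandas as pd

# Hypothetical patient-level metadata for a training set; column names and
# values are assumptions for illustration only.
metadata = pd.DataFrame({
    "age": [34, 61, 47, 72, 55, 29, 68, 50],
    "sex": ["F", "M", "F", "M", "M", "F", "M", "F"],
    "label": ["covid", "covid", "normal", "covid",
              "normal", "normal", "covid", "normal"],
})

# Age distribution: summary statistics overall and per diagnostic label.
print(metadata["age"].describe())
print(metadata.groupby("label")["age"].agg(["mean", "std", "min", "max"]))

# Sex distribution as proportions, overall and per label.
print(metadata["sex"].value_counts(normalize=True))
print(metadata.groupby("label")["sex"].value_counts(normalize=True))
```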
“Higher-quality datasets, manuscripts with sufficient documentation to be reproducible and external validation are required to increase the likelihood of models being taken forward and integrated into future clinical trials to establish independent technical and clinical validation as well as cost-effectiveness.”
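External validation here means scoring a trained model on data collected entirely outside the development cohort, rather than on a held-out split of the same data. The following is a minimal sketch of that distinction using synthetic arrays and scikit-learn; the data, model choice, and metric are illustrative assumptions, not the setup of any reviewed study:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for a development cohort and an independent external cohort.
X_dev, y_dev = rng.normal(size=(500, 20)), rng.integers(0, 2, 500)
X_ext, y_ext = rng.normal(loc=0.3, size=(200, 20)), rng.integers(0, 2, 200)

# Internal validation: a held-out split from the same cohort.
X_train, X_test, y_train, y_test = train_test_split(
    X_dev, y_dev, test_size=0.2, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("internal AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# External validation: the same frozen model scored on the outside cohort.
# Optimistic internal results often fail to carry over to this step.
print("external AUC:", roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1]))
```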
Suggestions from the AI researchers:
The AI researchers and healthcare professionals shared other recommendations as well, including ensuring the reproducibility of model performance results spelled out in research papers and considering how datasets are assembled. In other news at the intersection of COVID-19 and machine learning, earlier this week the Food and Drug Administration (FDA) granted emergency authorization to a machine learning-based screening device, the first such device approved in the U.S.