This could ease strain on resources, free up time for more doctor-patient interaction, and assist in the development of tailored treatments. But it is important to keep in mind that this is still new technology, and experts are warning that the latest findings are based on only a small number of studies.
This comprehensive review of published studies suggests that machines are on par with humans, says co-author Prof Alastair Denniston of the University Hospitals Birmingham NHS foundation trust, who adds that the results are encouraging but serve as a reality check on some of the hype surrounding AI.
“There are a lot of headlines about AI outperforming humans, but our message is that it can at best be equivalent,” says Dr Xiaoxuan Liu, the lead author of the study, who is from the same NHS trust.
Some 20,000 AI studies were found in the initial search, but only 14 of those were studies in humans that reported good-quality data, tested the deep learning system with images from a dataset separate from the one used to train it, and showed the same images to human experts.
Results from these studies were pooled, revealing that deep learning systems correctly detected a disease state 87% of the time, compared with 86% for healthcare professionals, and correctly gave the all clear 93% of the time, compared with 91% for human experts. It was noted that the human experts in these scenarios were not given any additional patient information that could have steered their diagnosis.
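The two pooled figures correspond to what statisticians call sensitivity (how often disease is correctly detected) and specificity (how often the all clear is correctly given). As a rough illustration of the arithmetic, the sketch below computes both from counts of right and wrong calls. The counts are hypothetical, chosen only so that they reproduce the percentages quoted above; they are not taken from the review, which pooled results across its 14 studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of diseased cases that were correctly flagged as diseased."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of healthy cases that were correctly given the all clear."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts per 100 diseased and 100 healthy cases (assumption:
# picked to match the quoted percentages, not real study data).
ai = {"tp": 87, "fn": 13, "tn": 93, "fp": 7}
clinicians = {"tp": 86, "fn": 14, "tn": 91, "fp": 9}

ai_sens = sensitivity(ai["tp"], ai["fn"])
ai_spec = specificity(ai["tn"], ai["fp"])
doc_sens = sensitivity(clinicians["tp"], clinicians["fn"])
doc_spec = specificity(clinicians["tn"], clinicians["fp"])

print(f"AI:         sensitivity {ai_sens:.0%}, specificity {ai_spec:.0%}")
print(f"Clinicians: sensitivity {doc_sens:.0%}, specificity {doc_spec:.0%}")
```

Seen this way, the gap between machines and clinicians is one or two cases in every hundred, which fits the authors' message that the two are at best equivalent.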
“This excellent review demonstrates that the massive hype over AI in medicine obscures the lamentable quality of almost all evaluation studies. Deep learning can be a powerful and impressive technique, but clinicians and commissioners should be asking the crucial question: what does it actually add to clinical practice?” says Prof David Spiegelhalter, the chair of the Winton centre for risk and evidence communication at the University of Cambridge, who says that the field is awash with poor research.
Denniston remains optimistic about the potential of AI in healthcare, where it could act as a diagnostic tool and help tackle backlogs of imaging and scans. Deep learning systems may even prove useful in places that lack experts to interpret imaging. However, it is important to test these deep learning systems in clinical trials to assess whether they improve patient outcomes compared with current practice.
Deep learning systems could be an important part of healthcare in the future, but robust real-world testing is required, as is an understanding of why these systems sometimes make the wrong assessment. “If you are a deep learning algorithm, when you fail you can often fail in a very unpredictable and spectacular way,” says Dr Raj Jena, an oncologist at Addenbrooke’s hospital in Cambridge.