AI demo claims technology is as good as real GPs in diagnosis

GP leaders say apps/algorithms should never replace GPs

Adrian O'Dowd

Thursday, 28 June 2018

Technology firm Babylon Health has demonstrated* that its latest artificial intelligence (AI) appears to be as good as, if not better than, real GPs at diagnosing and providing health advice to patients.

Doctors’ leaders, however, have rejected these claims, saying that the tests were “over-simplistic and crude” and that apps and algorithms should never replace GPs.

In a presentation broadcast last night from London, Babylon – the company behind the NHS GP at Hand app – showed results of how its AI technology had been tested using elements from the MRCGP exams that related to diagnostics.

The company took a representative sample of questions from publicly available RCGP sources as well as independently published examination preparation materials, and mapped these to the current RCGP curriculum in order to ensure the questions resembled actual MRCGP questions as closely as possible.

Babylon said that the average pass mark over the past five years for real-life doctors was 72% and in sitting the exam for the first time, Babylon’s AI scored 81%.

As the AI continued to learn and accumulate knowledge, Babylon said it expected that subsequent testing would produce significant improvements in terms of results.

Given that doctors were presented with a much wider range of illnesses and conditions in their daily practice, Babylon’s team of scientists, clinicians and engineers wanted to test their AI’s capabilities further.

Therefore, they collaborated with the Royal College of Physicians, Stanford Primary Care and Yale New Haven Health to test Babylon’s AI alongside seven highly experienced primary care doctors using 100 independently devised symptom sets (or “vignettes”).

Babylon’s AI scored 80% for accuracy, while the seven doctors achieved an accuracy range of 64-94%, said the company.

The accuracy of the AI was 98% when assessed against conditions seen most frequently in primary care medicine. In comparison, when Babylon’s research team assessed experienced clinicians using the same measure, their accuracy ranged from 52-99%.

Dr Ali Parsa, Babylon’s founder and chief executive, said: “Even in the richest nations, primary care is becoming increasingly unaffordable and inconvenient, often with waiting times that make it not readily accessible.

“Babylon’s latest artificial intelligence capabilities show that it is possible for anyone, irrespective of their geography, wealth or circumstances, to have free access to health advice that is on a par with top-rated practising clinicians.”

Professor Martin Marshall, vice chair of the RCGP, said: “The potential of technology to support doctors to deliver the best possible patient care is fantastic, but at the end of the day, computers are computers, and GPs are highly trained medical professionals: the two can’t be compared and the former may support but will never replace the latter.

“No app or algorithm will be able to do what a GP does. An app might be able to pass an automated clinical knowledge test, but the answer to a clinical scenario isn’t always cut and dried: there are many factors to take into account, a great deal of risk to manage, and the emotional impact a diagnosis might have on a patient to consider.”

Professor Marshall added that the exam-preparation materials used by Babylon in this research would have been compiled for revision purposes and were not necessarily representative of the full range of questions and standard used in the actual MRCGP exam.

Dr Richard Vautrey, BMA GP committee chair, said: “AI may have a place in the tools doctors use to support and treat patients, but it cannot replace the essential elements of the doctor-patient relationship, which is at the heart of medicine.

“Medicine is far more than diagnosis, and is certainly more than an algorithm. Technology has great potential to provide tools that can assist clinicians and patients, but it will never replace the physical presence and interaction with a GP, who has skills developed over years of training and experience in the consultation room.

“This study represents an over-simplistic and crude measure of what it means to be a good doctor, which is about so much more than passing an exam.”

*Razzaki S, Baker A, Perov Y, et al. A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis. Babylon Health, June 2018.