For better and for worse, our healthcare system is built around physicians. For the most part, they’re the ones we rely on for diagnoses, for prescribing medications, and for delivering care. And, often, simply for being a comfort.
Unfortunately, in 2023, they’re still “only” human, and they’re not perfect. Despite best intentions, they sometimes miss things, make mistakes, or order ineffective or outdated care. The order of magnitude for these mistakes is not clear; one recent study estimated that 800,000 Americans suffer permanent disability or death annually as a result. Whatever the real number, we’d all agree it is too high.
Many, myself included, have high hopes that appropriate use of artificial intelligence (AI) might be able to help with this problem. Two new studies offer some considerations for what it might take.
The first study, from a team of researchers led by Damon Centola, a professor at the Annenberg School for Communication at the University of Pennsylvania, looked at the impact of “structured information–sharing networks among clinicians.” In other words, getting feedback from colleagues (which, of course, was once the premise behind group practices).
Long story short, they work, reducing diagnostic errors and improving treatment recommendations. Study co-author Elaine Khoong of UCSF says, “We are increasingly recognizing that clinical decision-making should be viewed as a team effort that includes multiple clinicians and the patient as well.” The researchers made sure that the structured network included clinicians of various ages, specialties, expertise, and geographical locations, trying to ensure that it was not simply a top-down, hierarchical network.
Professor Centola believes: “egalitarian online networks increase the diversity of voices influencing clinical decisions. As a result, we found that decision-making improves across the board for a wide variety of specialties.” Best of all, he notes:
The big risk with these information-sharing networks is that while some doctors may improve, there could be an averaging effect that would lead better doctors to make worse decisions. But, that’s not what happens. Instead of regressing to the mean, there is consistent improvement: The worst clinicians get better, while the best do not get worse.
The researchers think this approach could be easily adopted, building on existing e-consult technologies: “We anticipate, for instance, that instead of sending clinical cases to a single specialist, clinicians may instead submit cases to a network of specialists who participate in a structured information exchange process before providing a recommendation to the referring clinician.” Professor Centola points out that, while the networks need to be structured thoughtfully, they don’t have to be huge; in fact, a network of 40 is ideal. “The increasing returns above that - going, say, from 40 to 4,000 - are minimal,” he says.
It's worth pointing out that the anonymous clinicians in the structured networks were, in this case, human; an interesting follow-up would be to see what happens when some or even all of the recommendations come from AI.
That leads to the second study, by a team of researchers from MIT and Harvard, which looked at what happens when radiologists get assistance from AI. Long story short: not much.
As study co-author Professor Pranav Rajpurkar said in a lengthy Twitter thread: “Why? Radiologists implicitly discount AI predictions, favoring their own judgment - a bias we call ‘automation neglect.’”
The “automation neglect” comes from radiologists discounting the AI probabilities by around 30% relative to their own assessments. The radiologists also tended to view their recommendations and the AI predictions as independent, when, in fact, they are based on the same data.
The paper states: “We find that AI assistance does not improve human’s diagnostic quality even though the AI predictions are more accurate than almost two-thirds of the participants in our experiment.” To make things worse, “radiologists are slower when provided with AI assistance.”
Slower but not more accurate is not a winning combination, and definitely not what we might have expected.
The researchers are forced to conclude: “Our results demonstrate that, unless the documented mistakes can be corrected, the optimal solution involves assigning cases either to humans or to AI, but rarely to a human assisted by AI.” Professor Rajpurkar notes: “While AI holds promise, thoughtfully accounting for how humans actually use AI is critical. Our work provides concrete evidence on biases and costs that should inform system design.”
An open question the researchers posit is “whether the benefits from AI-specific training for radiologists and/or experience with AI are large.” I.e., can humans learn to work better with AI?
Given the results of the first study, I’d have been interested to see what would have happened if the second study had also tested getting recommendations not from AI but from a structured network of human physicians. Do radiologists discount only AI recommendations, or do they distrust external recommendations generally?
At the risk of giving it short shrift, a third study, from Fabrizio Dell’Acqua at the Harvard Business School, suggests that when AI is too good, humans tend to “fall asleep at the wheel,” leading him to conclude: “maximizing human/AI performance may require lower quality AI, depending on the effort, learning, and skillset of the humans involved.” There is a lot about human/AI interaction we do not yet understand.
-----------------
We’ve long looked at medicine as an “art,” allowing and even encouraging individual physicians to use their best judgement. That has led to well-documented variability of care and outcomes, much of which is not in patients’ best interests. There’s too much for physicians to know, there are too many extraneous factors influencing their decisions, and, at this point, there’s way too much money at stake. They need help.
In 2023, clinical decision-making should be, as Professor Khoong noted, a team effort. We now have the ability to build that team from human “egalitarian online networks,” as Professor Centola and his colleagues urge, and we increasingly will have the ability for such networks to include, or to be replaced by, AI. One way or the other, we need to “thoughtfully account” for how and when physicians use them.