This appeared last week:
Explainer:
Thinking through the safe use of AI
Mayo Clinic's Dr. Sonya Makhni answers key questions on
creating and delivering artificial intelligence, inherent bias and the
importance of risk classification systems.
By Andrea Fox
February 26, 2024 03:03 PM
CMS is concerned that biased AI could exacerbate
discrimination, and says algorithms alone cannot be used as a basis to deny
hospital admission for Medicare Advantage patients.
As the use of artificial intelligence expands
across healthcare, there are plenty of justifiable worries about what looks to
be a new normal built around this powerful and
fast-changing technology. There are labor concerns, widespread distress
over fairness, ethics and equity – and, perhaps for some,
fear of a dystopian future where intelligent machines grow too powerful.
But the promise of AI and machine learning is
also enormous: predictive analytics could mean better health outcomes for
individuals and potentially game-changing population health advancements while moving the needle on costs.
Finding a regulatory balance that capitalizes on
the good and protects against the bad is a big challenge.
Government and healthcare leaders are more
adamant than ever about addressing racial bias, protecting safety and "getting it
right." Getting it wrong could harm patients, erode trust and
potentially create legal liabilities for healthcare organizations.
We spoke with Dr. Sonya Makhni, medical director
of the Mayo Clinic Platform and senior associate consultant for the
Department of Hospital Internal Medicine, about recent developments with
healthcare AI and discussed some of the key challenges of tracking performance,
generalizability and clinical validity.
Makhni, an expert on AI who is speaking at the HIMSS24
Global Conference & Exhibition, explained how healthcare AI models
should be assessed for use, offering readmissions AI as one example
of why understanding a specific model's performance matters.
Q. What does it mean
to deliver an AI solution in general?
A. An
AI solution is more than just an algorithm – the solution also includes
everything you need to make it work in a real workflow. There are a few key
phases to consider when developing and delivering an AI solution.
First is the algorithm design and development
phase. During this phase, solution developers should work closely with clinical
stakeholders to understand the problem to be solved and the data that is
available.
Next, the solution developers can start the
process of algorithm development, which itself comprises many steps such as
data procurement and preprocessing, model training and model testing (among
several other important steps).
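To make those steps concrete, here is a minimal sketch of a single training-and-testing pass, assuming a tabular readmissions-style dataset and scikit-learn. The file name, column names and model choice are all hypothetical illustrations, not anything specified in the interview.

```python
# Minimal sketch of data procurement, preprocessing, model training and
# model testing. All file and column names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Data procurement and preprocessing (illustrative only).
df = pd.read_csv("cohort.csv")  # hypothetical source file
df = df.dropna(subset=["age", "prior_admits", "readmitted_30d"])

X = df[["age", "prior_admits"]]  # predictors
y = df["readmitted_30d"]         # outcome variable

# Model training and model testing on a held-out split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Internal test AUC:",
      roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```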
Following algorithm development, AI solutions
need to be validated on third-party data, ideally by an
independent party. An algorithm that performs well on the initial dataset may
perform differently on a different dataset that represents different population
demographics. External validation is a key step in understanding an algorithm’s
generalizability and bias and should be completed for all clinical AI
solutions.
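One way to picture external validation, as a sketch: score an already-fitted model on a dataset drawn from a different site or population and compare its discrimination against the internal result. The function name, file name and columns below are assumptions for illustration.

```python
# Illustrative external-validation check. `fitted_model`, the file name
# and the column names are all hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

def external_auc(fitted_model, path="external_site_cohort.csv"):
    ext = pd.read_csv(path)
    X_ext = ext[["age", "prior_admits"]]
    y_ext = ext["readmitted_30d"]
    # A large drop relative to the internal test AUC signals that the
    # model may not generalize to this population.
    return roc_auc_score(y_ext, fitted_model.predict_proba(X_ext)[:, 1])
```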
Solutions also should be tested in clinical
workflows, and this can be accomplished through pilot studies, prospective
studies, and trials – and through ongoing real-world evidence studies.
Once an AI solution has been assessed for
performance, generalizability, bias and clinical validity, we can start to
think about how to integrate the algorithm into real clinical workflows. This
is a critical and challenging step, and requires significant
consideration.
Clinical workflows are heterogeneous across
health systems, clinical contexts, specialties and even end-users. It is
important that prediction outputs are communicated to end-users at the right
time, for the right patient, and in the right way. For example, if every AI
solution required the end-user to navigate to a different external digital
workflow, these solutions may not experience widespread adoption. Suboptimal
integration into workflows may even perpetuate bias or worsen clinical
outcomes.
It is important to work closely with clinical
stakeholders, implementation scientists, and human-factors specialists if
possible.
Finally, a solution must be monitored and
refined for as long as the algorithm is in deployment. The performance of
algorithms can change over time, and it is critical that AI solutions are
periodically (or in real time) assessed for both mathematical performance and
clinical outcomes.
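One simple form such monitoring could take, as an assumption rather than a description of any particular vendor's tooling: recompute a performance metric on each new window of labeled outcomes and alert when it degrades past a threshold.

```python
# Hypothetical drift check: compare each period's AUC against a baseline
# measured at validation time and alert on degradation.
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.78  # hypothetical AUC measured during validation
ALERT_DROP = 0.05    # alert if performance falls this far below baseline

def check_window(y_true, y_scores):
    """y_true/y_scores: outcomes and model scores from the latest period."""
    auc = roc_auc_score(y_true, y_scores)
    if auc < BASELINE_AUC - ALERT_DROP:
        print(f"ALERT: AUC {auc:.3f} drifted below baseline {BASELINE_AUC:.3f}")
    return auc
```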
Q. What are the
points in the development of AI that can allow bias to creep in?
A. If
leveraged effectively, AI can improve or even transform the way we diagnose and
treat diseases.
However, assumptions and decisions are made
during each step of the AI development life cycle, and, if incorrect, these
assumptions can lead to systematic errors. Such errors can skew the end result
of an algorithm against a subgroup of patients and ultimately pose risks to
healthcare equity. This phenomenon has been demonstrated in existing algorithms
and is referred to as algorithmic bias.
For example, if we are designing an algorithm
and choose an outcome variable that is inherently biased, then we may
perpetuate bias through the use of this algorithm. Or, decisions made during
the data-preprocessing step might unintentionally negatively impact certain
subgroups. Bias can be introduced and/or propagated during every phase,
including deployment. Involving key stakeholders can help mitigate the risks
and unintended impacts caused by algorithmic bias.
It is likely that almost all AI algorithms
exhibit bias.
This does not mean such an algorithm can't be
used. It does highlight the importance of transparency about where
the algorithm is biased. An algorithm may perform well in one population and
poorly in another. It can and should still be used in the
former, as it may improve outcomes; it should not, however, be used for the
population in which it performs poorly.
Biased algorithms can still be useful, but only
if we understand where it is, and is not, appropriate to use them.
At Mayo Clinic Platform, we have developed a
tool to validate algorithms and perform quantitative bias assessments so that
we can help end-users better understand how to safely and appropriately use AI
solutions in clinical care.
Q. What do AI users
have to think through when they use tools like readmission AI?
A. Users
of AI algorithms should use the AI development life cycle as a framework to
understand where bias may potentially be introduced.
Ideally, users should be aware of the
algorithm's predictors and outcome variable. This may be more
challenging with more complex algorithms, however. Understanding
the variables used as inputs and outputs of an algorithm can help end-users
detect erroneous or problematic assumptions. For example, an outcome variable
may be chosen that is itself biased.
End-users should also understand the training
population used during model development. The AI solution may have been trained
on a population that is not representative of the population where the model is
to be applied. This may be an indication to be cautious about the model's
generalizability. To that end, users should understand how well the algorithm
performed during development and if the algorithm was externally
validated.
Ideally, all algorithms should undergo a bias
assessment – quantitative and qualitative. This can help users understand
mathematical performance in different subgroups that vary by race, age, gender,
etc. Qualitative bias assessments conducted by solution developers can help
alert users to situations that may arise in the future as a result of potential
algorithmic bias. Knowledge of these scenarios can help users better
monitor and mitigate unintentional inequities in performance.
Readmission AI solutions should be assessed on
similar factors.
Specifically, users should understand if there
are certain subgroups where performance varies. These subgroups could consist
of patients of different demographics, or even of patients with different
diagnoses. This will help clinicians evaluate if and when the model’s predicted
output is most appropriate and reliable.
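A quantitative check in the spirit described above might compare a readmission model's discrimination across subgroups. This is a minimal sketch, not the Mayo Clinic Platform tool; column names and the subgroup variable are assumptions.

```python
# Hypothetical subgroup bias check: per-group AUC for a readmission model.
# Assumes df holds model scores ('score'), outcomes ('readmitted_30d') and
# a subgroup column, and that each subgroup contains both outcome classes.
import pandas as pd
from sklearn.metrics import roc_auc_score

def subgroup_auc(df: pd.DataFrame, group_col: str = "race") -> pd.Series:
    return df.groupby(group_col).apply(
        lambda g: roc_auc_score(g["readmitted_30d"], g["score"])
    )

# Large gaps between subgroups (by race, age band, gender or diagnosis)
# flag populations where the model's predictions may be less reliable.
```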
Q. How do you think
about AI risk and risk management?
A. Commonly,
we think about risk as operational and regulatory risk. These pieces relate to
how a digital health solution adheres to privacy, security and regulatory laws,
and they are critical to any assessment.
We should also begin to consider clinical risk as
well.
In other words, we should consider how an AI
solution may impact clinical outcomes and what the potential risks are if an
algorithm is incorrect or biased or if actions taken on an algorithm are
incorrect or biased.
It is the responsibility of both the solution
developers and the end-users to frame an AI solution in terms of risk to the
best of their abilities.
There are likely many ways of doing this, and
Mayo Clinic Platform has developed our own risk classification system to help
accomplish it: AI solutions undergo a qualification process before
external use.
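One illustrative way such a classification could be framed, to make the idea concrete (this is an assumption, not Mayo Clinic Platform's actual qualification process): tier a solution by how directly its output drives clinical action.

```python
# Hypothetical clinical-risk tiering sketch.
from enum import Enum

class RiskTier(Enum):
    LOW = "informational only; clinician reviews all outputs"
    MEDIUM = "influences decisions; requires clinician sign-off"
    HIGH = "drives decisions; requires validation, bias audit, monitoring"

def classify(drives_treatment: bool, clinician_in_loop: bool) -> RiskTier:
    if drives_treatment and not clinician_in_loop:
        return RiskTier.HIGH
    if drives_treatment:
        return RiskTier.MEDIUM
    return RiskTier.LOW
```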
Q. How can
clinicians and health systems engage in the process of creating and delivering
AI solutions?
A. Clinicians
and solution developers should work together collaboratively throughout the AI
development life cycle and through solution deployment.
More here:
https://www.healthcareitnews.com/news/explainer-thinking-through-safe-use-ai
I found this a useful article, well worth a read.
David,