(Press-News.org) OAK BROOK, Ill. – Radiologists, computer scientists and informaticists outline pitfalls and best practices to mitigate bias in artificial intelligence (AI) models in an article published today in Radiology, a journal of the Radiological Society of North America (RSNA).
“AI has the potential to revolutionize radiology by improving diagnostic accuracy and access to care,” said lead author Paul H. Yi, M.D., associate member (associate professor) in the Department of Radiology and director of Intelligent Imaging Informatics at St. Jude Children’s Research Hospital in Memphis, Tennessee. “However, AI algorithms can sometimes exhibit biases, unintentionally disadvantaging certain groups based on age, sex or race.”
While awareness of this issue is growing, evaluating and measuring algorithmic bias remains challenging.
In the article, Dr. Yi and colleagues identify key areas where pitfalls occur, as well as best practices and initiatives that should be undertaken.
“Despite the significant attention this topic receives, there’s a notable lack of consensus on key aspects such as statistical definitions of bias, how demographics are categorized, and the clinical criteria used to determine what constitutes a ‘significant’ bias,” Dr. Yi said.
The first such pitfall is the lack of representation in medical imaging datasets. Datasets are essential for the training and evaluation of AI algorithms and can comprise hundreds of thousands of images from thousands of patients. Many of these datasets lack demographic information, such as race, ethnicity, age and sex.
For example, in a previous study performed by Dr. Yi and colleagues, of 23 publicly available chest radiograph datasets, only 17% reported race or ethnicity.
To create datasets that are better representations of the wider population, the authors suggest collecting and reporting as many demographic variables as possible, with a suggested minimum set that includes age, sex and/or gender, race and ethnicity. Also, whenever feasible, raw imaging data should be collected and shared without institution-specific post-processing.
The second major issue with bias in AI is the lack of consensus on definitions of demographic groups. This is a challenge because many demographic categories, such as gender or race, are not biological variables but self-identified characteristics that can be informed by society or lived experiences.
The authors note that one solution would be to use more specific demographic terminology that better aligns with societal norms and to avoid combining separate but related demographic categories, such as race and ethnicity or sex and gender.
The final major pitfall is the statistical evaluation of AI biases. At the root of this issue is establishing consensus on the definition of bias, which can have different clinical and technical meanings. In this article, bias is used in the context of demographic fairness and how it reflects differences in metrics between demographic groups.
Once a standard notion of bias is established, the incompatibility of fairness metrics needs to be addressed. Fairness metrics are tools that measure whether a machine learning model treats certain demographic groups differently. The authors stress that there is no universal fairness metric that can be applied to all cases and problems.
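The incompatibility of fairness metrics can be illustrated with a minimal sketch. It is not taken from the Radiology article: the two groups ("A" and "B"), the toy labels and the function name group_rates are all hypothetical, and the two metrics shown (selection rate, used for demographic parity, and true positive rate, used for equal opportunity) are common textbook choices rather than ones the authors prescribe. When disease prevalence differs between groups, even a classifier that is right on every case satisfies one metric while violating the other.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate and true positive rate (hypothetical helper)."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "true_pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["pred_pos"] += p      # cases flagged positive by the model
        c["pos"] += t           # cases that truly have the finding
        c["true_pos"] += t * p  # flagged AND truly positive
    return {
        g: {
            "selection_rate": c["pred_pos"] / c["n"],
            "tpr": c["true_pos"] / c["pos"] if c["pos"] else None,
        }
        for g, c in counts.items()
    }


# Toy data (assumed): prevalence is 2/4 in group A and 1/4 in group B.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_pred = list(y_true)  # a classifier that is correct on every case

rates = group_rates(y_true, y_pred, groups)

# Demographic parity compares how often each group is flagged positive.
parity_gap = abs(rates["A"]["selection_rate"] - rates["B"]["selection_rate"])
# Equal opportunity compares sensitivity (true positive rate) across groups.
opportunity_gap = abs(rates["A"]["tpr"] - rates["B"]["tpr"])

print(f"demographic parity gap: {parity_gap:.2f}")
print(f"equal opportunity gap:  {opportunity_gap:.2f}")
```

In this toy case the equal opportunity gap is zero while the demographic parity gap is not, which is one way to see why no single metric fits every clinical question.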
The authors suggest using standard, well-accepted notions of demographic bias evaluation based on clinically relevant comparisons of AI model performance between demographic groups.
Additionally, they say that it is important to be mindful of the fact that different operating points of a predictive model will result in different performance, leading to potentially different demographic biases. Documentation of these operating points and thresholds should be included in research and by vendors who provide commercial AI products.
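The dependence of measured bias on the operating point can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the scores, labels, group names and the helpers sensitivity_by_group and sensitivity_gap are hypothetical, and the gap in sensitivity is only one of several possible between-group comparisons.

```python
def sensitivity_by_group(scores, y_true, groups, threshold):
    """Sensitivity (true positive rate) per group at one decision threshold."""
    hits, positives = {}, {}
    for s, t, g in zip(scores, y_true, groups):
        if t == 1:  # sensitivity only considers truly positive cases
            positives[g] = positives.get(g, 0) + 1
            hits[g] = hits.get(g, 0) + (1 if s >= threshold else 0)
    return {g: hits[g] / positives[g] for g in positives}


def sensitivity_gap(scores, y_true, groups, threshold):
    """Largest difference in sensitivity between any two groups."""
    per_group = sensitivity_by_group(scores, y_true, groups, threshold)
    return max(per_group.values()) - min(per_group.values())


# Toy model outputs (assumed): the model is less confident on group B positives.
scores = [0.9, 0.8, 0.7, 0.2, 0.9, 0.6, 0.4, 0.3]
y_true = [1, 1, 1, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

# Sweep a few candidate operating points and record the gap at each one.
gaps = {thr: sensitivity_gap(scores, y_true, groups, thr) for thr in (0.5, 0.65, 0.85)}

for thr, gap in gaps.items():
    print(f"threshold {thr:.2f}: sensitivity gap {gap:.2f}")
```

With the same model and the same patients, the gap changes from one threshold to the next, which is why reporting the operating point alongside any bias estimate matters.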
According to Dr. Yi, this work provides a roadmap for more consistent practices in measuring and addressing bias, with the aim of ensuring that AI supports inclusive and equitable care for all people.
“AI offers an incredible opportunity to scale diagnostic capabilities in ways we’ve never seen before, potentially improving health outcomes for millions of people,” he said. “At the same time, if biases are left unchecked, AI could unintentionally worsen healthcare disparities.”
###
“Pitfalls and Best Practices in Evaluation of AI Algorithmic Biases in Radiology.” Collaborating with Dr. Yi were Preetham Bachina, B.S., Beepul Bharti, B.S., Sean P. Garin, B.S., Adway Kanhere, M.S.E., Pranav Kulkarni, B.S., David Li, M.D., Vishwa S. Parekh, Ph.D., Samantha M. Santomartino, B.A., Linda Moy, M.D., and Jeremias Sulam, Ph.D.
Radiology is edited by Linda Moy, M.D., New York University, New York, N.Y., and owned and published by the Radiological Society of North America, Inc. (https://pubs.rsna.org/journal/radiology)
RSNA is an association of radiologists, radiation oncologists, medical physicists and related scientists promoting excellence in patient care and health care delivery through education, research and technologic innovation. The Society is based in Oak Brook, Illinois. (RSNA.org)
For patient-friendly information on radiology, visit RadiologyInfo.org.