PRESS-NEWS.org - Press Release Distribution

Study assesses GPT-4’s potential to perpetuate racial, gender biases in clinical decision making

A team of Brigham researchers analyzed GPT-4’s performance in four clinical decision support scenarios: generating clinical vignettes, diagnostic reasoning, clinical plan generation and subjective patient assessments.

2023-12-18
(Press-News.org) A team of Brigham researchers analyzed GPT-4’s performance in four clinical decision support scenarios: generating clinical vignettes, diagnostic reasoning, clinical plan generation and subjective patient assessments. 

When prompted to generate clinical vignettes for medical education, GPT-4 failed to model the demographic diversity of medical conditions, exaggerating known demographic prevalence differences in 89% of diseases. 

When evaluating patient perception, GPT-4 produced significantly different responses by gender or race/ethnicity for 23% of cases. 

Large language models (LLMs) like ChatGPT and GPT-4 have the potential to assist in clinical practice to automate administrative tasks, draft clinical notes, communicate with patients, and even support clinical decision making. However, preliminary studies suggest the models can encode and perpetuate social biases that could adversely affect historically marginalized groups. A new study by investigators from Brigham and Women’s Hospital, a founding member of the Mass General Brigham healthcare system, evaluated the tendency of GPT-4 to encode and exhibit racial and gender biases in four clinical decision support roles. Their results are published in The Lancet Digital Health. 

“While most of the focus is on using LLMs for documentation or administrative tasks, there is also excitement about the potential to use LLMs to support clinical decision making,” said corresponding author Emily Alsentzer, PhD, a postdoctoral researcher in the Division of General Internal Medicine at Brigham and Women’s Hospital. “We wanted to systematically assess whether GPT-4 encodes racial and gender biases that impact its ability to support clinical decision making.”

Alsentzer and colleagues tested four applications of GPT-4 using the Azure OpenAI platform. First, they prompted GPT-4 to generate patient vignettes that can be used in medical education. Next, they tested GPT-4’s ability to correctly develop a differential diagnosis and a treatment plan for 19 different patient cases from NEJM Healer, a medical education tool that presents challenging clinical cases to medical trainees. Finally, they assessed how GPT-4 makes inferences about a patient’s clinical presentation using eight case vignettes that were originally generated to measure implicit bias. For each application, the authors assessed whether GPT-4’s outputs were biased by race or gender.

For the medical education task, the researchers constructed ten prompts that required GPT-4 to generate a patient presentation for a supplied diagnosis. They ran each prompt 100 times and found that GPT-4 exaggerated known differences in disease prevalence by demographic group.  

"One striking example is when GPT-4 is prompted to generate a vignette for a patient with sarcoidosis: GPT-4 describes a Black woman 81% of the time," Alsentzer explains. "While sarcoidosis is more prevalent in Black patients and in women, it’s not 81% of all patients."  
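As a rough illustration of the comparison described above, the following Python sketch tallies how often each demographic group appears across repeated runs of one prompt and compares one group's share with a reference share. It is not the study's code: the function names, the "other" group, and the reference value of 0.30 are assumptions made only for this example, and only the 81-in-100 figure comes from the quote above.

```python
from collections import Counter


def demographic_shares(vignettes):
    """Return the share of vignettes for each (race, gender) pair."""
    counts = Counter((v["race"], v["gender"]) for v in vignettes)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def exaggeration(observed_share, reference_share):
    """Ratio of the model's share to a reference share (>1 means over-representation)."""
    return observed_share / reference_share


# Placeholder data: 100 runs of one prompt, 81 of which describe a Black woman,
# mirroring the sarcoidosis example. The remaining 19 runs are lumped into an
# "other" group purely for illustration.
runs = (
    [{"race": "Black", "gender": "female"}] * 81
    + [{"race": "other", "gender": "other"}] * 19
)
shares = demographic_shares(runs)

# HYPOTHETICAL reference share, not taken from the study or any dataset.
assumed_reference_share = 0.30
ratio = exaggeration(shares[("Black", "female")], assumed_reference_share)

print(f"model share: {shares[('Black', 'female')]:.2f}")
print(f"ratio vs. assumed reference: {ratio:.2f}")
```

A ratio well above 1 would indicate that the model over-represents a group relative to the chosen reference. A real analysis would need sourced prevalence estimates for each condition.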

Next, when GPT-4 was prompted to develop a list of 10 possible diagnoses for the NEJM Healer cases, changing the gender or race/ethnicity of the patient significantly affected its ability to prioritize the correct top diagnosis in 37% of cases.  

"In some cases, GPT-4’s decision making reflects known gender and racial biases in the literature," Alsentzer said. "In the case of pulmonary embolism, the model ranked panic attack/anxiety as a more likely diagnosis for women than men. It also ranked sexually transmitted diseases, such as acute HIV and syphilis, as more likely for patients from racial minority backgrounds compared to white patients." 
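One simple way to picture this kind of check is to present the same case with different patient demographics and record where the correct diagnosis lands in each ranked list. The Python sketch below does only that. The diagnosis lists are invented placeholders rather than model outputs from the study, the helper name `rank_of` is hypothetical, and the statistical testing the authors used is not reproduced here.

```python
def rank_of(correct, differential):
    """Return the 1-based position of the correct diagnosis, or None if it is absent."""
    for position, diagnosis in enumerate(differential, start=1):
        if diagnosis.lower() == correct.lower():
            return position
    return None


# HYPOTHETICAL ranked lists for one case presented with two patient genders.
# These are made-up placeholders, not outputs reported in the study.
differentials = {
    "female": ["panic attack", "pulmonary embolism", "pneumonia"],
    "male": ["pulmonary embolism", "panic attack", "pneumonia"],
}

ranks = {
    variant: rank_of("pulmonary embolism", dx)
    for variant, dx in differentials.items()
}
top_ranked = {variant: rank == 1 for variant, rank in ranks.items()}

print(ranks)
print(top_ranked)
```

Repeating this over many runs per demographic variant would give a distribution of ranks for each group, which could then be compared across groups.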

 
When asked to evaluate subjective patient traits such as honesty, understanding, and pain tolerance, GPT-4 produced significantly different responses by race, ethnicity, and gender for 23% of the questions. For example, GPT-4 was significantly more likely to rate Black male patients as abusing the opioid Percocet than Asian, Black, Hispanic, and white female patients when the answers should have been identical for all the simulated patient cases. 

Limitations of the current study include testing GPT-4's responses using a limited number of simulated prompts and analyzing model performance using only a few traditional categories of demographic identities. Future work should investigate biases using clinical notes from the electronic health record. 

“While LLM-based tools are currently being deployed with a clinician in the loop to verify the model’s outputs, it is very challenging for clinicians to detect systemic biases when viewing individual patient cases,” Alsentzer said. “It is critical that we perform bias evaluations for each intended use of LLMs, just as we do for other machine learning models in the medical domain. Our work can help start a conversation about GPT-4’s potential to propagate bias in clinical decision support applications.”

Authorship: Additional BWH authors include Jorge A Rodriguez, David W Bates, and Raja-Elie E Abdulnour. Additional authors include Travis Zack, Eric Lehman, Mirac Suzgun, Leo Anthony Celi, Judy Gichoya, Dan Jurafsky, Peter Szolovits, and Atul J Butte. 

Disclosures: Alsentzer reports personal fees from Canopy Innovations, Fourier Health, and Xyla; and grants from Microsoft Research. Abdulnour is an employee of Massachusetts Medical Society, which owns NEJM Healer (NEJM Healer cases were used in the study). Additional author disclosures can be found in the paper. 

Funding: T32 NCI Hematology/Oncology Training Fellowship; Open Philanthropy and the National Science Foundation (IIS-2128145); and a philanthropic gift from Priscilla Chan and Mark Zuckerberg. 

Paper cited: Zack, T; Lehman, E et al. “Assessing the potential of GPT-4 to perpetuate racial and gender biases in healthcare: a model evaluation study” The Lancet Digital Health DOI: 10.1016/S2589-7500(23)00225-X 

 

END

