Study assesses GPT-4’s potential to perpetuate racial, gender biases in clinical decision making
A team of Brigham researchers analyzed GPT-4’s performance in four clinical decision support scenarios: generating clinical vignettes, diagnostic reasoning, clinical plan generation and subjective patient assessments.
When prompted to generate clinical vignettes for medical education, GPT-4 failed to model the demographic diversity of medical conditions, exaggerating known demographic prevalence differences in 89% of diseases.
When evaluating patient perception, GPT-4 produced significantly different responses by gender or race/ethnicity for 23% of cases.
Large language models (LLMs) like ChatGPT and GPT-4 have the potential to assist in ...














