(Press-News.org) Study found GPT-4-generated messages to patients were acceptable without any additional physician editing 58% of the time and provided more detailed educational information than those written by physicians
AI-generated messages had shortcomings, including 7% of responses being deemed unsafe if left unedited
Generative AI may promote efficiency and patient education, but requires a “doctor in the loop” and a cautious approach as hospitals integrate algorithms into electronic health records
A new study by investigators from Mass General Brigham demonstrates that large language models (LLMs), a type of generative AI, may help reduce physician workload and improve patient education when used to draft replies to patient messages. The study also found limitations to LLMs that may affect patient safety, suggesting that vigilant oversight of LLM-generated communications is essential for safe usage. Findings, published in Lancet Digital Health, emphasize the need for a measured approach to LLM implementation.
Rising administrative and documentation responsibilities have contributed to increases in physician burnout. To help streamline and automate physician workflows, electronic health record (EHR) vendors have adopted generative AI algorithms to aid clinicians in drafting messages to patients; however, the efficiency, safety and clinical impact of their use had been unknown.
“Generative AI has the potential to provide a ‘best of both worlds’ scenario of reducing burden on the clinician and better educating the patient in the process,” said corresponding author Danielle Bitterman, MD, a faculty member in the Artificial Intelligence in Medicine (AIM) Program at Mass General Brigham and a physician in the Department of Radiation Oncology at Brigham and Women’s Hospital. “However, based on our team’s experience working with LLMs, we have concerns about the potential risks associated with integrating LLMs into messaging systems. With LLM-integration into EHRs becoming increasingly common, our goal in this study was to identify relevant benefits and shortcomings.”
For the study, the researchers used OpenAI’s GPT-4, a foundational LLM, to generate 100 scenarios about patients with cancer and an accompanying patient question. No questions from actual patients were used for the study. Six radiation oncologists manually responded to the queries; then, GPT-4 generated responses to the questions. Finally, the same radiation oncologists were provided with the LLM-generated responses for review and editing. The radiation oncologists did not know whether GPT-4 or a human had written the responses, and in 31% of cases, believed that an LLM-generated response had been written by a human.
On average, physician-drafted responses were shorter than the LLM-generated responses. GPT-4 tended to include more educational background for patients but was less directive in its instructions. The physicians reported that LLM-assistance improved their perceived efficiency and deemed the LLM-generated responses to be safe in 82.1 percent of cases and acceptable to send to a patient without any further editing in 58.3 percent of cases. The researchers also identified some shortcomings: If left unedited, 7.1 percent of LLM-generated responses could pose a risk to the patient and 0.6 percent of responses could pose a risk of death, most often because GPT-4’s response failed to urgently instruct the patient to seek immediate medical care.
Notably, LLM-generated/physician-edited responses were more similar in length and content to the LLM-generated responses than to the manual responses. In many cases, physicians retained LLM-generated educational content, suggesting that they perceived it to be valuable. While this may promote patient education, the researchers emphasize that overreliance on LLMs may also pose risks, given their demonstrated shortcomings.
The emergence of AI tools in health has the potential to positively reshape the continuum of care, and it is imperative to balance their innovative potential with a commitment to safety and quality. Mass General Brigham is leading the way in responsible use of AI, conducting rigorous research on new and emerging technologies to inform the incorporation of AI into care delivery, workforce support and administrative processes. Mass General Brigham is currently leading a pilot integrating generative AI into the electronic health record to draft replies to patient portal messages, testing the technology in a set of ambulatory practices across the health system.
Going forward, the study’s authors are investigating how patients perceive LLM-based communications and how patients’ racial and demographic characteristics influence LLM-generated responses, based on known algorithmic biases in LLMs.
“Keeping a human in the loop is an essential safety step when it comes to using AI in medicine, but it isn’t a single solution,” Bitterman said. “As providers rely more on LLMs, we could miss errors that could lead to patient harm. This study demonstrates the need for systems to monitor the quality of LLMs, training for clinicians to appropriately supervise LLM output, more AI literacy for both patients and clinicians, and on a fundamental level, a better understanding of how to address the errors that LLMs make.”
Authorship: Mass General Brigham co-authors include first author Shan Chen, MS, and Marco Guevara, Frank Hoebers, Benjamin Kann, Hugo Aerts and Raymond Mak of the AIM Program at Mass General Brigham and the Department of Radiation Oncology at Brigham and Women’s Hospital/Dana-Farber Cancer Institute, and Shalini Moningi, Hesham Elhalawani, Fallon Chipidza, and Jonathan Leeman (Brigham and Women’s Hospital). Additional co-authors include Timothy Miller, Guergana Savova, Jack Gallifant, Leo Celi, Maryam Lustberg, and Majid Afshar.
Disclosures: Bitterman is an Associate Editor of Radiation Oncology, HemOnc.org and receives funding from the American Association for Cancer Research. A complete list of disclosures is included in the paper.
Funding: Bitterman received financial support for this work from the National Institutes of Health (U54CA274516-01A1). Bitterman also received financial support from the Woods Foundation. A complete list of funding sources is included in the paper.
Paper cited: Chen, S et al. “The impact of using a large language model to respond to patient messages” Lancet Digital Health DOI: 10.1016/S2589-7500(24)00060-8
###
About Mass General Brigham
Mass General Brigham is an integrated academic health care system, uniting great minds to solve the hardest problems in medicine for our communities and the world. Mass General Brigham connects a full continuum of care across a system of academic medical centers, community and specialty hospitals, a health insurance plan, physician networks, community health centers, home care, and long-term care services. Mass General Brigham is a nonprofit organization committed to patient care, research, teaching, and service to the community. In addition, Mass General Brigham is one of the nation’s leading biomedical research organizations with several Harvard Medical School teaching hospitals. For more information, please visit massgeneralbrigham.org.