(Press-News.org) OAK BROOK, Ill. – Use of publicly available large language models (LLMs) resulted in changes in the classification of breast imaging reports that could have a negative effect on patient management, according to a new international study published today in Radiology, a journal of the Radiological Society of North America (RSNA). The study findings underscore the need to regulate these LLMs in scenarios that require high-level medical reasoning, researchers said.
LLMs are a type of artificial intelligence (AI) widely used today for a variety of purposes. In radiology, LLMs have already been tested in a wide variety of clinical tasks, from processing radiology request forms to providing imaging recommendations and diagnosis support.
Publicly available generic LLMs like ChatGPT (GPT-3.5 and GPT-4) and Google Gemini (formerly Bard) have shown promising results in some tasks. Importantly, however, they are less successful at more complex tasks requiring a higher level of reasoning and deeper clinical knowledge, such as providing imaging recommendations. Users seeking medical advice may not always understand the limitations of these untrained programs.
“Evaluating the abilities of generic LLMs remains important as these tools are the most readily available and may unjustifiably be used by both patients and non-radiologist physicians seeking a second opinion,” said study co-lead author Andrea Cozzi, M.D., Ph.D., radiology resident and post-doctoral research fellow at the Imaging Institute of Southern Switzerland, Ente Ospedaliero Cantonale, in Lugano, Switzerland.
Dr. Cozzi and colleagues set out to test the generic LLMs on a task that pertains to daily clinical routine but where the depth of medical reasoning is high and where the use of languages other than English would further stress the LLMs' capabilities. They focused on the agreement between human readers and LLMs for the assignment of Breast Imaging Reporting and Data System (BI-RADS) categories, a widely used system to describe and classify breast lesions.
The Swiss researchers partnered with an American team from Memorial Sloan Kettering Cancer Center in New York City and a Dutch team at the Netherlands Cancer Institute in Amsterdam.
The study included BI-RADS classifications of 2,400 breast imaging reports written in English, Italian and Dutch. Three LLMs—GPT-3.5, GPT-4 and Google Bard (now renamed Google Gemini)—assigned BI-RADS categories using only the findings described by the original radiologists. The researchers then compared the performance of the LLMs with that of board-certified breast radiologists.
The agreement for BI-RADS category assignments between human readers was almost perfect. However, the agreement between humans and the LLMs was only moderate. Most importantly, the researchers also observed a high percentage of discordant category assignments that would result in negative changes in patient management. This raises several concerns about the potential consequences of placing too much reliance on these widely available LLMs.
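For readers unfamiliar with agreement statistics, the labels "almost perfect" and "moderate" match the bands of the commonly used Landis and Koch scale for chance-corrected agreement. This release does not say which coefficient the study used, so the Python sketch below assumes Cohen's kappa purely for illustration. The function names and the toy category lists are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: kappa is used here only as an illustration; the release
    does not state which agreement coefficient the study reported.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Fraction of items on which both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Map a coefficient to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical category assignments for four reports (not study data).
radiologist = [1, 1, 2, 2]
language_model = [1, 1, 2, 1]
kappa = cohen_kappa(radiologist, language_model)
print(round(kappa, 2), landis_koch_label(kappa))
```

In this toy case the two raters agree on three of four reports, yet half of that agreement is expected by chance, which is why the chance-corrected value is lower than the raw 75% match rate.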
According to Dr. Cozzi, the results highlight the need for regulation of LLMs when there is a high likelihood that users will ask them health-care-related questions of varying depth and complexity.
“The results of this study add to the growing body of evidence that reminds us of the need to carefully understand and highlight the pros and cons of LLM use in health care,” he said. “These programs can be a wonderful tool for many tasks but should be used wisely. Patients need to be aware of the intrinsic shortcomings of these tools, and that they may receive incomplete or even utterly wrong replies to complex questions.”
The Swiss researchers were supervised by the co-senior author Simone Schiaffino, M.D. The American team was led by the co-first author Katja Pinker, M.D., Ph.D., and the Dutch team was led by the co-senior author Ritse M. Mann, M.D., Ph.D.
###
“BI-RADS Category Assignments by GPT-3.5, GPT-4, and Google Bard: A Multilanguage Study.” Collaborating with Drs. Cozzi, Pinker, Schiaffino and Mann were Andri Hidber, B.Med., Tianyu Zhang, Ph.D., Luca Bonomo, M.D., Roberto Lo Gullo, M.D., Blake Christianson, M.D., Marco Curti, M.D., Stefania Rizzo, M.D., Ph.D., and Filippo Del Grande, M.D., M.B.A., M.H.E.M.
Radiology is edited by Linda Moy, M.D., New York University, New York, N.Y., and owned and published by the Radiological Society of North America, Inc. (https://pubs.rsna.org/journal/radiology)
RSNA is an association of radiologists, radiation oncologists, medical physicists and related scientists promoting excellence in patient care and health care delivery through education, research and technologic innovation. The Society is based in Oak Brook, Illinois. (RSNA.org)
For patient-friendly information on breast imaging, visit RadiologyInfo.org.
END