(Press-News.org) (Toronto, November 17, 2025) A new study published in the peer-reviewed journal JMIR Mental Health by JMIR Publications highlights a critical risk in the growing use of Large Language Models (LLMs) like GPT-4o by researchers: the frequent fabrication and inaccuracy of bibliographic citations. The findings underscore an urgent need for rigorous human verification and institutional safeguards to protect research integrity, particularly in specialized and less publicly known fields within mental health.
Nearly 1 in 5 Citations Fabricated by GPT-4o in Literature Reviews
The article, titled "Influence of Topic Familiarity and Prompt Specificity on Citation Fabrication in Mental Health Research Using Large Language Models: Experimental Study," found that 19.9% of all citations generated by GPT-4o across six simulated literature reviews were entirely fabricated, meaning they could not be traced to any real publication. Furthermore, among the citations that did correspond to real publications, 45.4% contained bibliographic errors, most commonly incorrect or invalid Digital Object Identifiers (DOIs).
The research is timely: academic journals have encountered apparently AI-hallucinated references in recent submissions. Fabricated and erroneous references are not just formatting issues; they break the chain of verifiability, mislead readers, and compromise the integrity and trustworthiness of scientific results and the cumulative knowledge base. Careful scrutiny and verification are therefore essential to safeguard academic rigor.
Reliability Varies by Topic Familiarity and Specificity
The research, conducted by Jake Linardon, PhD, of Deakin University, and colleagues, systematically tested the reliability of GPT-4o's output across mental health topics with varying levels of public awareness and scientific maturity: major depressive disorder (high familiarity), binge eating disorder (moderate), and body dysmorphic disorder (low). The team also compared general review prompts with specialized ones (e.g., focusing on digital interventions).
Fabrication Risk is Highest for Less Familiar Topics: Fabrication rates were significantly higher for topics with lower public familiarity and research coverage, such as binge eating disorder (28%) and body dysmorphic disorder (29%), than for major depressive disorder (6%).
Specialized Topics Pose a Higher Risk: The effect was not universal, but stratified analysis showed significantly higher fabrication rates for specialized reviews (e.g., evidence for digital interventions) than for general overviews for certain disorders, such as binge eating disorder.
Overall Inaccuracy is Pervasive: In total, nearly two-thirds of all citations generated by GPT-4o were either fabricated or contained errors, indicating a major reliability issue.
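The two headline percentages can be combined with simple arithmetic. The sketch below assumes that the 45.4% error rate applies only to the citations that were not fabricated, which is how the release describes it; the function and variable names are illustrative and do not come from the study. On this reading the combined share is a little over 56%, so "nearly two-thirds" is probably best read as a rounded description rather than an exact figure.

```python
# Percentages quoted in the release. How they combine is an assumption of
# this sketch: the error rate is taken to apply to non-fabricated citations only.
FABRICATED_SHARE = 0.199        # share of all citations that were fabricated
ERROR_SHARE_AMONG_REAL = 0.454  # share of real citations containing errors


def combined_problem_share(fabricated: float, error_among_real: float) -> float:
    """Return the share of all citations that are fabricated or erroneous."""
    real = 1.0 - fabricated  # citations traceable to a real publication
    return fabricated + real * error_among_real


share = combined_problem_share(FABRICATED_SHARE, ERROR_SHARE_AMONG_REAL)
print(f"Fabricated or erroneous: {share:.1%}")  # prints "Fabricated or erroneous: 56.3%"
```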
Urgent Call for Human Oversight and New Safeguards
The study’s conclusions issue a strong warning to the academic community: Citation fabrication and errors remain common in GPT-4o outputs. The authors stress that the reliability of LLM-generated citations is not fixed but is contingent on the topic and the way the prompt is designed.
Key Implications Highlighted in the Study:
Rigorous Verification is Mandatory: Researchers and students must subject all LLM-generated references to careful human verification to validate their accuracy and authenticity.
Journal and Institutional Role: Journal editors and publishers must implement stronger safeguards, potentially using detection software that flags citations that do not match existing sources, signaling a potential hallucination.
Policy and Training: Academic institutions must develop clear policies and training to equip users with the skills to critically assess LLM outputs and to design strategic prompts, especially when exploring less visible or highly specialized research topics.
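As a rough illustration of the kind of automated check mentioned above, the following sketch flags citations whose DOI is missing, malformed, or absent from a registry. It is a minimal example under stated assumptions, not the tooling the study authors or any publisher uses. The regular expression is a simplified approximation of common DOI syntax, `flag_citations` and `KNOWN_DOIS` are hypothetical names, and the in-memory set stands in for a real registry lookup. Apart from the DOI of the study itself, the example citations are invented for demonstration.

```python
import re
from typing import Callable, Dict, List, Tuple

# Simplified DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an approximation for illustration, not a full DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def has_valid_doi_syntax(doi: str) -> bool:
    """Return True if the string looks like a DOI under the simplified pattern."""
    return DOI_PATTERN.match(doi.strip()) is not None


def flag_citations(
    citations: List[Dict[str, str]],
    doi_exists: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Return (title, reason) pairs for citations that need human review."""
    flagged = []
    for citation in citations:
        title = citation.get("title", "<untitled>")
        doi = citation.get("doi", "").strip()
        if not doi:
            flagged.append((title, "missing DOI"))
        elif not has_valid_doi_syntax(doi):
            flagged.append((title, "malformed DOI"))
        elif not doi_exists(doi):
            flagged.append((title, "DOI not found in registry"))
    return flagged


# Hypothetical stand-in for a registry lookup; a real tool would query a DOI service.
KNOWN_DOIS = {"10.2196/80371"}

example_citations = [
    {
        "title": "Influence of Topic Familiarity and Prompt Specificity on "
                 "Citation Fabrication in Mental Health Research Using Large "
                 "Language Models: Experimental Study",
        "doi": "10.2196/80371",
    },
    {"title": "Invented example A", "doi": "10.9999/not-a-real-record"},
    {"title": "Invented example B", "doi": "doi-80371"},
    {"title": "Invented example C"},
]

flagged = flag_citations(example_citations, lambda doi: doi in KNOWN_DOIS)
for title, reason in flagged:
    print(f"{title}: {reason}")
```

Passing the lookup as a function keeps the check independent of any particular registry. A flagged citation is only a prompt for the human verification the study calls for, since a resolvable DOI can still be attached to the wrong title or authors.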
Original article:
Linardon J, Jarman H, McClure Z, Anderson C, Liu C, Messer M. Influence of Topic Familiarity and Prompt Specificity on Citation Fabrication in Mental Health Research Using Large Language Models: Experimental Study. JMIR Ment Health 2025;12:e80371
URL: https://mental.jmir.org/2025/1/e80371
DOI: 10.2196/80371
About JMIR Publications
JMIR Publications is a leading open access publisher of digital health research and a champion of open science. With a focus on author advocacy and research amplification, JMIR Publications partners with researchers to advance their careers and maximize the impact of their work. As a technology organization with publishing at its core, we provide innovative tools and resources that go beyond traditional publishing, supporting researchers at every step of the dissemination process. Our portfolio features a range of peer-reviewed journals, including the renowned Journal of Medical Internet Research.
To learn more about JMIR Publications, please visit jmirpublications.com or connect with us via X, LinkedIn, YouTube, Facebook, and Instagram.
Head office: 130 Queens Quay East, Unit 1100, Toronto, ON, M5A 0P6 Canada
Media Contact:
Dennis O’Brien, Vice President, Communications & Partnerships
JMIR Publications
communications@jmir.org
The content of this communication is licensed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, published by JMIR Publications, is properly cited.
END
New study reveals high rates of fabricated and inaccurate citations in LLM-generated mental health research
2025-11-17