A team of Mass General Brigham researchers has developed one of the first fully autonomous artificial intelligence (AI) systems capable of screening for cognitive impairment using routine clinical documentation. The system, which requires no human intervention or prompting after deployment, achieved 98% specificity in real-world validation testing. Results are published in npj Digital Medicine.
Alongside the publication, the team is releasing Pythia, an open-source tool that enables any healthcare system or research institution to deploy autonomous prompt optimization for their own AI screening applications.
"We didn't build a single AI model — we built a digital clinical team," said corresponding author Hossein Estiri, PhD, director of the Clinical Augmented Intelligence (CLAI) research group and associate professor of medicine at Massachusetts General Hospital, a founding member of the Mass General Brigham healthcare system. "This AI system includes five specialized agents that critique each other and refine their reasoning, just like clinicians would in a case conference.”
Cognitive impairment remains significantly underdiagnosed in routine clinical care, and traditional screening tools and cognitive tests are highly resource-intensive to administer and difficult for patients to access. Yet early detection has become increasingly critical, especially with the recent approval of Alzheimer’s disease therapies that are most effective when administered early in the disease.
“By the time many patients receive a formal diagnosis, the optimal treatment window may have closed,” said co-lead study author Lidia Moura, MD, PhD, MPH, director of Population Health and the Center for Healthcare Intelligence in the Department of Neurology at Mass General Brigham.
To better capture at-risk patients, the Mass General Brigham team developed an AI system that runs on an open-weight large language model that can be deployed locally within hospital information technology infrastructure. It employs five agents that each serve different functions and work collaboratively to make clinical determinations and refine them to address errors and improve sensitivity and specificity.
These agents operate autonomously in an iterative loop, refining their detection capabilities through structured collaboration until performance targets are met or the system determines it has converged. No patient data are transmitted to external servers or cloud-based AI services.
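The stopping behavior described above (refine until a performance target is met or improvement stalls) can be illustrated with a minimal sketch. This is not the team's implementation or the Pythia API: the function name, the `evaluate` and `refine` callables, and the `target`, `min_gain`, and `max_rounds` parameters are all hypothetical placeholders chosen for illustration.

```python
from typing import Callable, Tuple


def refine_until_done(
    prompt: str,
    evaluate: Callable[[str], float],     # hypothetical: scores a prompt on labeled notes
    refine: Callable[[str, float], str],  # hypothetical: proposes a revised prompt
    target: float = 0.9,                  # assumed performance target
    min_gain: float = 0.01,               # assumed convergence threshold
    max_rounds: int = 20,                 # assumed safety cap on iterations
) -> Tuple[str, float, str]:
    """Iterate until the target is met, gains stall, or rounds run out."""
    best_prompt, best_score = prompt, evaluate(prompt)
    for _ in range(max_rounds):
        if best_score >= target:
            return best_prompt, best_score, "target_met"
        candidate = refine(best_prompt, best_score)
        score = evaluate(candidate)
        gain = score - best_score
        if score > best_score:
            # Keep the candidate only when it actually scores higher.
            best_prompt, best_score = candidate, score
        if gain < min_gain:
            # Improvement has stalled, so treat the loop as converged.
            return best_prompt, best_score, "converged"
    return best_prompt, best_score, "max_rounds"
```

In a real deployment, `evaluate` and `refine` would wrap calls to a locally hosted language model; here they are left as plain callables so the control flow can be read on its own.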
The study analyzed more than 3,300 clinical notes from 200 anonymized patients at Mass General Brigham. By analyzing clinical notes produced during regular healthcare visits, this innovative system can turn everyday documentation into a chance to screen for cognitive issues, helping identify patients who might need a formal assessment.
“Clinical notes contain whispers of cognitive decline that busy clinicians can’t systematically surface,” said Moura. “This system listens at scale.”
When the AI system and human reviewers disagreed, an independent expert re-evaluated each case. Among the disagreement cases, the expert validated the AI's reasoning 58% of the time, meaning the system was often making sound clinical judgments that initial human review had missed.
"We expected to find AI errors. Instead, we often found the AI was making defensible judgments based on the evidence in the notes," said Estiri.
Analysis of cases in which the AI was incorrect revealed systematic patterns: documentation limitations where cognitive concerns appeared only in problem lists without supporting narrative, and domain knowledge gaps where the system failed to recognize certain clinical indicators. The system excelled with comprehensive clinical narratives but struggled with isolated data lacking context.
Although the system achieved 91% sensitivity under balanced testing, its sensitivity decreased to 62% under real-world conditions (with a prevalence of 33% positive cases), while specificity remained high at 98%. The researchers reported these calibration challenges to provide transparency and guide future efforts to improve clinical reliability.
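To see how sensitivity, specificity, and prevalence interact, the standard definitions of positive and negative predictive value can be applied to the rounded figures quoted above. The sketch below is illustrative arithmetic only: the function name is hypothetical, and the resulting predictive values are derived from the release's rounded numbers, not results reported by the study.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # share of flagged patients who are true cases
    npv = true_neg / (true_neg + false_neg)  # share of unflagged patients who are true non-cases
    return ppv, npv


# Rounded figures from the release: 62% sensitivity, 98% specificity, 33% prevalence.
ppv, npv = predictive_values(0.62, 0.98, 0.33)
print(f"PPV: {ppv:.2f}, NPV: {npv:.2f}")
```

High specificity keeps false alarms rare, while the lower sensitivity shows up as missed cases among patients the system does not flag.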
"We're publishing exactly the areas in which AI struggles," said Estiri. "The field needs to stop hiding these calibration challenges if we want clinical AI to be trusted."
Authorship: In addition to Estiri and Moura, Mass General Brigham and Harvard Medical School co-authors include Jiazi Tian, Pedram Fard, Cameron Cagan, Neguine Rezaii, Rebeka Bustamante Rocha, Liqin Wang, Valdery Moura Junior, Deborah Blacker, Jennifer S. Haas, Chirag Patel, and Shawn N. Murphy.
Disclosures: The authors declare no competing interests.
Funding: This research was funded by the National Institutes of Health (NIH): the National Institute on Aging (grants RF1AG074372, R01AG074372, R01AG082693), and the National Institute of Allergy and Infectious Diseases (R01AI165535).
Paper cited: Tian, et al. “An autonomous agentic workflow for clinical detection of cognitive concerns using large language models.” npj Digital Medicine, DOI: 10.1038/s41746-025-02324-4.
###
About Mass General Brigham
Mass General Brigham is an integrated academic health care system, uniting great minds to solve the hardest problems in medicine for our communities and the world. Mass General Brigham connects a full continuum of care across a system of academic medical centers, community and specialty hospitals, a health insurance plan, physician networks, community health centers, home care, and long-term care services. Mass General Brigham is a nonprofit organization committed to patient care, research, teaching, and service to the community. In addition, Mass General Brigham is one of the nation’s leading biomedical research organizations with several Harvard Medical School teaching hospitals. For more information, please visit massgeneralbrigham.org.
END