Medicine 2026-02-17

AI Reads Clinical Notes to Predict Colorectal Cancer Risk in Colitis Patients

Using large language models on 55,000 VA patient records, UC San Diego researchers built an automated risk-scoring pipeline that placed nearly half of ulcerative colitis patients in its lowest-risk category. Almost 99% of that group remained cancer-free two years after diagnosis.

Among people with ulcerative colitis (UC), the risk of colorectal cancer runs two to four times higher than in the general population. The elevated risk comes partly from chronic intestinal inflammation and partly from a tendency to develop low-grade dysplasia - patches of abnormal cells in the colon lining that can, in some cases, progress to invasive cancer. The clinical problem is figuring out which patients are actually in danger. Most people with low-grade dysplasia never develop cancer. A small fraction do. Distinguishing between the two groups with enough confidence to guide decisions about surgery or surveillance intervals has not been easy.

A study published February 17 in Clinical Gastroenterology and Hepatology tested whether artificial intelligence could extract and synthesize the relevant risk information from existing electronic health records - without any manual chart review. The answer, across a dataset of 55,000 patients in the US Department of Veterans Affairs healthcare system, was yes.

What the AI pipeline actually does

The research team at UC San Diego built a fully automated workflow that combines large language models with a biostatistical risk model. The language models read through free-text clinical notes - colonoscopy reports, pathology findings, physician assessments - and extract four specific risk factors that prior research has identified as predictive: the size of the dysplastic lesion, whether the visible lesion could be completely resected, the number of dysplastic sites, and the severity of colon inflammation.

Those extracted variables feed into a statistical model that generates a numerical risk score and assigns each patient to one of five risk categories. The model's predictions were then compared against actual patient outcomes over more than a decade of follow-up in the VA records.
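
The two-stage design can be sketched in a few lines of Python. This is a minimal illustration, not the study's model: the field names, the 10 mm size cutoff, the inflammation scale, the point weights, and the tier cutoffs are all placeholder assumptions chosen for readability, and the language-model extraction step is represented only by its output.

```python
from dataclasses import dataclass


@dataclass
class RiskFactors:
    """The four variables the article says are extracted from clinical notes.

    Field names, units, and scales are illustrative assumptions.
    """
    lesion_size_mm: float       # size of the dysplastic lesion
    completely_resected: bool   # was the visible lesion fully removed?
    dysplastic_sites: int       # number of dysplastic sites
    inflammation_grade: int     # 0 (none) to 3 (severe), a hypothetical scale


def risk_score(f: RiskFactors) -> int:
    """Toy additive score. Weights and cutoffs are placeholders, NOT study coefficients."""
    score = 0
    if f.lesion_size_mm >= 10:      # hypothetical size cutoff
        score += 1
    if not f.completely_resected:
        score += 2
    if f.dysplastic_sites > 1:
        score += 1
    if f.inflammation_grade >= 2:
        score += 1
    return score                    # ranges from 0 to 5


def risk_tier(score: int) -> int:
    """Map a score to one of five tiers (1 = lowest risk, 5 = highest).

    The mapping is hypothetical; the article only states that five categories exist.
    """
    return min(score, 4) + 1


low = RiskFactors(lesion_size_mm=5.0, completely_resected=True,
                  dysplastic_sites=1, inflammation_grade=0)
high = RiskFactors(lesion_size_mm=25.0, completely_resected=False,
                   dysplastic_sites=3, inflammation_grade=3)
print(risk_tier(risk_score(low)), risk_tier(risk_score(high)))  # prints "1 5"
```

The published model is a fitted biostatistical model, so its score would come from estimated coefficients rather than hand-picked points. The sketch only shows how structured variables turn into a score and a tier.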

"Large language models accurately derived colitis-associated colorectal cancer risk factors - such as how big the low-grade dysplasia lesion is, whether there are multiple lesions and if the colon is extremely inflamed - from the narrative clinical notes themselves," said Kit Curtius, PhD, assistant professor of medicine in the Division of Biomedical Informatics at UC San Diego School of Medicine and a member of Moores Cancer Center.

Key performance figures

The model sorted patients into five risk tiers, and its predictions matched observed outcomes over more than ten years of follow-up. Nearly half of the 55,000 patients fell into the lowest-risk category. Of those classified as lowest risk, almost 99% remained cancer-free for two years after diagnosis - evidence that the model can identify patients who do not need intensive surveillance.
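
A check of this kind amounts to computing, for each tier, the share of patients with no cancer diagnosis inside the follow-up window. Below is a minimal sketch, assuming each record is a hypothetical `(tier, years_to_cancer)` pair. The record layout and the sample values are invented for illustration and are not study data.

```python
from collections import defaultdict
from typing import Optional


def cancer_free_rate(records: list[tuple[int, Optional[float]]],
                     window_years: float = 2.0) -> dict[int, float]:
    """Fraction of patients per tier with no cancer within window_years.

    Each record is (tier, years_to_cancer); years_to_cancer is None when no
    cancer was observed. Censoring (patients followed for less than the
    window) is ignored here for simplicity.
    """
    totals: dict[int, int] = defaultdict(int)
    free: dict[int, int] = defaultdict(int)
    for tier, years_to_cancer in records:
        totals[tier] += 1
        if years_to_cancer is None or years_to_cancer > window_years:
            free[tier] += 1
    return {tier: free[tier] / totals[tier] for tier in totals}


# Made-up records: four patients in tier 1, two in tier 5.
sample = [(1, None), (1, None), (1, 5.0), (1, 1.0), (5, 0.5), (5, None)]
print(cancer_free_rate(sample))  # prints "{1: 0.75, 5: 0.5}"
```

A real analysis of long-term follow-up data would normally use a survival method that accounts for censoring, such as a Kaplan-Meier estimate, rather than a raw proportion.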

At the other end of the risk spectrum, the model identified a group with unresectable visible lesions - lesions that could not be safely or completely removed during colonoscopy because of their size, location, or extent of spread. Those patients carried substantially higher cancer risk than clinicians had typically estimated for them.

The VA dataset, at 55,000 patients, is the largest dataset of ulcerative colitis patients with low-grade dysplasia (UC-LGD) in the United States, which gives the findings more statistical weight than smaller institutional series. The study was funded in part by the VA Biomedical Laboratory Research and Development Service (Merit Review Award I01 BX005958) and the NIH (R01 CA270235, P30 CA023100, T15LM011271, P30 DK120515).

The surveillance interval question

Current clinical guidelines typically recommend that UC patients with small dysplastic lesions return for surveillance colonoscopy within two years. For patients who spend years cycling through surveillance colonoscopies they don't strictly need, that schedule is burdensome and adds system cost. If an AI tool can identify with high confidence which patients are genuinely low risk, extending the surveillance interval for that group becomes defensible.

"A lot of people are low risk - they have small dysplastic lesions - and it's been hard to know what to confidently tell these people until now," Curtius said. "With this tool, there may be a potential to increase the surveillance interval so patients who are at this low risk don't have to come back so often."

The flip side - catching high-risk patients who are falling through gaps in follow-up - is also part of the value proposition. Delayed colonoscopies are a known contributor to preventable colorectal cancer deaths. An automated system that flags overdue high-risk patients could reduce those gaps.
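
In its simplest form, such a flag compares the date of each patient's last colonoscopy with an interval allowed for that patient's risk tier. The sketch below is hypothetical throughout: the function name, the data layout, and the example intervals are assumptions, and the intervals are supplied by the caller because the article does not specify any.

```python
from datetime import date, timedelta


def overdue_patients(last_colonoscopy: dict[str, date],
                     tier: dict[str, int],
                     max_interval_days: dict[int, int],
                     today: date,
                     min_tier: int = 4) -> list[str]:
    """Return IDs of patients in tier >= min_tier whose last colonoscopy
    is older than the interval allowed for their tier."""
    flagged = []
    for patient_id, last_date in last_colonoscopy.items():
        patient_tier = tier[patient_id]
        if patient_tier < min_tier:
            continue  # only higher-risk tiers are checked
        allowed = timedelta(days=max_interval_days[patient_tier])
        if today - last_date > allowed:
            flagged.append(patient_id)
    return sorted(flagged)


# Placeholder data and intervals, for illustration only.
last_seen = {"a": date(2024, 1, 1), "b": date(2025, 6, 1), "c": date(2020, 1, 1)}
tiers = {"a": 5, "b": 5, "c": 1}
intervals = {4: 365, 5: 180}
print(overdue_patients(last_seen, tiers, intervals, today=date(2025, 9, 1)))  # prints "['a']"
```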

Limitations and next steps

The study was conducted entirely within the VA healthcare system, which predominantly serves male veterans. That population may not reflect the full range of UC patients seen in community gastroenterology practices, academic medical centers, or non-VA settings. The researchers plan to validate the pipeline in outside patient populations.

The current model also relies on the four established clinical risk factors. Genetic variants - including mutations in the adenomatous polyposis coli gene and microsatellite instability markers - are known to influence cancer progression in UC patients but were not incorporated into this version of the tool. Adding genomic data is a stated next step.

"We know that genomics play a big part in driving cancer progression," Curtius said.

What the study does not address is whether giving clinicians and patients AI-generated risk scores actually changes clinical decisions in ways that improve outcomes. That question requires a prospective study with randomized assignment or, at minimum, a controlled comparison. The current work establishes that the tool can accurately extract risk factors and generate predictions; whether deploying it changes patient outcomes at scale remains to be tested.

Source: University of California San Diego. Study published in Clinical Gastroenterology and Hepatology, February 17, 2026. Contact: Susanne Bard, sbard@ucsd.edu, (202) 441-8976.