Medicine 2026-03-04

Machine learning spots HIV's immune signature with near-perfect accuracy, but the outliers tell the real story

A York University-led study used random forest models to classify vaccine responses in people with and without HIV, revealing individuals who defied their expected immune profiles.

York University

Two people in the HIV-positive group looked, immunologically speaking, like they did not have HIV at all. No matter how the researchers shuffled the data or swapped out biomarkers, the algorithm could not tell them apart from healthy controls. Their vaccine-induced immune responses were, by every measurable standard, indistinguishable from people without the virus.

Then there was the person in the healthy control group whose immune profile looked like someone living with HIV. No diagnosis. No known condition. Just a set of biomarkers that flagged something the clinical record had not yet caught.

These outliers are the most interesting part of a new study led by York University, published as the cover article in the journal Patterns. The main finding, that machine learning can classify HIV status from vaccine-elicited immune responses with near-perfect accuracy, is impressive but perhaps expected. The exceptions, the people who break the pattern, are where the science gets genuinely useful.

Saliva antibodies and the mucosal immune gap

Lead author Chapin Korosec, who conducted the work as a postdoctoral fellow under York University professor Jane Heffernan, used data from people with and without HIV who received up to five doses of COVID-19 vaccine over 100 weeks. All HIV-positive participants were from the Greater Toronto Area and had their virus controlled with antiretroviral therapy.

The team applied a random forest algorithm, a type of machine learning that builds many decision trees and aggregates their predictions, to analyze 64 immune biomarkers triggered by vaccination. The model identified saliva-based antibodies, particularly a class called IgA, combined with white blood cell counts, as the signature difference between the two groups.
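For readers who want to see the "many trees, one vote" idea in concrete form, here is a minimal sketch in plain Python. It is not the study's model: the data are synthetic, the four markers are hypothetical (only marker 0 differs between the two made-up groups), and each "tree" is a single-split stump to keep the example short. The function names `best_stump`, `train_forest`, `predict`, and `marker_usage` are illustrative only.

```python
import random

def best_stump(rows, labels, features):
    """Return the one-split rule (feature, threshold, left, right) with fewest errors."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if r[f] <= threshold else right) != y
                             for r, y in zip(rows, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, left, right)
    return best[1:]

def train_forest(rows, labels, n_trees=50, n_features=3, seed=0):
    """Fit each stump on a bootstrap resample and a random subset of markers."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]           # sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)  # random marker subset
        forest.append(best_stump([rows[i] for i in idx],
                                 [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote across all stumps."""
    votes = sum(left if row[f] <= t else right for f, t, left, right in forest)
    return int(votes * 2 > len(forest))

def marker_usage(forest, n_markers):
    """Count how often each marker was chosen as the split (a crude importance proxy)."""
    counts = [0] * n_markers
    for f, _, _, _ in forest:
        counts[f] += 1
    return counts

# Synthetic data (assumption): marker 0 separates the groups, markers 1-3 are noise.
data_rng = random.Random(7)
rows = [[data_rng.uniform(0, 1)] + [data_rng.gauss(0, 1) for _ in range(3)]
        for _ in range(20)]
rows += [[data_rng.uniform(2, 3)] + [data_rng.gauss(0, 1) for _ in range(3)]
         for _ in range(20)]
labels = [0] * 20 + [1] * 20

forest = train_forest(rows, labels)
accuracy = sum(predict(forest, r) == y for r, y in zip(rows, labels)) / len(rows)
usage = marker_usage(forest, 4)
print("training accuracy:", accuracy)
print("times each marker was chosen:", usage)
```

In this toy setup the informative marker is picked far more often than the noise markers, which is the same logic, in miniature, by which a forest can point to a handful of biomarkers as the distinguishing signature.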

That finding aligns with a growing body of research showing that HIV alters mucosal immunity in ways that persist even when the virus is well controlled by medication. The mucosal immune system, which lines the respiratory and digestive tracts, is the body's first line of defense against airborne pathogens. If vaccination produces weaker mucosal responses in people with HIV, that has direct implications for how well those vaccines actually protect them.

Virtual patients and the limits of real data

One challenge the researchers faced was that longitudinal immune data, measurements taken repeatedly over time from the same individuals, are inherently difficult to model. The dynamics are complex, and the data often cannot uniquely resolve what is happening inside any one person's immune system.

To work around this, the team used the patterns learned by the machine learning model to generate "virtual patients," synthetic immune profiles that capture the statistical structure of how immune responses differ between groups. This approach let them explore immune dynamics at a scale and resolution that the real dataset alone could not support.
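The article does not describe how the virtual patients were actually generated, so the following is only a rough illustration of the general idea: summarize each group's observed profiles statistically, then draw as many synthetic profiles as needed from that summary. The sketch assumes each marker is independent and Gaussian within a group, which is a simplification chosen for brevity; the group names, marker values, and the helpers `fit_group` and `sample_virtual_patients` are all hypothetical.

```python
import random
import statistics

def fit_group(profiles):
    """Summarize a group as one (mean, standard deviation) pair per marker."""
    columns = list(zip(*profiles))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def sample_virtual_patients(params, n, rng):
    """Draw n synthetic profiles, one independent Gaussian per marker (assumption)."""
    return [[rng.gauss(mu, sd) for mu, sd in params] for _ in range(n)]

rng = random.Random(1)

# Small synthetic "observed" cohorts with two hypothetical markers each.
observed = {
    "group_a": [[rng.gauss(1.0, 0.2), rng.gauss(5.0, 1.0)] for _ in range(15)],
    "group_b": [[rng.gauss(2.0, 0.2), rng.gauss(5.0, 1.0)] for _ in range(15)],
}

# A much larger virtual cohort per group, drawn from the fitted summaries.
virtual = {name: sample_virtual_patients(fit_group(profiles), 1000, rng)
           for name, profiles in observed.items()}

for name, patients in virtual.items():
    mean_marker_0 = statistics.mean(p[0] for p in patients)
    print(name, "virtual patients:", len(patients),
          "mean of marker 0:", round(mean_marker_0, 2))
```

The point of the exercise is scale: fifteen real profiles per group become a thousand synthetic ones that preserve the between-group difference, at the cost of whatever structure the chosen statistical summary leaves out.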

Heffernan describes the broader challenge bluntly. The immune response is deeply complicated. A molecule that inhibits one arm of the immune system in one context can activate it in another. Individual variability is enormous, even among people with the same clinical status. The virtual patient approach offers a way to navigate that complexity without pretending it does not exist.

The ones who broke the model

The near-perfect classification rate makes the exceptions all the more striking. The two HIV-positive individuals whose immune responses were indistinguishable from controls suggest that, at least in terms of vaccination response, antiretroviral therapy had essentially restored their immune function. Understanding what sets these individuals apart, whether it is genetics, the timing of their treatment, or something else entirely, could inform how vaccines are tailored for immunocompromised populations.

The healthy control whose markers mimicked HIV raises a different and potentially more urgent question. It may indicate underlying immune dysfunction that has not yet surfaced clinically. If machine learning can flag such individuals before symptoms appear, the implications for early diagnosis extend well beyond HIV.
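One way to picture how persistent outliers like these might be surfaced is to reclassify every participant many times, each time holding that person out and using a different random subset of markers, and then flag anyone who is misclassified every single time. The sketch below does this on synthetic data with a deliberately planted outlier. It uses a simple nearest-centroid classifier as a stand-in rather than a random forest, and the function `misclassification_rates` and all values are assumptions made for illustration, not the study's procedure.

```python
import random

def centroid(rows, cols):
    """Mean value of each selected marker across a set of profiles."""
    return [sum(r[c] for r in rows) / len(rows) for c in cols]

def sq_dist(row, center, cols):
    """Squared distance between a profile and a centroid on the selected markers."""
    return sum((row[c] - m) ** 2 for c, m in zip(cols, center))

def misclassification_rates(rows, labels, n_repeats=200, subset_size=2, seed=0):
    """Leave-one-out nearest-centroid classification over random marker subsets."""
    rng = random.Random(seed)
    n, n_markers = len(rows), len(rows[0])
    errors = [0] * n
    for _ in range(n_repeats):
        cols = rng.sample(range(n_markers), subset_size)  # swap out markers
        for i in range(n):
            centers = {}
            for group in (0, 1):
                members = [rows[j] for j in range(n)
                           if j != i and labels[j] == group]  # hold person i out
                centers[group] = centroid(members, cols)
            predicted = min(centers,
                            key=lambda g: sq_dist(rows[i], centers[g], cols))
            errors[i] += predicted != labels[i]
    return [e / n_repeats for e in errors]

# Synthetic cohort (assumption): three markers, two well-separated groups.
rng = random.Random(42)
rows = [[rng.uniform(0, 1) for _ in range(3)] for _ in range(10)]   # group 0
rows += [[rng.uniform(2, 3) for _ in range(3)] for _ in range(10)]  # group 1
labels = [0] * 10 + [1] * 10

# Planted outlier: labeled group 1, but with a group-0-like profile.
rows.append([rng.uniform(0, 1) for _ in range(3)])
labels.append(1)

rates = misclassification_rates(rows, labels)
flagged = [i for i, rate in enumerate(rates) if rate == 1.0]
print("always misclassified:", flagged)
```

An individual who lands on the wrong side regardless of which markers are used is unlikely to be a quirk of one model fit, which is what makes such cases worth following up.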

What this is not, and what comes next

This is not a diagnostic tool for HIV. The study population was small, the participants were all on effective antiretroviral therapy, and the immune signatures were specific to COVID-19 vaccine responses. Whether similar patterns hold for other vaccines or in untreated populations remains untested.

The study also cannot explain why the outliers exist. It identifies them, characterizes them, and raises hypotheses, but the mechanistic work of understanding what makes some immune systems defy their expected category is still ahead.

What the work does provide is a framework. By combining machine learning classification with mechanistic modeling and virtual patient generation, the researchers have built a pipeline that could be applied to other immunocompromised populations, other vaccines, and ultimately to the design of personalized vaccination strategies that account for the messy reality of individual immune variation.

Source: Korosec, C. et al. Patterns, 2026 (cover article, print March 13). York University, in collaboration with the National Research Council of Canada, Pennsylvania State University, University of Toronto, and St. Michael's Hospital. Supported by NRC-Fields Mathematical Sciences Collaboration Centre, NSERC, and AI4PH.