PRESS-NEWS.org - Press Release Distribution

Widely used machine learning models reproduce dataset bias in Rice study

High-income communities overrepresented in relevant datasets for immunotherapy research

2024-02-16
(Press-News.org) HOUSTON – (Feb. 16, 2024) – Rice University computer science researchers have found bias in machine learning tools that are widely used in immunotherapy research.

Ph.D. students Anja Conev, Romanos Fasoulis and Sarah Hall-Swan, working with computer science faculty members Rodrigo Ferreira and Lydia Kavraki, reviewed publicly available peptide-HLA (pHLA) binding prediction data and found it to be skewed toward higher-income communities. Their paper examines the way that biased data input affects the algorithmic recommendations being used in important immunotherapy research.

Peptide-HLA binding prediction, machine learning and immunotherapy

HLA (human leukocyte antigen) genes, present in all humans, encode proteins that work as part of our immune response. Those proteins bind with protein fragments called peptides in our cells and mark infected cells for the body’s immune system, so it can respond and, ideally, eliminate the threat.

Different people carry slightly different variants of a given gene, called alleles. Current immunotherapy research is exploring ways to identify peptides that bind more effectively with a patient’s HLA alleles.

The end result, eventually, could be custom and highly effective immunotherapies. That is why one of the most critical steps is to accurately predict which peptides will bind with which alleles. The greater the accuracy, the better the potential efficacy of the therapy.

But calculating how effectively a peptide will bind to the HLA allele takes a lot of work, which is why machine learning tools are being used to predict binding. This is where Rice’s team found a problem: The data used to train those models appears to geographically favor higher-income communities.

Why is this an issue? If the models cannot account for genetic data from lower-income communities, future immunotherapies developed for those communities may not be as effective.

“Each and every one of us has different HLAs that they express, and those HLAs vary between different populations,” Fasoulis said. “Given that machine learning is used to identify potential peptide candidates for immunotherapies, if you basically have biased machine models, then those therapeutics won’t work equally for everyone in every population.”

Redefining ‘pan-allele’ binding predictors

Regardless of the application, machine learning models are only as good as the data you feed them. A bias in the data, even an unconscious one, can affect the conclusions made by the algorithm.

Machine learning models currently being used for pHLA binding prediction assert that they can extrapolate for allele data not present in the dataset those models were trained on, calling themselves “pan-allele” or “all-allele.” The Rice team’s findings call that into question.

“What we are trying to show here and kind of debunk is the idea of the ‘pan-allele’ machine learning predictors,” Conev said. “We wanted to see if they really worked for the data that is not in the datasets, which is the data from lower-income populations.”

Fasoulis’ and Conev’s group tested publicly available data on pHLA binding prediction, and their findings supported their hypothesis that a bias in the data was creating an accompanying bias in the algorithm. The team hopes that by bringing this discrepancy to the attention of the research community, a truly pan-allele method of predicting pHLA binding can be developed.

Ferreira, faculty advisor and paper co-author, explained that the problem of bias in machine learning can’t be addressed unless researchers think about their data in a social context. From a certain perspective, datasets may appear as simply “incomplete,” but making connections between what is or what is not represented in the dataset and underlying historical and economic factors affecting the populations from which data was collected is key to identifying bias.

“Researchers using machine learning models sometimes innocently assume that these models may appropriately represent a global population,” Ferreira said, “but our research points to the significance of when this is not the case.” He added that “even though the databases we studied contain information from people in multiple regions of the world, that does not make them universal. What our research found was a correlation between the socioeconomic standing of certain populations and how well they were represented in the databases or not.”

Professor Kavraki echoed this sentiment, emphasizing how important it is that tools used in clinical work be accurate and honest about any shortcomings they may have.

“Our study of pHLA binding is in the context of personalized immunotherapies for cancer — a project done in collaboration with MD Anderson,” Kavraki said. “The tools developed eventually make their way to clinical pipelines. We need to understand the biases that may exist in these tools. Our work also aims to alert the research community on the difficulties of obtaining unbiased datasets.”

Conev noted that, although the data was biased, the fact that it was publicly available for her team to review was a good start. The team hopes its findings will steer new research in a positive direction, one that includes and helps people across demographic lines.

Ferreira is an assistant teaching professor of computer science. Kavraki is the Noah Harding Professor of Computer Science, a professor of bioengineering, electrical and computer engineering and director of the Ken Kennedy Institute for Information Technology.

The research was supported by the National Institutes of Health (U01CA258512) and Rice University.

-30-


This release was authored by John Bogna and can be found online at news.rice.edu.

Follow Rice News and Media Relations via Twitter @RiceUNews.

Peer-reviewed paper:

HLAEquity: Examining biases in pan-allele peptide-HLA binding predictors | iScience | DOI: 10.1016/j.isci.2023.108613

Authors: Anja Conev, Romanos Fasoulis, Sarah Hall-Swan, Rodrigo Ferreira and Lydia Kavraki

https://www.cell.com/iscience/fulltext/S2589-0042(23)02690-1?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS2589004223026901%3Fshowall%3Dtrue

Image downloads:

https://news-network.rice.edu/news/files/2024/02/pHLA-1-76111b2f602ee562.jpg
CAPTION: Lydia Kavraki (top row, left), Rodrigo Ferreira (top row, right), Romanos Fasoulis (bottom row, from left), Sarah Hall-Swan and Anja Conev (Rice University)

Links:

Department of Computer Science: https://csweb.rice.edu/
Department of Bioengineering: https://bioengineering.rice.edu/
Department of Electrical and Computer Engineering: https://eceweb.rice.edu/
Department of Mechanical Engineering: https://mech.rice.edu/
George R. Brown School of Engineering: https://engineering.rice.edu/
The Ken Kennedy Institute: https://kenkennedy.rice.edu/
Kavraki Lab: https://www.kavrakilab.org/

About Rice:

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation’s top 20 universities by U.S. News & World Report. Rice has highly respected schools of architecture, business, continuing studies, engineering, humanities, music, natural sciences and social sciences and is home to the Baker Institute for Public Policy. With 4,574 undergraduates and 3,982 graduate students, Rice’s undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction, No. 2 for best-run colleges and No. 12 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger’s Personal Finance.

END

