
Artificial intelligence models to analyze cancer images take shortcuts that introduce bias

New study of artificial intelligence tools that analyze tumor images shows how they can make inaccurate predictions based on the institution that submitted the image

Artificial intelligence tools and deep learning models are powerful tools in cancer treatment. They can be used to analyze digital images of tumor biopsy samples, helping physicians quickly classify the type of cancer, predict prognosis and guide a course of treatment for the patient. However, unless these algorithms are properly calibrated, they can sometimes make inaccurate or biased predictions.

A new study led by researchers from the University of Chicago shows that deep learning models trained on large sets of cancer genetic and tissue histology data can easily identify the institution that submitted the images. The models, which use machine learning methods to "teach" themselves how to recognize certain cancer signatures, end up using the submitting site as a shortcut for predicting patient outcomes, lumping patients together with others from the same location instead of relying on each patient's individual biology. This in turn may lead to bias and missed opportunities for treatment in patients from racial or ethnic minority groups, who may be more likely to be represented in certain medical centers and already struggle with access to care.

"We identified a glaring hole in the current methodology for deep learning model development which makes certain regions and patient populations more susceptible to be included in inaccurate algorithmic predictions," said Alexander Pearson, MD, PhD, Assistant Professor of Medicine at UChicago Medicine and co-senior author. The study was published July 20 in Nature Communications.

One of the first steps in treatment for a cancer patient is taking a biopsy, or small tissue sample of a tumor. A very thin slice of the tumor is affixed to a glass slide, which is stained with multicolored dyes for review by a pathologist to make a diagnosis. Digital images can then be created for storage and remote analysis by using a scanning microscope. While these steps are mostly standard across pathology labs, minor variations in the color or amount of stain, in tissue processing techniques and in the imaging equipment can create unique signatures, like tags, on each image. These location-specific signatures aren't visible to the naked eye, but are easily detected by powerful deep learning algorithms.

These algorithms have the potential to be a valuable tool for allowing physicians to quickly analyze a tumor and guide treatment options, but the introduction of this kind of bias means that the models aren't always basing their analysis on the biological signatures they see in the images, but rather on the image artifacts generated by differences between submitting sites.

Pearson and his colleagues studied the performance of deep learning models trained on data from the Cancer Genome Atlas, one of the largest repositories of cancer genetic and tissue image data. These models can predict survival rates, gene expression patterns, mutations, and more from the tissue histology, but the frequency of these patient characteristics varies widely depending on which institutions submitted the images, and the model often defaults to the "easiest" way to distinguish between samples - in this case, the submitting site.

For example, if Hospital A serves mostly affluent patients with more resources and better access to care, the images submitted from that hospital will generally indicate better patient outcomes and survival rates. If Hospital B serves a more disadvantaged population that struggles with access to quality care, the images that site submitted will generally predict worse outcomes.

The research team found that once the models identified which institution submitted the images, they tended to use that as a stand-in for other characteristics of the image, including ancestry. In other words, if the staining or imaging techniques made a slide look like it was submitted by Hospital A, the models would predict better outcomes, whereas they would predict worse outcomes if it looked like an image from Hospital B. Conversely, if all patients in Hospital B had biological characteristics based on genetics that indicated a worse prognosis, the algorithm would link the worse outcomes to Hospital B's staining patterns instead of things it saw in the tissue.
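The Hospital A and Hospital B scenario can be illustrated with a toy calculation. The sketch below is not from the study: the patient counts are invented for illustration, and the "model" is a hypothetical predictor that looks only at the submitting site and never at the tissue. It shows how such a shortcut can still score well when outcome rates differ between sites.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (invented numbers, not study data):
# (submitting_site, patient_had_good_outcome)
records = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 3 + [("B", False)] * 7
)

# Count outcomes per site and keep the most common outcome for each site.
outcomes_by_site = defaultdict(Counter)
for site, good in records:
    outcomes_by_site[site][good] += 1
majority = {
    site: counts.most_common(1)[0][0]
    for site, counts in outcomes_by_site.items()
}

# A "shortcut" predictor: guess the site's majority outcome for every patient.
correct = sum(majority[site] == good for site, good in records)
accuracy = correct / len(records)

# Baseline that ignores the site: always guess the overall majority outcome.
overall = Counter(good for _, good in records)
baseline = overall.most_common(1)[0][1] / len(records)

print(f"site-only accuracy: {accuracy:.2f}")
print(f"site-blind baseline: {baseline:.2f}")
```

With these made-up counts, the site-only guess is right for 15 of 20 patients, compared with 11 of 20 for the site-blind baseline, even though it uses no biological information at all.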

"Algorithms are designed to find a signal to differentiate between images, and it does so lazily by identifying the site," Pearson said. "We actually want to understand what biology within a tumor is more likely to predispose resistance to treatment or early metastatic disease, so we have to disentangle that site-specific digital histology signature from the true biological signal."

The key to avoiding this kind of bias is to carefully consider the data used to train the models. Developers can make sure that different disease outcomes are distributed evenly across all sites used in the training data, or, when the distribution of outcomes is unequal, isolate a certain site while training or testing the model. The result will be more accurate tools that can get physicians the information they need to quickly diagnose and plan treatments for cancer patients.
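One way to picture isolating sites during training and testing is a split in which every image from a given site lands in the same fold, so a model is never tested on a site it saw during training. The sketch below is a simple greedy illustration under that assumption, not the procedure used in the study; the function name `site_held_out_folds` and the sample site labels are hypothetical.

```python
from collections import defaultdict

def site_held_out_folds(sites, n_folds):
    """Assign sample indices to folds so that no site spans two folds."""
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)

    # Place the largest sites first, each into the currently smallest fold,
    # so that fold sizes stay roughly balanced.
    folds = [[] for _ in range(n_folds)]
    for site in sorted(by_site, key=lambda s: (-len(by_site[s]), s)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_site[site])
    return folds

# Hypothetical submitting site for each of ten slide images.
sites = ["A", "A", "A", "B", "B", "C", "C", "C", "C", "D"]
folds = site_held_out_folds(sites, n_folds=2)

for held_out in folds:
    train = [i for i in range(len(sites)) if i not in held_out]
    test_sites = {sites[i] for i in held_out}
    train_sites = {sites[i] for i in train}
    # The held-out sites never appear in the training portion.
    assert test_sites.isdisjoint(train_sites)
    print(sorted(test_sites), "held out; trained on", sorted(train_sites))
```

This sketch balances only the number of images per fold. Balancing the distribution of outcomes across folds as well, as the paragraph above describes, would need an additional step that is not shown here.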

"The promise of artificial intelligence is the ability to bring accurate and rapid precision health to more people," Pearson said. "In order to meet the needs of the disenfranchised members of our society, however, we have to be able to develop algorithms which are competent and make relevant predictions for everyone."


The study, "The Impact of Site-Specific Digital Histology Signatures on Deep Learning Model Accuracy and Bias," was supported by the National Institutes of Health, the National Cancer Institute, the Adenoid Cystic Carcinoma Research Foundation, the Cancer Research Foundation, and the American Cancer Society. Additional authors include Frederick M. Howard, James Dolezal, Sara Kochanny, Jefree Schulte, Heather Chen, Dezheng Huo, Rita Nanda, Olufunmilayo I. Olopade, Nicole Cipriani, and Robert L. Grossman from the University of Chicago; Lara Heij from University Hospital RWTH Aachen, Germany; and Jakob N. Kather from University Hospital RWTH Aachen, Germany, University of Leeds, United Kingdom, and University Hospital Heidelberg, Germany.

About the University of Chicago Medicine & Biological Sciences

The University of Chicago Medicine, with a history dating back to 1927, is one of the nation's leading academic health systems. It unites the missions of the University of Chicago Medical Center, Pritzker School of Medicine and the Biological Sciences Division. Twelve Nobel Prize winners in physiology or medicine have been affiliated with the University of Chicago Medicine. Its main Hyde Park campus is home to the Center for Care and Discovery, Bernard Mitchell Hospital, Comer Children's Hospital and the Duchossois Center for Advanced Medicine. It also has ambulatory facilities in Orland Park, South Loop and River East as well as affiliations and partnerships that create a regional network of care. UChicago Medicine offers a full range of specialty-care services for adults and children through more than 40 institutes and centers including an NCI-designated Comprehensive Cancer Center. Together with Harvey-based Ingalls Memorial, UChicago Medicine has 1,296 licensed beds, nearly 1,300 attending physicians, over 2,800 nurses and about 970 residents and fellows.

Visit UChicago Medicine's health and science news blog at Twitter @UChicagoMed



