How do neural networks learn? A mathematical formula explains how they detect relevant patterns

The insights, published in the journal Science, can also be used to make other types of machine learning architectures more effective

2024-03-12

(Press-News.org) Neural networks have been powering breakthroughs in artificial intelligence, including the large language models that are now being used in a wide range of applications, from finance to human resources to healthcare. But these networks remain a black box whose inner workings engineers and scientists struggle to understand. Now, a team led by data and computer scientists at the University of California San Diego has given neural networks the equivalent of an X-ray to uncover how they actually learn.

The researchers found that a formula used in statistical analysis provides a streamlined mathematical description of how neural networks, such as GPT-2, a precursor to ChatGPT, learn relevant patterns in data, known as features. This formula also explains how neural networks use these relevant patterns to make predictions.

“We are trying to understand neural networks from first principles,” said Daniel Beaglehole, a Ph.D. student in the UC San Diego Department of Computer Science and Engineering and co-first author of the study. “With our formula, one can simply interpret which features the network is using to make predictions.”

The team presented their findings in the March 7 issue of the journal Science.

Why does this matter? AI-powered tools are now pervasive in everyday life. Banks use them to approve loans. Hospitals use them to analyze medical data, such as X-rays and MRIs. Companies use them to screen job applicants. But it’s currently difficult to understand the mechanism neural networks use to make decisions, and how biases in the training data might affect those decisions.

“If you don’t understand how neural networks learn, it’s very hard to establish whether neural networks produce reliable, accurate, and appropriate responses,” said Mikhail Belkin, the paper’s corresponding author and a professor at the UC San Diego Halicioglu Data Science Institute. “This is particularly significant given the rapid recent growth of machine learning and neural net technology.”

The study is part of a larger effort in Belkin’s research group to develop a mathematical theory that explains how neural networks work. “Technology has outpaced theory by a huge amount,” he said. “We need to catch up.”

The team also showed that the statistical formula they used to understand how neural networks learn, known as Average Gradient Outer Product (AGOP), could be applied to improve performance and efficiency in other types of machine learning architectures that do not include neural networks.
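The release does not spell the formula out. In its standard form, for a predictor f with a scalar output and n data points x_1, ..., x_n in d dimensions, AGOP is the average of the outer products of the predictor's gradients with respect to its input. The symbols f, x_i, n and d are notation chosen here for illustration, not taken from the release.

```latex
\[
\mathrm{AGOP}(f;\, x_1, \dots, x_n)
  \;=\; \frac{1}{n} \sum_{i=1}^{n} \nabla f(x_i)\, \nabla f(x_i)^{\top}
  \;\in\; \mathbb{R}^{d \times d}
\]
```

Input directions along which the predictions change strongly receive large weight in this matrix, while directions the model ignores receive weight near zero.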

“If we understand the underlying mechanisms that drive neural networks, we should be able to build machine learning models that are simpler, more efficient and more interpretable,” Belkin said. “We hope this will help democratize AI.”

The machine learning systems that Belkin envisions would need less computational power, and therefore less power from the grid, to function. These systems also would be less complex and so easier to understand.

Illustrating the new findings with an example

(Artificial) neural networks are computational tools for learning relationships between data characteristics (e.g., identifying specific objects or faces in an image). One example task is determining whether a person in a new image is wearing glasses. Machine learning approaches this problem by providing the neural network with many example (training) images labeled as images of “a person wearing glasses” or “a person not wearing glasses.” The neural network learns the relationship between images and their labels, and extracts the data patterns, or features, that it needs to focus on to make a determination. One of the reasons AI systems are considered a black box is that it is often difficult to describe mathematically what criteria the systems are actually using to make their predictions, including potential biases. The new work provides a simple mathematical explanation for how the systems are learning these features.

Features are relevant patterns in the data. In the example above, there is a wide range of features that the neural network learns, and then uses, to determine whether a person in a photograph is wearing glasses. One feature it would need to pay attention to for this task is the upper part of the face. Other features could be the eye or the nose area where glasses often rest. The network selectively pays attention to the features that it learns are relevant and then discards the other parts of the image, such as the lower part of the face, the hair and so on.

Feature learning is the ability to recognize relevant patterns in data and then use those patterns to make predictions. In the glasses example, the network learns to pay attention to the upper part of the face. In the new Science paper, the researchers identified a statistical formula that describes how the neural networks are learning features.
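To make the idea concrete, here is a minimal sketch of how the Average Gradient Outer Product (AGOP) can expose which inputs a model relies on. It assumes the standard definition of AGOP (the average, over data points, of the outer product of the model's input gradient with itself). The function `toy_predictor` is a hypothetical stand-in for a trained model, and the gradients are estimated by finite differences rather than by whatever method the authors used.

```python
import numpy as np

def agop(f, X, eps=1e-5):
    """Average gradient outer product of a scalar function f over the rows of X.

    Gradients are approximated with central finite differences (an
    illustrative choice, not taken from the paper).
    """
    n, d = X.shape
    basis = np.eye(d)
    G = np.zeros((d, d))
    for x in X:
        grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                         for e in basis])
        G += np.outer(grad, grad)
    return G / n

def toy_predictor(x):
    # Hypothetical "trained model": it uses only coordinates 0 and 1
    # and ignores coordinates 2 and 3 entirely.
    return np.tanh(2.0 * x[0]) + 0.5 * x[1]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
G = agop(toy_predictor, X)

# The diagonal of G scores each input coordinate; the entries for the
# ignored coordinates 2 and 3 are zero.
print(np.round(np.diag(G), 3))
```

In the glasses example, the coordinates would be pixels, and the large diagonal entries would be expected to concentrate on regions such as the upper part of the face.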

Alternatives to neural network architectures: The researchers went on to show that inserting this formula into computing systems that do not rely on neural networks allowed these systems to learn faster and more efficiently.
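One way such an insertion could work is sketched below. This is a simplified illustration, not the authors' algorithm: it alternates kernel ridge regression with an AGOP-based reweighting of the input features. The Gaussian kernel, the trace rescaling, the hyperparameter values and the name `fit_kernel_agop` are all assumptions made for this example.

```python
import numpy as np

def fit_kernel_agop(X, y, n_iters=3, gamma=0.1, ridge=1e-3):
    """Alternate kernel ridge regression with AGOP-based feature reweighting.

    Returns the learned d x d feature matrix M.
    """
    n, d = X.shape
    M = np.eye(d)                            # start with every feature weighted equally
    diffs = X[:, None, :] - X[None, :, :]    # diffs[i, j] = x_i - x_j, shape (n, n, d)
    for _ in range(n_iters):
        # Gaussian kernel with a learned metric:
        # K[i, j] = exp(-gamma * (x_i - x_j)^T M (x_i - x_j))
        sq = np.einsum("ijk,kl,ijl->ij", diffs, M, diffs)
        K = np.exp(-gamma * sq)
        # Kernel ridge regression: f(x) = sum_j alpha[j] * K(x, x_j)
        alpha = np.linalg.solve(K + ridge * np.eye(n), y)
        # Gradient of f at each training point x_i (M is symmetric):
        # grad f(x_i) = -2 * gamma * M @ sum_j alpha[j] * K[i, j] * (x_i - x_j)
        V = np.einsum("ij,ijk->ik", K * alpha[None, :], diffs)
        grads = -2.0 * gamma * V @ M
        agop = grads.T @ grads / n           # average gradient outer product
        M = agop / np.trace(agop)            # rescaling is an assumption of this sketch
    return M

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0])                          # the target depends on coordinate 0 only
M = fit_kernel_agop(X, y)
print(np.round(np.diag(M), 3))               # relative weight given to each coordinate
```

Because the learned matrix M feeds back into the kernel, each round lets the model measure distances mainly along the directions that the previous round found relevant, with no backpropagation involved.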

“How do I ignore what’s not necessary? Humans are good at this,” said Belkin. “Machines are doing the same thing. Large Language Models, for example, are implementing this ‘selective paying attention’ and we haven’t known how they do it. In our Science paper, we present a mechanism explaining at least some of how the neural nets are ‘selectively paying attention.’”

Study funders included the National Science Foundation and the Simons Foundation for the Collaboration on the Theoretical Foundations of Deep Learning. Belkin is part of The Institute for Learning-enabled Optimization at Scale, or TILOS, which is funded by NSF and led by UC San Diego.

Paper title: Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Adit Radhakrishnan, Harvard School of Engineering and Applied Sciences and Broad Institute of MIT and Harvard

Daniel Beaglehole and Mikhail Belkin, University of California San Diego

Parthe Pandit, IIT Bombay (Pandit did the work for this paper as a postdoctoral researcher at the UC San Diego Halicioglu Data Science Institute)

