
How do neural networks learn? A mathematical formula explains how they detect relevant patterns

The insights, published in the journal Science, can also be used to make other types of machine learning architectures more effective

2024-03-12
(Press-News.org) Neural networks have been powering breakthroughs in artificial intelligence, including the large language models that are now being used in a wide range of applications, from finance to human resources to healthcare. But these networks remain a black box whose inner workings engineers and scientists struggle to understand. Now, a team led by data and computer scientists at the University of California San Diego has given neural networks the equivalent of an X-ray to uncover how they actually learn.

The researchers found that a formula used in statistical analysis provides a streamlined mathematical description of how neural networks, such as GPT-2, a precursor to ChatGPT, learn relevant patterns in data, known as features. This formula also explains how neural networks use these relevant patterns to make predictions. 

“We are trying to understand neural networks from first principles,” said Daniel Beaglehole, a Ph.D. student in the UC San Diego Department of Computer Science and Engineering and co-first author of the study. “With our formula, one can simply interpret which features the network is using to make predictions.”  

The team presented their findings in the March 7 issue of the journal Science. 

Why does this matter? AI-powered tools are now pervasive in everyday life. Banks use them to approve loans. Hospitals use them to analyze medical data, such as X-rays and MRIs. Companies use them to screen job applicants. But it is currently difficult to understand the mechanism neural networks use to make decisions, and how biases in the training data might affect those decisions.

“If you don’t understand how neural networks learn, it’s very hard to establish whether neural networks produce reliable, accurate, and appropriate responses,” said Mikhail Belkin, the paper’s corresponding author and a professor at the UC San Diego Halicioglu Data Science Institute. “This is particularly significant given the rapid recent growth of machine learning and neural net technology.”

The study is part of a larger effort in Belkin’s research group to develop a mathematical theory that explains how neural networks work. “Technology has outpaced theory by a huge amount,” he said. “We need to catch up.” 

 

The team also showed that the statistical formula they used to understand how neural networks learn, known as the Average Gradient Outer Product (AGOP), could be applied to improve performance and efficiency in other types of machine learning architectures that do not include neural networks.

 

“If we understand the underlying mechanisms that drive neural networks, we should be able to build machine learning models that are simpler, more efficient and more interpretable,” Belkin said. “We hope this will help democratize AI.”

The machine learning systems that Belkin envisions would need less computational power, and therefore less power from the grid, to function. These systems also would be less complex and so easier to understand. 

Illustrating the new findings with an example

(Artificial) neural networks are computational tools for learning relationships between data characteristics (e.g., identifying specific objects or faces in an image). One example of a task is determining whether a person in a new image is wearing glasses or not. Machine learning approaches this problem by providing the neural network with many example (training) images labeled as images of “a person wearing glasses” or “a person not wearing glasses.” The neural network learns the relationship between images and their labels, and extracts data patterns, or features, that it needs to focus on to make a determination. One of the reasons AI systems are considered a black box is that it is often difficult to describe mathematically what criteria the systems are actually using to make their predictions, including potential biases. The new work provides a simple mathematical explanation for how the systems are learning these features.

Features are relevant patterns in the data. In the example above, there is a wide range of features that the neural network learns, and then uses, to determine whether a person in a photograph is wearing glasses or not. One feature it would need to pay attention to for this task is the upper part of the face. Other features could be the eye or the nose area where glasses often rest. The network selectively pays attention to the features that it learns are relevant and then discards the other parts of the image, such as the lower part of the face, the hair and so on.

Feature learning is the ability to recognize relevant patterns in data and then use those patterns to make predictions. In the glasses example, the network learns to pay attention to the upper part of the face. In the new Science paper, the researchers identified a statistical formula that describes how the neural networks are learning features. 
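To make the idea concrete, here is a minimal sketch of the Average Gradient Outer Product named earlier in this release. As the name suggests, it averages the outer product of a model's gradient with itself over a set of inputs: (1/n) Σ ∇f(x_i) ∇f(x_i)^T. The function `toy_model`, the finite-difference gradients and the random data are assumptions made purely for illustration; they are not taken from the paper. The toy function depends only on its first two inputs, so the diagonal of the resulting matrix should single out those two coordinates, much as a network would single out the upper part of the face.

```python
import numpy as np

def agop(f, X, eps=1e-5):
    """Average gradient outer product of a scalar function f over the rows of X.

    Gradients are approximated with central finite differences
    (an assumption of this sketch; any gradient method would do).
    """
    n, d = X.shape
    grads = np.zeros((n, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        grads[:, j] = (f(X + step) - f(X - step)) / (2 * eps)
    # Average of the outer products grad f(x_i) grad f(x_i)^T
    return grads.T @ grads / n

def toy_model(X):
    # Hypothetical stand-in for a trained predictor:
    # only input coordinates 0 and 1 influence the output.
    return np.sin(X[:, 0]) + 2.0 * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))

M = agop(toy_model, X)
importance = np.diag(M)  # large entries mark inputs the model is sensitive to
print(np.round(importance, 3))
```

In this toy setting the last three diagonal entries come out as zero because the function never reads those inputs, which is the sense in which the matrix shows which features are being used.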

Alternatives to neural network architectures: The researchers went on to show that inserting this formula into computing systems that do not rely on neural networks allowed these systems to learn faster and more efficiently.
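This release does not describe how the formula is inserted into such systems. One way a scheme of this kind could look, offered only as a sketch under stated assumptions, is to alternate between fitting a kernel ridge regression model and replacing the kernel's distance metric with the AGOP of the fitted predictor. The Gaussian kernel, the `bandwidth` and `ridge` values, the trace rescaling, the number of rounds and the toy data are all choices made for this illustration, not details from the paper.

```python
import numpy as np

def gaussian_kernel(A, B, M, bandwidth):
    """K[i, j] = exp(-(a_i - b_j)^T M (a_i - b_j) / (2 * bandwidth**2))."""
    AM = A @ M
    sq = ((AM * A).sum(axis=1)[:, None]
          - 2.0 * AM @ B.T
          + ((B @ M) * B).sum(axis=1)[None, :])
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def learn_feature_matrix(X, y, n_rounds=3, bandwidth=3.0, ridge=1e-3):
    """Alternate kernel ridge regression with an AGOP update of the metric M."""
    n, d = X.shape
    M = np.eye(d)  # start with no preference among input coordinates
    for _ in range(n_rounds):
        # 1) Fit kernel ridge regression under the current metric M.
        K = gaussian_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + ridge * np.eye(n), y)
        # 2) Gradient of f(x) = sum_j alpha_j K(x, x_j) at each training point:
        #    grad f(x_i) = -(1 / bandwidth^2) * M @ sum_j alpha_j K_ij (x_i - x_j)
        W = K * alpha[None, :]
        diffs = W.sum(axis=1)[:, None] * X - W @ X
        grads = -(diffs @ M) / bandwidth**2  # valid because M is symmetric
        # 3) Replace M with the AGOP, rescaled so that trace(M) = d
        #    (the rescaling is an assumption that keeps the kernel scale stable).
        M = grads.T @ grads / n
        M *= d / np.trace(M)
    return M

# Toy data: 10 inputs, but the target depends only on the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.sin(X[:, 0]) + 2.0 * X[:, 1]

M = learn_feature_matrix(X, y)
weights = np.diag(M)
print(np.round(weights, 3))
```

The design choice worth noting is that no backpropagation is involved: each round solves one linear system and computes gradients in closed form, and the learned matrix `M` is what tells the kernel which input directions to pay attention to.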

“How do I ignore what’s not necessary? Humans are good at this,” said Belkin. “Machines are doing the same thing. Large Language Models, for example, are implementing this ‘selective paying attention’ and we haven’t known how they do it. In our Science paper, we present a mechanism explaining at least some of how the neural nets are ‘selectively paying attention.’” 

Study funders included the National Science Foundation and the Simons Foundation for the Collaboration on the Theoretical Foundations of Deep Learning. Belkin is part of The Institute for Learning-enabled Optimization at Scale, or TILOS, which is funded by NSF and led by UC San Diego.

Paper title: Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Adit Radhakrishnan, Harvard School of Engineering and Applied Sciences and Broad Institute of MIT and Harvard

Daniel Beaglehole and Mikhail Belkin, University of California San Diego

Parthe Pandit, IIT Bombay (Pandit did the work for this paper as a postdoctoral researcher at the UC San Diego Halicioglu Data Science Institute)

 

 


 
