Widespread machine learning methods behind ‘link prediction’ are performing very poorly

New research indicates that methods used to test the accuracy of link prediction are flawed, and that link prediction does not work as well as common benchmarking tests currently indicate

2024-02-12
(Press-News.org) As you scroll through any social media feed, you are likely to be prompted to follow or friend another person, expanding your personal network and contributing to the growth of the app itself. The person suggested to you is a result of link prediction: a widespread machine learning (ML) task that evaluates the links in a network — your friends and everyone else’s — and tries to predict what the next links will be.

Beyond being the engine that drives social media expansion, link prediction is also used in a wide range of scientific research, such as predicting the interaction between genes and proteins, and is used by researchers as a benchmark for testing the performance of new ML algorithms. 

New research from UC Santa Cruz Professor of Computer Science and Engineering C. “Sesh” Seshadhri published in the journal Proceedings of the National Academy of Sciences establishes that the metric used to measure link prediction performance is missing crucial information, and link prediction tasks are performing significantly worse than popular literature indicates. 

Seshadhri and his coauthor Nicolas Menand, a former UCSC undergraduate and master’s student and a current Ph.D. candidate at the University of Pennsylvania, recommend that ML researchers stop using AUC, the standard metric for measuring link prediction performance, and they introduce a new, more comprehensive metric for this problem. The research has implications for the trustworthiness of decision-making in ML.

AUC’s ineffectiveness

Seshadhri, who works in the fields of theoretical computer science and data mining and is currently an Amazon scholar, has done previous research on ML algorithms for networks. In this previous work he found certain mathematical limitations that were negatively impacting algorithm performance, and in an effort to better understand the mathematical limitations in context, dove deeper into link prediction due to its importance as a testbed problem for ML algorithms. 

“The reason why we got interested is because link prediction is one of these really important scientific tasks which is used to benchmark a lot of machine learning algorithms,” Seshadhri said. “What we were seeing was that the performance seemed to be really good… but we had an inkling that there seemed to be something off with this measurement. It feels like if you measured things in a different way, maybe you wouldn’t see such great results.”

Link prediction relies on the ML algorithm’s ability to compute low dimensional vector embeddings, the process by which the algorithm represents each person within a network as a mathematical vector in space. All of the machine learning occurs as mathematical manipulations of those vectors.
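To make the idea of embeddings concrete, here is a minimal sketch in Python. It assumes that candidate links are scored by the dot product of two people's vectors, which is one common choice and not something this release specifies. The names, vector values, and the helper `rank_candidate_links` are all hypothetical.

```python
from itertools import combinations

# Hypothetical 2-dimensional embeddings for four people (illustrative values only).
embeddings = {
    "ana": (1.0, 0.0),
    "ben": (0.8, 0.2),
    "cho": (0.0, 1.0),
    "dev": (0.1, 0.9),
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_candidate_links(emb, existing):
    """Score every pair that is not already linked, highest score first.

    Assumption: a larger dot product means a more likely future link.
    """
    scored = []
    for a, b in combinations(sorted(emb), 2):
        if frozenset((a, b)) in existing:
            continue  # skip pairs that are already connected
        scored.append((dot(emb[a], emb[b]), a, b))
    return sorted(scored, reverse=True)


existing_links = {frozenset(("ana", "cho"))}
ranking = rank_candidate_links(embeddings, existing_links)
for score, a, b in ranking:
    print(f"{a}-{b}: {score:.2f}")
```

In this toy example, people whose vectors point in similar directions end up at the top of the list, which is the sense in which the prediction happens entirely through operations on the vectors.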

AUC, which stands for “area under curve” and is the most common metric for measuring link prediction, gives ML algorithms a score from zero to one based on the algorithm's performance. 
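As a rough illustration of what such a score means, the sketch below computes AUC in its usual ROC form, which is assumed here because the release does not say which curve is meant. In that form, AUC equals the fraction of (true link, non-link) pairs in which the true link receives the higher score, with ties counted as half. The score lists are made-up values.

```python
def auc(true_link_scores, non_link_scores):
    """ROC AUC as a pairwise comparison.

    Returns the fraction of (true link, non-link) pairs where the true
    link is scored higher; a tie counts as half a win.
    """
    wins = 0.0
    for p in true_link_scores:
        for n in non_link_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(true_link_scores) * len(non_link_scores))


# Made-up scores assigned by some model to held-out true links and to non-links.
true_scores = [0.9, 0.8, 0.4]
false_scores = [0.7, 0.3, 0.2, 0.1]

score = auc(true_scores, false_scores)
print(f"AUC = {score:.3f}")  # 11 of the 12 pairs are ordered correctly
```

A score of 1 means every true link outranks every non-link, and a score of 0.5 is what random guessing would produce. The comparison is global: it pools all pairs in the network into one number.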

In their research, the authors discovered that there are fundamental mathematical limitations to using low dimensional embeddings for link prediction, and that AUC cannot measure these limitations. This blind spot led the authors to conclude that AUC does not accurately measure link prediction performance.

Seshadhri said these results call into question the widespread use of low dimensional vector embeddings in the ML field, considering the mathematical limitations that his research has surfaced on their performance. 

Leading methods fall short 

The discovery of AUC’s shortcomings led the researchers to create a new metric to better capture the limitations, which they call VCMPR. They used VCMPR to measure 12 ML algorithms chosen to be representative of the field, including DeepWalk, Node2vec, NetMF, GraphSage, and graph benchmark leader HOP-Rec, and found that link prediction performance was worse when measured with VCMPR than with AUC.

“When we look at the VCMPR scores, we see that the scores of most of the leading methods out there are really poor,” Seshadhri said. “It looks like they're actually not doing a good job when you measure things a different way.” 

The results also showed that not only was performance lower across the board, but the relative rankings changed as well: some algorithms that performed worse than others under AUC performed better than the rest of the cohort under VCMPR, and vice versa.
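A toy example can show how two ways of measuring the same predictions may disagree. The release does not define VCMPR, so the sketch below does not implement it. Instead it compares a global pairwise AUC (assumed to be the ROC form) with a hypothetical per-person check, here called `per_node_precision_at_k`, that asks whether each person's single highest-scored suggestion is a real link. All scores and names are invented for illustration.

```python
from itertools import combinations

# Toy data: six people, two held-out true links, invented scores.
true_links = {("a", "b"), ("c", "d")}
scores = {("a", "b"): 0.5, ("c", "d"): 0.5, ("a", "c"): 0.6, ("b", "d"): 0.6}
# Every other candidate pair gets a low default score of 0.1.
pairs = [(u, v, scores.get((u, v), 0.1), (u, v) in true_links)
         for u, v in combinations("abcdef", 2)]


def global_auc(pairs):
    """Fraction of (true, false) pairs ranked correctly; ties count half."""
    pos = [s for _, _, s, y in pairs if y]
    neg = [s for _, _, s, y in pairs if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_node_precision_at_k(pairs, k):
    """Hypothetical per-person check (not VCMPR): for each person, the
    share of their k highest-scored candidate links that are true."""
    by_node = {}
    for u, v, s, y in pairs:
        by_node.setdefault(u, []).append((s, y))
        by_node.setdefault(v, []).append((s, y))
    result = {}
    for node, items in by_node.items():
        top = sorted(items, key=lambda t: t[0], reverse=True)[:k]
        result[node] = sum(y for _, y in top) / len(top)
    return result


auc_value = global_auc(pairs)
precision = per_node_precision_at_k(pairs, k=1)
print(f"global AUC = {auc_value:.3f}")
print("per-person precision@1 =", precision)
```

In this constructed case the global score is about 0.85, yet no person's top suggestion is a real link, because the many easy low-scoring non-links inflate the pooled comparison. The example is deliberately contrived and says nothing about the size of the effect the authors measured.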

Trustworthiness in machine learning  

Seshadhri suggests that ML researchers use VCMPR to benchmark the link prediction performance of their algorithms, or at the very least stop using AUC as their measure. As metrics are so tightly connected to decision making in ML, using a flawed system to measure performance could lead to flawed decision making about which algorithms to employ in real world ML applications. 

“Metrics are so closely tied to what we decide to deploy in the real world — people need to have some trust in that. If you have the wrong way of measuring, how can you trust the results?” Seshadhri said. “This paper is in some sense cautionary: we have to be more careful about how we do our machine learning experiments, and we need to come up with a richer set of measures.”

In academia, using an accurate metric is crucial to creating progress in the ML field.

“This is in some sense a bit of a conundrum for scientific progress. A new result has to supposedly be better than everything previously, otherwise it's not doing anything new — but that all depends on how you measure it.” 

Beyond machine learning, there are researchers across a wide range of fields who use link prediction and ML to conduct their research, often with profound potential impact. For example, some biologists use link prediction to determine which proteins are likely to interact as a part of drug discovery. These biologists and other researchers outside of ML depend on the ML experts to create trustworthy tools, as they often cannot become ML experts themselves. 

While Seshadhri thinks these results may not be a huge surprise to those deeply involved in the field, he hopes that the larger community of ML researchers, and particularly graduate and Ph.D. students who use the current literature to learn best practices and common wisdom about the field, will take note of these results and take caution in their work. He sees the skeptical view this research presents as somewhat in contrast to a dominant philosophy in ML, which tends to accept a set of metrics and focuses on “pushing the bar” when it comes to progress in the field.

“It's important that we have the skeptical view, are trying to understand deeper, and are constantly asking ourselves ‘Are we measuring things correctly?’”

This research was funded by the National Science Foundation and the Army Research Office.

END

