PRESS-NEWS.org - Press Release Distribution

Widespread machine learning methods behind ‘link prediction’ are performing very poorly

New research indicates that methods used to test the accuracy of link prediction are flawed, and that link prediction does not work as well as common benchmarking tests currently indicate

2024-02-12
(Press-News.org) As you scroll through any social media feed, you are likely to be prompted to follow or friend another person, expanding your personal network and contributing to the growth of the app itself. The person suggested to you is a result of link prediction: a widespread machine learning (ML) task that evaluates the links in a network — your friends and everyone else’s — and tries to predict what the next links will be.

Beyond being the engine that drives social media expansion, link prediction is also used in a wide range of scientific research, such as predicting the interaction between genes and proteins, and is used by researchers as a benchmark for testing the performance of new ML algorithms. 

New research from UC Santa Cruz Professor of Computer Science and Engineering C. “Sesh” Seshadhri, published in the journal Proceedings of the National Academy of Sciences, establishes that the metric used to measure link prediction performance is missing crucial information, and that link prediction methods are performing significantly worse than the popular literature indicates. 

Seshadhri and his coauthor Nicolas Menand, a former UCSC undergraduate and master's student and a current Ph.D. candidate at the University of Pennsylvania, recommend that ML researchers stop using the standard metric for measuring link prediction, known as AUC, and they introduce a new, more comprehensive metric for the problem. The research has implications for the trustworthiness of decision-making in ML.

AUC’s ineffectiveness

Seshadhri, who works in the fields of theoretical computer science and data mining and is currently an Amazon scholar, has done previous research on ML algorithms for networks. In that earlier work, he found certain mathematical limitations that were hurting algorithm performance. To better understand those limitations in context, he dove deeper into link prediction, given its importance as a testbed problem for ML algorithms. 

“The reason why we got interested is because link prediction is one of these really important scientific tasks which is used to benchmark a lot of machine learning algorithms,” Seshadhri said. “What we were seeing was that the performance seemed to be really good… but we had an inkling that there seemed to be something off with this measurement. It feels like if you measured things in a different way, maybe you wouldn’t see such great results.”

Link prediction relies on low-dimensional vector embeddings, the process by which an ML algorithm represents each person within a network as a mathematical vector in space. All of the machine learning occurs as mathematical manipulations of those vectors. 
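As a rough illustration of what working with such vectors can look like, the sketch below places four hypothetical people in a two-dimensional space and scores a possible link by the dot product of the two vectors. The names, the coordinates, and the choice of the dot product as the scoring rule are all assumptions made for this example; the release does not say how any particular algorithm scores a pair.

```python
# Hypothetical 2-dimensional embeddings; real systems learn these from the network.
embeddings = {
    "ana": [1.0, 0.0],
    "ben": [0.9, 0.1],
    "cal": [0.0, 1.0],
    "dee": [0.1, 0.9],
}

def link_score(a, b):
    """Score a candidate link as the dot product of two embedding vectors (an assumed rule)."""
    return sum(x * y for x, y in zip(embeddings[a], embeddings[b]))

def suggest(person, already_linked):
    """Return the not-yet-linked person whose vector scores highest against `person`."""
    candidates = [p for p in embeddings if p != person and p not in already_linked]
    return max(candidates, key=lambda p: link_score(person, p))

# ana is already linked to dee, so the candidates are ben and cal.
suggestion = suggest("ana", already_linked={"dee"})
```

In this toy setup, people whose vectors point in similar directions get high scores, so the suggestion for "ana" is the person whose vector lies closest to hers.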

AUC, which stands for “area under curve” and is the most common metric for measuring link prediction, gives ML algorithms a score from zero to one based on the algorithm's performance. 
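To make that score concrete, here is a minimal sketch that assumes the curve in question is the ROC curve, as is standard. Under that assumption, AUC equals the probability that a randomly chosen true link receives a higher score than a randomly chosen non-link, with ties counted as half. The score lists are invented for illustration.

```python
def auc(true_link_scores, non_link_scores):
    """Fraction of (true link, non-link) pairs in which the true link scores higher.

    Ties count as half a win. This pairwise form equals the area under the ROC curve.
    """
    wins = 0.0
    for pos in true_link_scores:
        for neg in non_link_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(true_link_scores) * len(non_link_scores))

# Hypothetical scores an algorithm assigned to pairs that are, and are not, real links.
true_link_scores = [0.9, 0.8, 0.4]
non_link_scores = [0.7, 0.3, 0.2, 0.1]

value = auc(true_link_scores, non_link_scores)  # 11 of the 12 pairs are ordered correctly
```

A value of 1 means every true link outranks every non-link, and 0.5 is what random scoring would be expected to produce.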

In their research, the authors discovered that there are fundamental mathematical limitations to using low-dimensional embeddings for link prediction, and that AUC cannot measure these limitations. That inability led the authors to conclude that AUC does not accurately measure link prediction performance.

Seshadhri said these results call into question the widespread use of low-dimensional vector embeddings in the ML field, considering the mathematical limitations on their performance that his research has surfaced. 

Leading methods fall short 

The discovery of AUC’s shortcomings led the researchers to create a new metric to better capture the limitations, which they call VCMPR. They used VCMPR to measure 12 ML algorithms chosen to be representative of the field, including algorithms such as DeepWalk, Node2vec, NetMF, GraphSage, and graph benchmark leader HOP-Rec, and found that the link prediction performance was worse using VCMPR as the metric rather than AUC. 

“When we look at the VCMPR scores, we see that the scores of most of the leading methods out there are really poor,” Seshadhri said. “It looks like they're actually not doing a good job when you measure things a different way.” 

The results also showed more than lower performance across the board: some algorithms that scored worse than their peers under AUC scored better than their peers under VCMPR, and vice versa. 
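The release does not spell out how VCMPR is computed, so the sketch below is not VCMPR. It uses a generic per-person precision-at-k, chosen here only as an assumed stand-in, to show how measuring from each person's point of view can give a very different picture from a pooled ranking score. All names and scores are hypothetical.

```python
# Hypothetical scores an algorithm assigns to candidate links for two people.
predicted = {
    "ana": {"ben": 0.9, "cal": 0.8, "dee": 0.2, "eli": 0.1, "fay": 0.1},
    "gus": {"ben": 0.9, "cal": 0.3, "dee": 0.8, "eli": 0.2, "fay": 0.1},
}
# Hypothetical held-out links the algorithm is supposed to recover.
actual = {"ana": {"cal"}, "gus": {"dee"}}

def precision_at_k(scores, true_links, k):
    """Share of one person's k highest-scored candidates that are real links."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for candidate in top_k if candidate in true_links) / k

# Evaluate each person separately, then average across people.
per_person = {p: precision_at_k(predicted[p], actual[p], k=1) for p in predicted}
mean_precision = sum(per_person.values()) / len(per_person)
```

In this toy data each person's true link outranks three of that person's four non-links, which is the kind of ordering a pairwise ranking score rewards, yet the single top suggestion is wrong for both people and the per-person score is 0.0.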

Trustworthiness in machine learning  

Seshadhri suggests that ML researchers use VCMPR to benchmark the link prediction performance of their algorithms, or at the very least stop using AUC as their measure. Because metrics are so tightly connected to decision-making in ML, using a flawed system to measure performance could lead to flawed decisions about which algorithms to employ in real-world ML applications. 

“Metrics are so closely tied to what we decide to deploy in the real world; people need to have some trust in that. If you have the wrong way of measuring, how can you trust the results?” Seshadhri said. “This paper is in some sense cautionary: we have to be more careful about how we do our machine learning experiments, and we need to come up with a richer set of measures.” 

In academia, using an accurate metric is crucial to creating progress in the ML field.

“This is in some sense a bit of a conundrum for scientific progress. A new result has to supposedly be better than everything previously, otherwise it's not doing anything new — but that all depends on how you measure it.” 

Beyond machine learning, there are researchers across a wide range of fields who use link prediction and ML to conduct their research, often with profound potential impact. For example, some biologists use link prediction to determine which proteins are likely to interact as a part of drug discovery. These biologists and other researchers outside of ML depend on the ML experts to create trustworthy tools, as they often cannot become ML experts themselves. 

While Seshadhri thinks these results may not be a huge surprise to those deeply involved in the field, he hopes that the larger community of ML researchers, and particularly graduate and Ph.D. students who use the current literature to learn best practices and common wisdom about the field, will take note of these results and exercise caution in their work. He sees this research, which presents a skeptical view, as somewhat in contrast to a dominant philosophy in ML, which tends to accept a set of metrics and focuses on “pushing the bar” when it comes to progress in the field.

“It's important that we have the skeptical view, are trying to understand deeper, and are constantly asking ourselves ‘Are we measuring things correctly?’”

This research was funded by the National Science Foundation and the Army Research Office.

END

