(Press-News.org) As you scroll through any social media feed, you are likely to be prompted to follow or friend another person, expanding your personal network and contributing to the growth of the app itself. The person suggested to you is a result of link prediction: a widespread machine learning (ML) task that evaluates the links in a network — your friends and everyone else’s — and tries to predict what the next links will be.
Beyond being the engine that drives social media expansion, link prediction is also used in a wide range of scientific research, such as predicting the interaction between genes and proteins, and is used by researchers as a benchmark for testing the performance of new ML algorithms.
New research from UC Santa Cruz Professor of Computer Science and Engineering C. “Sesh” Seshadhri published in the journal Proceedings of the National Academy of Sciences establishes that the metric used to measure link prediction performance is missing crucial information, and link prediction tasks are performing significantly worse than popular literature indicates.
Seshadhri and his coauthor Nicolas Menand, a former UCSC undergraduate and master's student and a current Ph.D. candidate at the University of Pennsylvania, recommend that ML researchers stop using AUC, the standard metric for measuring link prediction, and they introduce a new, more comprehensive metric for the problem. The research has implications for the trustworthiness of decision-making in ML.
AUC’s ineffectiveness
Seshadhri, who works in the fields of theoretical computer science and data mining and is currently an Amazon scholar, has done previous research on ML algorithms for networks. In that work he found certain mathematical limitations that were hurting algorithm performance. To better understand those limitations in context, he looked more closely at link prediction because of its importance as a testbed problem for ML algorithms.
“The reason why we got interested is because link prediction is one of these really important scientific tasks which is used to benchmark a lot of machine learning algorithms,” Seshadhri said. “What we were seeing was that the performance seemed to be really good… but we had an inkling that there seemed to be something off with this measurement. It feels like if you measured things in a different way, maybe you wouldn’t see such great results.”
Link prediction is based on the ML algorithm’s ability to carry out low dimensional vector embeddings, the process by which the algorithm represents the people within a network as a mathematical vector in space. All of the machine learning occurs as mathematical manipulations to those vectors.
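As a rough illustration of what an embedding-based link predictor does, the sketch below gives each person a two-dimensional vector and ranks possible new links by the dot product of the vectors. The names, the vector values, and the choice of dot product as the score are all assumptions made for this example; the study's algorithms learn their vectors from network data and may score links differently.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical 2-dimensional embeddings, hand-picked for illustration only.
embeddings = {
    "alice": [1.0, 0.0],
    "bob": [0.9, 0.1],
    "carol": [0.0, 1.0],
}

def rank_candidates(node, embeddings):
    """Rank every other node as a candidate link for `node`, best first."""
    scores = [
        (other, dot(embeddings[node], vec))
        for other, vec in embeddings.items()
        if other != node
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

print(rank_candidates("alice", embeddings))
```

In this toy setup, "bob" ranks above "carol" as a suggested link for "alice" because their vectors point in nearly the same direction.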
AUC, which stands for “area under curve” and is the most common metric for measuring link prediction, gives ML algorithms a score from zero to one based on the algorithm's performance.
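One standard way to read AUC is as the probability that a randomly chosen true link receives a higher score than a randomly chosen non-link, with ties counted as half. The sketch below computes it that way on a handful of made-up scores; the function name and the numbers are assumptions for illustration and are not taken from the study.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (true link, non-link) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical scores an algorithm assigned to true links and to non-links.
true_link_scores = [0.9, 0.8, 0.4]
non_link_scores = [0.7, 0.3, 0.2, 0.1]

print(auc(true_link_scores, non_link_scores))
```

Here 11 of the 12 pairs are ordered correctly, so the score is close to one even though one true link is ranked below a non-link. Because AUC pools every pair in the network into a single number, it says nothing about how the errors are spread across individual people.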
In their research, the authors discovered that there are fundamental mathematical limitations to using low dimensional embeddings for link prediction, and that AUC cannot measure these limitations. Because AUC misses them, the authors concluded that it does not accurately measure link prediction performance.
Seshadhri said these results call into question the widespread use of low dimensional vector embeddings in the ML field, considering the mathematical limitations that his research has surfaced on their performance.
Leading methods fall short
The discovery of AUC’s shortcomings led the researchers to create a new metric to better capture the limitations, which they call VCMPR. They used VCMPR to measure 12 ML algorithms chosen to be representative of the field, including algorithms such as DeepWalk, Node2vec, NetMF, GraphSage, and graph benchmark leader HOP-Rec, and found that the link prediction performance was worse using VCMPR as the metric rather than AUC.
“When we look at the VCMPR scores, we see that the scores of most of the leading methods out there are really poor,” Seshadhri said. “It looks like they're actually not doing a good job when you measure things a different way.”
The results also showed that performance was not only lower across the board: the ranking of the algorithms changed as well. Some algorithms that scored worse than others under AUC scored better than the rest of the cohort under VCMPR, and vice versa.
Trustworthiness in machine learning
Seshadhri suggests that ML researchers use VCMPR to benchmark the link prediction performance of their algorithms, or at the very least stop using AUC as their measure. As metrics are so tightly connected to decision making in ML, using a flawed system to measure performance could lead to flawed decision making about which algorithms to employ in real world ML applications.
“Metrics are so closely tied to what we decide to deploy in the real world; people need to have some trust in that. If you have the wrong way of measuring, how can you trust the results?” Seshadhri said. “This paper is in some sense cautionary: we have to be more careful about how we do our machine learning experiments, and we need to come up with a richer set of measures.”
In academia, using an accurate metric is crucial to creating progress in the ML field.
“This is in some sense a bit of a conundrum for scientific progress. A new result has to supposedly be better than everything previously, otherwise it's not doing anything new — but that all depends on how you measure it.”
Beyond machine learning, there are researchers across a wide range of fields who use link prediction and ML to conduct their research, often with profound potential impact. For example, some biologists use link prediction to determine which proteins are likely to interact as a part of drug discovery. These biologists and other researchers outside of ML depend on the ML experts to create trustworthy tools, as they often cannot become ML experts themselves.
While Seshadhri thinks these results may not be a huge surprise to those deeply involved in the field, he hopes that the larger community of ML researchers, and particularly graduate and Ph.D. students who use the current literature to learn best practices and common wisdom about the field, will take note of these results and take caution in their work. He sees the skeptical view presented in this research as somewhat in contrast to a dominant philosophy in ML, which tends to accept a set of metrics and focus on “pushing the bar” when it comes to progress in the field.
“It's important that we have the skeptical view, are trying to understand deeper, and are constantly asking ourselves ‘Are we measuring things correctly?’”
This research was funded by the National Science Foundation and the Army Research Office.
Widespread machine learning methods behind ‘link prediction’ are performing very poorly
New research indicates that methods used to test the accuracy of link prediction are flawed, and that link prediction does not work as well as common benchmarking tests currently indicate
2024-02-12