(Press-News.org) Over the past decade, AI has permeated nearly every corner of science: Machine learning models have been used to predict protein structures, estimate the fraction of the Amazon rainforest that has been lost to deforestation and even classify faraway galaxies that might be home to exoplanets.
But while AI can be used to speed scientific discovery — helping researchers make predictions about phenomena that may be difficult or costly to study in the real world — it can also lead scientists astray. In the same way that chatbots sometimes “hallucinate,” or make things up, machine learning models can sometimes present misleading or downright false results.
In a paper published online today (Thursday, Nov. 9) in Science, researchers at the University of California, Berkeley, present a new statistical technique for safely using the predictions obtained from machine learning models to test scientific hypotheses.
The technique, called prediction-powered inference (PPI), uses a small amount of real-world data to correct the output of large, general models — such as AlphaFold, which predicts protein structures — in the context of specific scientific questions.
“These models are meant to be general: They can answer many questions, but we don't know which questions they answer well and which questions they answer badly — and if you use them naively, without knowing which case you're in, you can get bad answers,” said study author Michael Jordan, the Pehong Chen Distinguished Professor of electrical engineering and computer science and of statistics at UC Berkeley. “With PPI, you're able to use the model, but correct for possible errors, even when you don’t know the nature of those errors at the outset.”
The risk of hidden biases
When scientists conduct experiments, they’re not just looking for a single answer — they want to obtain a range of plausible answers. This is done by calculating a “confidence interval,” which, in the simplest case, can be found by repeating an experiment many times and seeing how the results vary.
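As a rough illustration of the simplest case, the sketch below computes a normal-approximation confidence interval for the mean of repeated measurements. The function name `mean_confidence_interval` and the measurement values are hypothetical, chosen only for this example.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    # Standard error: sample standard deviation divided by sqrt(n).
    std_err = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Hypothetical repeated measurements of the same quantity.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
low, high = mean_confidence_interval(measurements)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

The interval narrows as more measurements are collected, because the standard error shrinks with the square root of the sample size.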
In most scientific studies, a confidence interval refers to a summary or combined statistic, not to individual data points. Unfortunately, machine learning systems focus on individual data points, and thus do not provide scientists with the kinds of uncertainty assessments that they care about. For instance, AlphaFold predicts the structure of a single protein, but it doesn't provide a notion of confidence for that structure, nor a way to obtain confidence intervals that refer to general properties of proteins.
Scientists may be tempted to use the predictions from AlphaFold as if they were data to compute classical confidence intervals, ignoring the fact that these predictions are not data. The problem with this approach is that machine learning systems have many hidden biases that can skew the results. These biases arise, in part, from the data on which they are trained, which are generally existing scientific research that may not have had the same focus as the current study.
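The toy example below sketches why this can go wrong. It assumes a hypothetical model whose predictions are all too high by 0.5, and it treats a large batch of those predictions as if they were measurements. All numbers are invented for illustration and do not come from the study.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Hypothetical ground truth that the scientist cannot observe directly.
truth = [float(i % 10) for i in range(10_000)]
true_mean = statistics.fmean(truth)

# Hypothetical model output: every prediction is too high by 0.5.
predictions = [t + 0.5 for t in truth]

# Naive approach: treat the predictions as if they were data.
naive_low, naive_high = mean_confidence_interval(predictions)
print(f"true mean: {true_mean:.2f}")
print(f"naive interval: ({naive_low:.3f}, {naive_high:.3f})")
```

Because there are many predictions, the naive interval is very narrow, but it is centered on the biased value and so misses the true mean entirely. More predictions make the interval tighter without making it more correct.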
“Indeed, in scientific problems, we're often interested in phenomena which are at the edge between the known and the unknown,” Jordan said. “Very often, there aren’t much data from the past that are at that edge, and that makes generative AI models even more likely to ‘hallucinate,’ producing output that is unrealistic.”
Calculating valid confidence intervals
PPI allows scientists to incorporate the predictions from models like AlphaFold without making any assumptions about how the model was built or the data it was trained on. To do this, PPI requires a small amount of data that is unbiased with respect to the specific hypothesis being investigated, paired with the machine learning predictions for that same data. By combining these two sources of evidence, PPI is able to form valid confidence intervals.
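For the simplest case, estimating an average, the idea can be sketched as follows. This is a minimal illustration of the commonly described PPI estimator for a mean with a normal-approximation interval, not the authors' code. The function name `ppi_mean_interval` and the toy numbers are hypothetical.

```python
import math
import statistics


def ppi_mean_interval(labeled_y, labeled_pred, unlabeled_pred, confidence=0.95):
    """Sketch of a prediction-powered confidence interval for a mean.

    labeled_y      -- real measurements (the small, unbiased data set)
    labeled_pred   -- model predictions for those same items
    unlabeled_pred -- model predictions for a large set with no measurements
    """
    n, big_n = len(labeled_y), len(unlabeled_pred)
    # Prediction errors on the items where the true value is known.
    errors = [p - y for p, y in zip(labeled_pred, labeled_y)]
    # Start from the average prediction, then subtract the average error.
    estimate = statistics.fmean(unlabeled_pred) - statistics.fmean(errors)
    # Uncertainty comes from both the predictions and the error estimate.
    std_err = math.sqrt(
        statistics.variance(unlabeled_pred) / big_n
        + statistics.variance(errors) / n
    )
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_err, estimate + z * std_err


# Toy numbers, invented for illustration only.
labeled_y = [1.0, 2.0, 3.0, 4.0]
labeled_pred = [1.5, 2.5, 3.0, 5.0]  # the model runs high by 0.5 on average
unlabeled_pred = [2.0, 3.0, 4.0, 3.0, 2.0, 4.0, 3.0, 3.0]

low, high = ppi_mean_interval(labeled_y, labeled_pred, unlabeled_pred)
print(f"PPI interval: ({low:.3f}, {high:.3f})")
```

In this sketch the average prediction is 3.0 and the average error on the labeled items is 0.5, so the interval is centered on 2.5. When the model is accurate, the error term varies little and the interval can be much tighter than one built from the labeled data alone. When the model is poor, the interval widens rather than becoming misleading.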
For example, the research team applied the PPI technique to algorithms that can pinpoint areas of deforestation in the Amazon using satellite imagery. These models were accurate, overall, when tested individually on regions in the forest; however, when these assessments were combined to estimate deforestation across the entire Amazon, the confidence intervals became highly skewed. This is likely because the model struggled to recognize certain newer patterns of deforestation.
With PPI, the team was able to correct for the bias in the confidence interval using a small number of human-labeled regions of deforestation.
The team also showed how the technique can be applied to a variety of other research, including questions about protein folding, galaxy classification, gene expression levels, counting plankton, and the relationship between income and private health insurance.
“There’s really no limit on the type of questions that this approach could be applied to,” Jordan said. “We think that PPI is a much-needed component of modern data-intensive, model-intensive and collaborative science.”
Additional co-authors include Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang and Tijana Zrnic of UC Berkeley. This research was supported by the Office of Naval Research (N00014-21-1-2840) and the National Science Foundation.
How to use AI for discovery — without leading science astray
A new statistical technique allows researchers to safely use machine learning predictions to test scientific hypotheses
2023-11-09