PRESS-NEWS.org - Press Release Distribution
PRESS RELEASES DISTRIBUTION

UMass Amherst researchers bring dream of bug-free software one step closer to reality

Prize-winning method, called Baldur, automatically verifies software with nearly 66% efficacy

UMass Amherst researchers bring dream of bug-free software one step closer to reality
2024-01-04
(Press-News.org) AMHERST, Mass. – A team of computer scientists led by the University of Massachusetts Amherst recently announced a new method for automatically generating whole proofs that can be used to prevent software bugs and verify that the underlying code is correct. This new method, called Baldur, leverages the artificial intelligence power of Large Language Models (LLMs), and, when combined with the state-of-the-art tool Thor, yields unprecedented efficacy of nearly 66%. The team was recently awarded a coveted Distinguished Paper award at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering.

“We have unfortunately come to expect that our software is buggy, despite the fact that it is everywhere and we all use it every day,” says Yuriy Brun, professor in the Manning College of Information and Computer Sciences at UMass Amherst and the paper’s senior author.

The effects of buggy software can range anywhere from the annoying—glitchy formatting or sudden crashes—to potentially catastrophic when it comes to security breaches or the precision software used for space exploration or for controlling health care devices.

Of course, there have been methods for checking software for as long as it has existed. One popular method is the simplest: you have a human being go through the code, line by line, manually verifying that there are no errors. Or you can run the code and check it against what you expect it to do. If, for instance, you expect your word-processing software to break the line every time you press the “return” key, but it instead outputs a question mark, then you know something in the code is wrong.

The problem with both methods is that they are prone to human error, and checking against every possible glitch is extraordinarily time consuming, expensive and infeasible for anything but trivial systems.

A much more thorough, but harder, method is to generate a mathematical proof showing that the code does what it is expected to do, and then use a theorem prover to make sure that the proof is also correct. This method is called machine-checking.

But manually writing these proofs is incredibly time-consuming and requires extensive expertise. “These proofs can be many times longer than the software code itself,” says Emily First, the paper’s lead author who completed this research as part of her doctoral dissertation at UMass Amherst.

With the rise of LLMs, of which ChatGPT is the most famous example, a possible solution is to try to generate such proofs automatically.  However, “one of the biggest challenges with LLMs is that they’re not always correct,” says Brun. “Instead of crashing and letting you know that something is wrong, they tend to ‘fail silently,’ producing an incorrect answer but presenting it as if it’s correct. And, often, the worst thing you can do is to fail silently.”

This is where Baldur comes in.

First, whose team performed its work at Google, used Minerva, an LLM trained on a large corpus of natural-language text, and then fine-tuned it on 118GB of mathematical scientific papers and webpages containing mathematical expressions. Next, she further fine-tuned the LLM on a language, called Isabelle/HOL, in which the mathematical proofs are written. Baldur then generated an entire proof and worked in tandem with the theorem prover to check its work. When the theorem prover caught an error, it fed the proof, as well as information about the error, back into the LLM, so that it can learn from its mistake and generate a new and hopefully error-free proof.

This process yields a remarkable increase in accuracy. The state-of-the-art tool for automatically generating proofs is called Thor, which can generate proofs 57% of the time. When Baldur (Thor’s brother, according to Norse mythology) is paired with Thor, the two can generate proofs 65.7% of the time.

Though there is still a large degree of error, Baldur is by far the most effective and efficient way yet devised to verify software correctness, and as the capabilities of AI are increasingly extended and refined, so should Baldur’s effectiveness grow.

In addition to First and Brun, the team includes Markus Rabe, who was employed by Google at the time, and Talia Ringer, an assistant professor at the University of Illinois—Urbana Champaign. This work was performed at Google and supported by the Defense Advanced Research Projects Agency and the National Science Foundation.

 

Contacts: Yuriy Brun, brun@cs.umass.edu

                 Daegan Miller, drmiller@umass.edu

 

 

 

 

END

[Attachments] See images for this press release:
UMass Amherst researchers bring dream of bug-free software one step closer to reality UMass Amherst researchers bring dream of bug-free software one step closer to reality 2 UMass Amherst researchers bring dream of bug-free software one step closer to reality 3

ELSE PRESS RELEASES FROM THIS DATE:

USC Stem Cell scientists develop a game-changing organoid model to study human cerebellar development and disease

2024-01-04
In a first for USC Stem Cell scientists, the laboratory of Giorgia Quadrato, an assistant professor of stem cell biology and regenerative medicine, has pioneered a novel human brain organoid model that generates all the major cell types of the cerebellum, a hindbrain region predominantly made up of two cell types necessary for movement, cognition, and emotion: granule cells and Purkinje neurons. This marks the first time that scientists have succeeded in growing Purkinje cells that possess the molecular and electrophysiological features of functional neurons in an all-human system. These breakthroughs in organoid-directed brain modeling have been published recently ...

Worm study raises concern about DEET's effect on reproduction

Worm study raises concern about DEETs effect on reproduction
2024-01-04
Researchers have uncovered evidence hinting that the most common bug spray ingredient, DEET, might cause reproductive problems by affecting the formation of egg cells during pregnancy. The findings come from a study in C. elegans — worms that don’t look like they have much in common with humans yet serve as surprisingly useful bellwethers of how toxins in the environment affect human reproduction. Get more HMS news here The research, published Jan. 4 in iScience, raises difficult questions. Chief among them is how to balance the possible reproductive harms ...

Hearing loss, hearing aid use, and risk of dementia in older adults

2024-01-04
About The Study: The results of this study that included 573,000 persons suggest that hearing loss was associated with increased dementia risk, especially among people not using hearing aids, suggesting that hearing aids might prevent or delay the onset and progression of dementia. The risk estimates were lower than in previous studies, highlighting the need for more high-quality longitudinal studies.  Authors: Manuella Lech Cantuaria, Ph.D., of the University of Southern Denmark in Odense, is the corresponding author. (doi:10.1001/jamaoto.2023.3509) To access the embargoed study: Visit our For The Media website at this link https://media.jamanetwork.com/ Editor’s ...

Limited English proficiency and sepsis mortality by race and ethnicity

2024-01-04
About The Study: The findings of this study of 2,709 patients hospitalized with sepsis from 2016 to 2019 at an urban tertiary care center suggest a language-based inequity in outcomes. Further studies are needed to understand drivers of this inequity, how it may manifest in other diverse health systems, and to inform equitable care models for patients with limited English proficiency.  Authors: Neha P. Limaye, M.D., M.P.H., of Mount Sinai Hospital in New York, is the corresponding author.   To access the embargoed study: Visit our For The Media website at this link ...

Study reveals new genetic link between anorexia nervosa and being an early riser

2024-01-04
BOSTON – New research indicates that the eating disorder anorexia nervosa is associated with being an early riser, unlike many other disorders that tend to be evening-based such as depression, binge eating disorder and schizophrenia. The study, which is published in JAMA Network Open and led by investigators at Massachusetts General Hospital (MGH), in collaboration with University College London and the University of the Republic in Uruguay, also revealed a link between anorexia nervosa and insomnia risk. Previous research has suggested a possible connection between eating disorders and the body’s internal clock, or circadian clock, which controls a wide range of biological ...

Immune atlas at cell level points to new combination treatment for incurable childhood cancer

2024-01-04
Press release: Princess Máxima Center for pediatric oncology EMBARGO: JANUARY 4, 2024 AT 11:00 AM ET (US)   A detailed 'atlas' of neuroblastoma tumors points to a new target for immunotherapy. Scientists from the Princess Máxima Center for pediatric oncology in the Netherlands mapped this childhood tumor at the level of individual cancer and immune cells. In doing so, they discovered a brake on the immune system that can be blocked with existing immunotherapy. The results in the lab are promising; preparations for a clinical study are underway. Every year, 25 children in the Netherlands ...

The Colorado Center for Personalized Medicine highlighted as a leader in precision medicine in research and clinical care

2024-01-04
A new peer-reviewed study in the American Journal of Human Genetics highlights the work of the biobank at the Colorado Center for Personalized Medicine (CCPM), a world-class site for precision medicine in research and clinical care created in partnership with the University of Colorado Anschutz Medical Campus and UCHealth. The paper focuses on CCPM’s research and personalized medical care.   “We’re one of the only institutions in the world that has accomplished the dual-purpose of using genetic information to accelerate scientific discovery in research while providing actionable clinical results to patients that ...

Treating tuberculosis when antibiotics no longer work

2024-01-04
In cooperation with research partners in Germany and France, the infectious disease specialist Dr Jan Rybniker and his team at University Hospital Cologne and the University of Cologne’s Faculty of Medicine have identified new, antibiotic molecules that target Mycobacterium tuberculosis and make it less pathogenic for humans. In addition, some of the discovered substances may allow for a renewed treatment of tuberculosis with available medications – including strains of the bacterium that have already developed drug resistance. The research has been published in the article ‘Discovery of dual-active ethionamide boosters ...

Supercharging CAR-T cells for cancer treatment

2024-01-04
At EPFL's School of Engineering, Professor Li Tang's Laboratory of Biomaterials for Immunoengineering has made significant strides in cancer treatment research. In laboratory settings, this innovative CAR-T therapy has consistently eradicated cancerous tumors in mouse models. Separately, in on-going clinical trials, eleven patients seemed to achieve complete remission using this treatment, marking a success rate of 100% to date. Notably, evidence from the lab study, published in Nature Biotechnology, suggests the therapy's long-term effectiveness, and indicates that its fabrication may be both quicker and more cost-effective ...

Minimizing immunotherapy’s potentially harmful side effects

Minimizing immunotherapy’s potentially harmful side effects
2024-01-04
MIAMI, FLORIDA (EMBARGOED UNTIL JAN. 4, 2024, at 10 AM EST) – Recent advances in treating multiple myeloma, the second most common blood cancer, and other blood malignancies are providing improved outcomes – and hope – to patients worldwide. But treatment breakthroughs such as the immunotherapy drug teclistamab can lead to potentially lethal side effects, including cytokine release syndrome (CRS) and immune cell-associated neurotoxicity syndrome (ICANS). These potential side effects have necessitated giving immunotherapy drugs in the hospital setting, where patients remain for five to seven days and receive other drugs ...

LAST 30 PRESS RELEASES:

Impact of pollutants on pollinators, and how neural circuits adapt to temperature changes

Researchers seek to improve advanced pain management using AI for drug discovery

‘Neutron Nexus’ brings universities, ORNL together to advance science

Early release from NEJM Evidence

UMass Amherst astronomer leads science team helping to develop billion-dollar NASA satellite mission concept

Cultivating global engagement in bioengineering education to train students skills in biomedical device design and innovation

Life on Earth was more diverse than classical theory suggests 800 million years ago, a Brazilian study shows

International clean energy initiative launches global biomass resource assessment

How much do avoidable deaths impact the economy?

Federal government may be paying twice for care of veterans enrolled in Medicare Advantage plans

New therapeutic target for cardiac arrhythmias emerges

UC Irvine researchers are first to reveal role of ophthalmic acid in motor function control

Moffitt study unveils the role of gamma-delta T cells in cancer immunology

Drier winter habitat impacts songbirds’ ability to survive migration

Donors enable 445 TPDA awards to Neuroscience 2024

Gut bacteria engineered to act as tumor GPS for immunotherapies

Are auditory magic tricks possible for a blind audience?

Research points to potential new treatment for aggressive prostate cancer subtype

Studies examine growing US mental health safety net

Social risk factor domains and preventive care services in US adults

Online medication abortion direct-to-patient fulfillment before and after the Dobbs v Jackson decision

Black, Hispanic, and American Indian adolescents likelier than white adolescents to be tested for drugs, alcohol at pediatric trauma centers

Pterosaurs needed feet on the ground to become giants

Scientists uncover auditory “sixth sense” in geckos

Almost half of persons who inject drugs (PWID) with endocarditis will die within five years; women are disproportionately affected

Experimental blood test improves early detection of pancreatic cancer

Groundbreaking wastewater treatment research led by Oxford Brookes targets global challenge of toxic ‘forever chemicals’

Jefferson Health awarded $2.4 million in PCORI funding

Cilta-cel found highly effective in first real-world study

Unleashing the power of generative AI on smart collaborative innovation network platform to empower research and technology innovation

[Press-News.org] UMass Amherst researchers bring dream of bug-free software one step closer to reality
Prize-winning method, called Baldur, automatically verifies software with nearly 66% efficacy