With Evo 2, AI can model and design the genetic code for all domains of life

The largest foundation model for biology to date is now published in the journal Nature

2026-03-04
The DNA foundation model Evo 2, first released in February 2025 as a preprint, is now published in the journal Nature. Trained on the DNA of over 100,000 species across the entire tree of life, Evo 2 can identify patterns in gene sequences across disparate organisms that experimental researchers would need years to uncover. The machine learning model can accurately identify disease-causing mutations in human genes and is capable of designing new genomes that are as long as the genomes of simple bacteria.

Evo 2 was developed by scientists from Arc Institute and NVIDIA, together with collaborators from Stanford University, UC Berkeley, and UC San Francisco. The model's code is publicly available on Arc's GitHub and is also integrated into the NVIDIA BioNeMo framework as part of a collaboration between Arc Institute and NVIDIA to accelerate scientific research. Arc Institute also worked with AI research lab Goodfire to develop a mechanistic interpretability visualizer that uncovers the key biological features and patterns the model learns to recognize in genomic sequences. The Evo team has shared its training data, training and inference code, and model weights, making Evo 2 the largest-scale fully open-source AI model to date.

Building on its predecessor Evo 1, which was trained entirely on the genomes of single-celled organisms, Evo 2 is the largest artificial intelligence model in biology to date. It was trained on over 9.3 trillion nucleotides (the building blocks that make up DNA or RNA) from over 128,000 whole genomes as well as metagenomic data. In addition to an expanded collection of bacterial, archaeal, and phage genomes, Evo 2 includes information from humans, plants, and other single-celled and multi-cellular species in the eukaryotic domain of life.

"Our development of Evo 1 and Evo 2 represents a key moment in the emerging field of generative biology, as the models have enabled machines to read, write, and think in the language of nucleotides," says Patrick Hsu, Arc Institute Co-Founder, Arc Core Investigator, an Assistant Professor of Bioengineering and Deb Faculty Fellow at University of California, Berkeley, and a co-senior author on the paper. "Evo 2 has a generalist understanding of the tree of life that's useful for a multitude of tasks, from predicting disease-causing mutations to designing potential code for artificial life. We're excited to see what the research community builds on top of these foundation models."

Evolution has encoded biological information in DNA and RNA, creating patterns that Evo 2 can detect and utilize. "Just as the world has left its imprint on the language of the Internet used to train large language models, evolution has left its imprint on biological sequences," says co-senior author Brian Hie, an Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and Arc Institute Innovation Investigator in Residence. "These patterns, refined over millions of years, contain signals about how molecules work and interact."

Evo 2 was trained for several months on the NVIDIA DGX Cloud AI platform via AWS, utilizing over 2,000 NVIDIA H100 GPUs and bolstered by collaboration with NVIDIA researchers and engineers. The model can process genetic sequences of up to 1 million nucleotides at once, enabling it to understand relationships between distant parts of a genome. Achieving this technical feat required the research team to reimagine how an AI model could quickly ingest and make inferences about this scale of data. The resulting AI architecture, called StripedHyena 2, enabled Evo 2 to be trained with 30 times more data than Evo 1 and reason over 8 times as many nucleotides at a time.
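
As a rough back-of-the-envelope check on the scale figures quoted above, the sketch below works out what the stated ratios would imply for Evo 1. The Evo 1 values are not given in this release; they are only the quotients of the release's own numbers, they assume "30 times more" means a 30-fold ratio, and the constant and function names are illustrative.

```python
# Figures stated in the release for Evo 2.
EVO2_TRAINING_NUCLEOTIDES = 9.3e12    # "over 9.3 trillion nucleotides"
EVO2_CONTEXT_NUCLEOTIDES = 1_000_000  # "up to 1 million nucleotides at once"

# Ratios stated in the release, relative to Evo 1.
DATA_RATIO = 30     # assumption: read as a 30-fold ratio
CONTEXT_RATIO = 8


def implied_evo1_scale():
    """Return the (training nucleotides, context length) implied for Evo 1.

    These are estimates derived only from the ratios in the release,
    not published Evo 1 specifications.
    """
    training = EVO2_TRAINING_NUCLEOTIDES / DATA_RATIO
    context = EVO2_CONTEXT_NUCLEOTIDES / CONTEXT_RATIO
    return training, context


training, context = implied_evo1_scale()
print(f"Implied Evo 1 training data: ~{training / 1e9:.0f} billion nucleotides")
print(f"Implied Evo 1 context window: ~{context:,.0f} nucleotides")
```

Because the release gives lower bounds ("over 9.3 trillion") and rounded ratios, the results should be read as orders of magnitude rather than exact values.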

The model already shows enough versatility to identify genetic changes that affect protein function and organism fitness. For example, in tests with variants of the breast cancer-associated gene BRCA1, Evo 2 achieved over 90% accuracy in predicting which mutations are benign versus potentially pathogenic. Predictions like these could save much of the time and money otherwise spent on cell or animal experiments, helping researchers find genetic causes of human diseases and accelerate the development of new medicines.
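
One common way a sequence model can be used for a benign-versus-pathogenic call is to compare how likely the model finds the reference sequence and the mutated sequence. The sketch below illustrates that general idea only; this release does not describe the procedure used for the BRCA1 tests. `toy_log_likelihood` is a hypothetical stand-in (a simple nucleotide-frequency scorer), not the Evo 2 API, and the sequences and threshold are made up for illustration.

```python
import math
from collections import Counter


def toy_log_likelihood(seq: str, background: str) -> float:
    """Hypothetical stand-in for a genomic language model's scoring function.

    Scores `seq` using independent per-nucleotide frequencies estimated from
    `background`, with add-one smoothing. A real model would instead condition
    each nucleotide on its surrounding sequence context.
    """
    counts = Counter(background)
    total = len(background) + 4  # add-one smoothing over A, C, G, T
    return sum(math.log((counts[base] + 1) / total) for base in seq)


def apply_snv(reference: str, position: int, alt: str) -> str:
    """Return `reference` with the nucleotide at `position` replaced by `alt`."""
    if reference[position] == alt:
        raise ValueError("alt allele matches the reference nucleotide")
    return reference[:position] + alt + reference[position + 1:]


def variant_score(reference: str, position: int, alt: str, background: str) -> float:
    """Log-likelihood of the mutant minus log-likelihood of the reference."""
    mutant = apply_snv(reference, position, alt)
    return toy_log_likelihood(mutant, background) - toy_log_likelihood(reference, background)


def classify(score: float, threshold: float = -0.5) -> str:
    """Label a variant from its score; the threshold is an arbitrary example value."""
    return "potentially pathogenic" if score < threshold else "likely benign"


background = "AAAATTTTAAAATTTTGC"  # made-up background sequence
reference = "ATAT"                  # made-up reference sequence

rare_swap = variant_score(reference, 1, "G", background)    # T -> rare G
common_swap = variant_score(reference, 0, "T", background)  # A -> equally common T
print(classify(rare_swap), "|", classify(common_swap))
```

In this toy setting, a substitution to a nucleotide that is rare in the background lowers the sequence's likelihood and is flagged, while a swap between equally common nucleotides leaves the score at zero. With a real model, the threshold would need to be calibrated against variants of known effect.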

In the year since its preprint release, researchers have applied the model to a range of scientific problems, from predicting genetic disease risk in Alzheimer's patients to assessing variant effects across domesticated animal species. Arc researchers have also used Evo 2 to design functional synthetic bacteriophages, demonstrating potential applications for treating antibiotic-resistant bacteria.

In addition to genetic analysis, Evo 2 could be useful for engineering new biological tools or treatments. "If you have a gene therapy that you want to turn on only in neurons to avoid side effects, or only in liver cells, you could design a genetic element that is only accessible in those specific cells," says co-author and computational biologist Hani Goodarzi, an Arc Core Investigator and an Associate Professor of Biochemistry and Biophysics at the University of California, San Francisco. "This precise control could help develop more targeted treatments with fewer side effects."

The research team envisions that more specific AI models could be built with Evo 2 as a foundation. "In a loose way, you can think of the model almost like an operating system kernel—you can have all of these different applications that are built on top of it," says Arc's Chief Technology Officer Dave Burke, a co-author on the paper. "From predicting how single DNA mutations affect a protein's function to designing genetic elements that behave differently in different cell types, as we continue to refine the model and researchers begin using it in creative ways, we expect to see beneficial uses for Evo 2 we haven't even imagined yet."

In consideration of potential ethics and safety risks, the scientists excluded pathogens that infect humans and other complex organisms from Evo 2's base data set, and ensured that the model would not return productive answers to queries about these pathogens. Co-author Tina Hernandez-Boussard, a Stanford Professor of Medicine, and her lab members helped the team implement responsible development and deployment of this technology.

"Evo 2 has fundamentally advanced our understanding of biological systems," says Anthony Costa, director of digital biology at NVIDIA. "By overcoming previous limitations in the scale of biological foundation models with a unique architecture and the largest integrated dataset of its kind, Evo 2 generalizes across more known biology than any other model to date — and by releasing these capabilities broadly, Arc Institute has given scientists around the world a new partner in solving humanity's most pressing health and disease challenges."

###

Brixi, G., Durrant, M.G., Ku, J., Naghipourfar, M., Poli, M., Brockman, G., Chang, D., Fanton, A., Gonzalez, G.A., King, S.H., Li, D.B., Merchant, A.T., Nguyen, E., Ricci-Tam, C., Romero, D.W., Schmok, J.C., Sun, G., Taghibakhshi, A., Vorontsov, A., Yang, B., Deng, M., Gorton, L., Nguyen, N., Wang, N.K., Pearce, M.T., Simon, E., Adams, E., Amador, Z.J., Ashley, E.A., Baccus, S.A., Dai, H., Dillmann, S., Ermon, S., Guo, D., Herschl, M.H., Ilango, R., Janik, K., Lu, A.X., Mehta, R., Mofrad, M.R.K., Ng, M.Y., Pannu, J., Ré, C., St. John, J., Sullivan, J., Tey, J., Viggiano, B., Zhu, K., Zynda, G., Balsam, D., Collison, P., Costa, A.B., Hernandez-Boussard, T., Ho, E., Liu, M.-Y., McGrath, T., Powell, K., Pinglay, S., Burke, D.P., Goodarzi, H., Hsu, P.D., & Hie, B.L. (2026). Genome modeling and design across all domains of life with Evo 2. Nature. https://doi.org/10.1038/s41586-026-10176-5

Arc Institute is an independent nonprofit research organization based in Palo Alto, California, that aims to accelerate scientific progress and understand the root causes of complex diseases. Arc's investigators are supported by long-term funding and freedom to pursue bold ideas. Its Technology Centers leverage multi-omics, genome engineering, and cellular, mammalian and computational models to advance discoveries at the intersection of biology and artificial intelligence. Founded in 2021, Arc partners with Stanford, UC Berkeley, and UCSF.
