2026-03-12

How do plants that look alike share so little recognizable DNA? A 300-million-year answer

An international team analyzed 284 plant genomes to find 2.3 million conserved regulatory sequences - including more than 3,000 that predate all flowering plants.

Here is a puzzle that has nagged plant biologists for decades: leaves develop in remarkably similar ways across wildly different plant species. Stems elongate by the same basic logic. Flowers, for all their diversity, share deep developmental grammar. The genes controlling these processes are often recognizable from one species to the next. But the regulatory DNA - the instructions that tell those genes when and where to switch on - has been almost impossible to find.

Why? If plants share developmental programs, they should share the regulatory code driving those programs. Where did it go?

The genome reshuffling problem

The answer, it turns out, is that plant genomes have been rewriting themselves for hundreds of millions of years. Unlike animal genomes, which tend to maintain relatively stable chromosomal structures, plant genomes undergo frequent whole-genome duplications, massive rearrangements, and wholesale reshuffling of DNA segments. These events scramble the order of genetic elements so thoroughly that standard comparison methods - the ones that work well for finding shared regulatory sequences in vertebrates - fail when applied to plants.

The regulatory code was not gone. It was hidden in plain sight, buried under layers of genomic reorganization.

Conservatory: reading 284 genomes at once

An international team led by Prof. Idan Efroni of the Hebrew University of Jerusalem, Prof. Zachary Lippman of Cold Spring Harbor Laboratory, and Prof. Madelaine E. Bartlett of the University of Cambridge built a computational tool called Conservatory to crack the problem. Rather than comparing genomes in pairs, the tool works incrementally, assembling matches piece by piece across increasingly distant species.
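
To make the incremental idea concrete, here is a toy sketch in Python. It is not the Conservatory algorithm: the function names are hypothetical, exact matching of short DNA words (k-mers) stands in for real sequence alignment, and the sequences are invented. It only illustrates the general pattern of starting with close relatives and carrying the surviving matches forward to more distant species.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a DNA sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def incremental_conserved(reference, relatives, k=6):
    """Toy stepwise search (hypothetical helper, not the published method).

    `relatives` is a list of (species_name, sequence) pairs ordered from the
    closest relative to the most distant one. Returns a dict mapping each
    conserved k-mer to the most distant species in which it was still found.
    """
    candidates = kmers(reference, k)
    deepest = {}
    for name, seq in relatives:
        found = candidates & kmers(seq, k)
        for kmer in found:
            deepest[kmer] = name
        # Only matches that survived this step are searched for in the
        # next, more distant species.
        candidates = found
        if not candidates:
            break
    return deepest


# Invented example sequences, for illustration only.
reference = "ACGTTGCAAT"
relatives = [("close", "GGACGTTGCATT"), ("far", "TTTACGTTGAAA")]
for kmer, species in sorted(incremental_conserved(reference, relatives).items()):
    print(kmer, "conserved as far as:", species)
```

In this sketch, a match has to pass through each closer species before it is looked for in a more distant one, which keeps the search narrow at every step.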

The team analyzed 284 plant genomes. The scale matters because spotting a conserved sequence between, say, rice and corn is comparatively easy - they diverged relatively recently. Finding one shared between a fern and a daisy requires vastly more comparative power, and it is exactly those deep-time connections that reveal the most fundamental regulatory logic.

The analysis identified approximately 2.3 million conserved regulatory sequences across the plant kingdom. More than 3,000 of these predate the origin of flowering plants entirely, placing their origins at least 300 million years ago.

Ancient code, modern consequences

The oldest regulatory elements clustered near genes that control plant body architecture, particularly members of the HOMEOBOX gene family. These genes are the master regulators of plant form - they direct how shoots, roots, leaves, and flowers develop.

When the researchers experimentally mutated conserved sequences near some HOMEOBOX genes, the plants developed severe developmental abnormalities. The ancient regulatory elements are not fossils. They are still running the show.

The study also uncovered principles governing how regulatory code evolves in plants. While the spacing between regulatory elements may change, their order along the chromosome tends to be preserved. Chromosomal rearrangements sometimes forge new regulatory partnerships. And when genes duplicate - a common event in plants - their ancient regulatory elements are preferentially retained, even as some copies evolve new, lineage-specific functions.
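
The "order is kept, spacing is not" principle can be expressed as a simple check. The sketch below is an illustration only: the element names, the coordinates, and the function `same_order` are all hypothetical, and it assumes each element has a single, distinct position near the gene in each species.

```python
def same_order(positions_a, positions_b):
    """Check whether shared elements appear in the same order in two species.

    Each argument maps a regulatory element ID to its coordinate near the
    gene in one species. Spacing is ignored; only relative order is compared.
    """
    shared = positions_a.keys() & positions_b.keys()
    order_a = sorted(shared, key=lambda e: positions_a[e])
    order_b = sorted(shared, key=lambda e: positions_b[e])
    return order_a == order_b


# Invented coordinates: spacing differs widely, order is identical.
species_1 = {"e1": 100, "e2": 250, "e3": 900}
species_2 = {"e1": 40, "e2": 1200, "e3": 1500}
# Invented coordinates: e1 and e2 have swapped places.
species_3 = {"e1": 500, "e2": 100, "e3": 900}

print(same_order(species_1, species_2))
print(same_order(species_1, species_3))
```

The first comparison passes despite very different distances between elements; the second fails because two elements have traded positions.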

Tuning genes, not changing them

The findings carry significant implications for agriculture. Many crop traits that matter - yield, drought tolerance, disease resistance - depend not just on which genes a plant carries but on how those genes are regulated. Regulatory variation explains why two rice varieties with nearly identical genes can perform very differently in a field.

Understanding the deeply conserved regulatory architecture of plants opens new approaches to crop improvement. Instead of engineering new genes, breeders and synthetic biologists might fine-tune existing regulatory elements to adjust gene expression. This could accelerate the development of more resilient and productive crop varieties, potentially with fewer unintended consequences than wholesale genetic modification.

Scale and limits

The study is the most comprehensive map of conserved plant regulatory DNA produced to date, but it is not complete. Many plant lineages remain underrepresented in genome databases. And the computational methods, while powerful, rely on sequence similarity: regulatory elements whose function is conserved but whose sequence has diverged beyond recognition will be missed.

Experimental validation was performed on a handful of regulatory elements. The 2.3 million identified sequences represent predictions, and most have not yet been individually tested. Still, the severity of the developmental defects caused by mutating just a few of them suggests the catalog contains a great deal of genuinely functional regulatory DNA.

Three hundred million years of genome chaos, and the core instructions survived. Plants, it seems, can rewrite almost everything about their genomes except the rules for building a body.

Source: Published March 12, 2026 in Science. Led by Prof. Idan Efroni (Hebrew University of Jerusalem), Prof. Zachary Lippman (Cold Spring Harbor Laboratory), and Prof. Madelaine E. Bartlett (University of Cambridge). Tool: Conservatory. Contact: pressoffice@savion.huji.ac.il