PRESS-NEWS.org - Press Release Distribution

Compressed data technique enables pangenomics at scale

2026-01-12
(Press-News.org) Engineers at the University of California have developed a new data structure and compression technique that enables the field of pangenomics to handle unprecedented scales of genetic information. The team, led by UC San Diego electrical and computer engineering professor Yatish Turakhia, described their compressive pangenomics approach in Nature Genetics on Jan. 12, 2026.

Pangenomics, a subfield of bioinformatics, is the study of many different genomes from a single species. Comparing many genomes gives a more complete picture of the natural variation and mutations within a species than a single reference genome can. Practical applications include studying how genomic mutations lead to increased transmissibility or drug resistance in pathogens.

Although advances in genome sequencing technologies have reduced the cost and increased the speed of sequencing, building the data structures and analysis tools needed to study and graphically represent the relationships between millions of sequenced genomes remains a challenge. Graph-based data formats for pangenomes have been widely adopted, but they represent only the genetic variation in a collection of genomes, not the genomes' shared evolutionary and mutational histories. They also have large storage requirements that do not scale well.

“The data structures used for pangenomics research are critical because they determine not only how efficiently genetic data is represented, but also what the data can represent,” said Sumit Walia, an electrical engineering PhD candidate at the Jacobs School of Engineering and co-first author of the study. 

The research team, which includes engineers from the Genomics Institute at UC Santa Cruz, pioneered a new data structure and file format called the Pangenome Mutation-Annotated Network (PanMAN). PanMAN not only provides unmatched compression for pangenomes but also significantly advances their representational power by encoding additional biologically relevant information, including phylogenies, mutations, and whole-genome alignments. The compressive pangenomics approach can perform analysis directly on compressed pangenomic data, allowing researchers to handle vastly larger scales of genetic data than is currently possible.

“Our compressive technique with PanMANs allows doing more with less, greatly improving the scale and scope of current pangenomic analysis,” said Turakhia, the study’s corresponding author.

PanMANs are composed of mutation-annotated trees, called PanMATs, which store a single ancestral genome sequence at the root and annotate mutations, such as substitutions, insertions, and deletions, on the different branches. Multiple PanMATs are connected in the form of a network using edges to generate a PanMAN. These edges store complex mutations, such as recombination and horizontal gene transfer data, which result in sequences involving multiple parent sequences and violate the vertical inheritance assumption of single trees. This representation is compact as it exploits the shared ancestry among genomes, representing each mutation only once on the branch where it arose instead of duplicating them across individual sequences.
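The idea of storing one root sequence and annotating mutations on branches can be illustrated with a minimal Python sketch. It is an illustration only and does not reflect the actual PanMAN file format or software: the `Node` class and `reconstruct` function are hypothetical names, only substitutions are modeled (insertions, deletions, and the network edges between trees are left out), and the six-base root sequence is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a toy mutation-annotated tree (hypothetical, not the PanMAN API)."""
    name: str
    # Substitutions on the branch leading to this node: (0-based position, new base).
    mutations: list[tuple[int, str]] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def reconstruct(root_seq: str, root: Node) -> dict[str, str]:
    """Replay branch mutations from the root down to recover every node's sequence."""
    sequences: dict[str, str] = {}
    stack = [(root, list(root_seq))]
    while stack:
        node, parent_seq = stack.pop()
        seq = parent_seq.copy()
        for pos, base in node.mutations:
            seq[pos] = base  # each mutation is stored once, on the branch where it arose
        sequences[node.name] = "".join(seq)
        for child in node.children:
            stack.append((child, seq))
    return sequences


# Toy tree: the substitution at position 0 is shared by leaves A and B
# but is stored only once, on the branch leading to their ancestor.
leaf_a = Node("A", [(2, "T")])
leaf_b = Node("B", [(5, "G")])
ancestor = Node("anc1", [(0, "C")], [leaf_a, leaf_b])
leaf_c = Node("C")
root = Node("root", [], [ancestor, leaf_c])

seqs = reconstruct("ACGTAC", root)
for name in ("root", "anc1", "A", "B", "C"):
    print(name, seqs[name])
```

In this toy tree, three stored substitutions and one root sequence are enough to recover all five sequences, including the internal ancestor. That reflects the property described above: shared ancestry lets each mutation be written once rather than repeated in every descendant genome.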

In addition, PanMAN was crafted to represent a rich set of biologically meaningful information that current pangenome formats lack. Some information in PanMAN is explicitly stored, such as mutations, phylogeny, annotations, and root sequence, whereas other information can be derived, such as ancestral sequences, multiple whole-genome alignment, and genetic variation.

So far, the researchers have used PanMAN to study microbial genomes. They found that it is the most compact of the variation-preserving pangenomic formats, with compression gains of hundreds or even thousands of times. For example, the team built the largest pangenome for SARS-CoV-2, using more than 8 million separate genomes of the virus. Stored as a PanMAN, this vast amount of genetic data required only 366 MB of file storage, roughly 3,000 times less than the corresponding whole-genome alignment that the PanMAN encodes. Constructing an alignment for SARS-CoV-2 genomes at this scale was itself a formidable challenge, which was addressed by another computational tool developed at Turakhia’s lab, called TWILIGHT.
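To put these figures in perspective, the short Python calculation below works out what they imply. It is a back-of-the-envelope estimate, not a result from the study: it assumes 1 MB is 10^6 bytes, treats "roughly 3,000 times" as exactly 3,000, and uses 8 million as the genome count even though the release says "more than 8 million". The variable names are illustrative.

```python
# Figures quoted in the release (rounded; see assumptions in the text above).
panman_mb = 366            # PanMAN file size in MB
compression_ratio = 3_000  # "roughly 3,000 times less" than the alignment
genome_count = 8_000_000   # "more than 8 million" genomes, so a lower bound

# Implied size of the corresponding whole-genome alignment, in decimal TB.
alignment_tb = panman_mb * compression_ratio / 1_000_000

# Average PanMAN storage per genome, in bytes (an upper bound, since the
# true genome count is larger than 8 million).
bytes_per_genome = panman_mb * 1_000_000 / genome_count

print(f"Implied alignment size: about {alignment_tb:.1f} TB")
print(f"PanMAN storage per genome: at most about {bytes_per_genome:.0f} bytes")
```

Under these assumptions, the alignment would occupy about 1.1 TB, and each genome costs fewer than 50 bytes on average in the PanMAN.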

Now, the researchers are expanding their use of TWILIGHT and PanMANs from microbes to human genomes. Turakhia and Melissa Gymrek, a professor of computer science and engineering at UC San Diego, received a Jacobs School Early Career Faculty Development Award to advance this effort. 

“Extending compressive pangenomics to human genomes can fundamentally transform how we store, analyze, and share large-scale human genetic data,” said Turakhia. “Besides enabling studies of human genetic diversity, disease, and evolution at unprecedented scale and speed, it can depict detailed evolutionary and mutational histories which shape diverse human populations, something that current representations do not capture.”

Full study: Compressive pangenomics using mutation-annotated networks

END


