October 11, 2024 – Genome Research (https://genome.org) publishes a special issue highlighting novel advances in computational biology.
In collaboration with the International Conference on Research in Computational Molecular Biology (RECOMB), Genome Research publishes a collection of 20 computational methods and their applications in genomics including spatial, single-cell, and long-read sequencing. These include algorithmic innovations in genomic variation analysis, privacy-preserving algorithms, DNA structural properties, cancer genomics, transcriptomic studies, gene regulatory networks, biomolecular representation learning, and metagenomic data analysis. Several of these studies are highlighted below.
PRiMeR (Sens et al. 2024) is a method that leverages genetic information to learn disease risk predictors across cohorts, circumventing the need for traditional longitudinal studies. With training on risk factors and genetic data from a healthy cohort, along with results from genome-wide association studies (GWAS), PRiMeR can assess risk for new patients. This method was validated on simulations of type 2 diabetes and Alzheimer’s and Parkinson’s disease onset. This method could facilitate more timely and targeted preventive strategies.
In another study, Hong et al. (2024) developed SF-Relate, a practical and secure federated algorithm for identifying genetic relatives across distributed genomic datasets. Using novel hashing and bucketing strategies, SF-Relate distinguishes relatives from nonrelatives and securely estimates kinship using encrypted data. This method allows for the exclusion of close relatives that can introduce bias in study results while providing privacy protection.
Circular extrachromosomal DNA (ecDNA) is a form of oncogene amplification found across cancer types and is associated with poor outcome in patients. EcDNAs drive tumor formation, evolution, and drug resistance by modulating oncogene copy-number and rewiring gene-regulatory networks. Two methods, CoRAL (Zhu et al. 2024) and Decoil (Giurgiu et al. 2024), resolve ecDNA structure using long-read sequencing data, profiling the landscape and evolution of focal amplifications in tumors.
Another method, DIISCO (Park et al. 2024), characterizes the temporal dynamics of cell–cell interactions in complex biological systems using single-cell RNA sequencing data, elucidating mechanisms underlying normal biological processes and disease progression. This method was demonstrated on simulated and experimental lymphoma–immune interaction data and revealed immune interactions of a cytotoxic T cell subtype that expands with therapy. This method can guide the design of improved treatments to promote cell states and crosstalk crucial for therapeutic response.
Schrod et al. (2024) present SpaCeNet, a method for analyzing patterns of correlation in spatial transcriptomics data, facilitating reconstruction of both the intracellular and the intercellular interaction networks with single-cell spatial resolution. SpaCeNet was validated on several datasets, including mouse visual cortex, mouse organoids, and the Drosophila blastoderm, revealing insights into the spatial organization of cell populations and capturing complex patterns of interactions related to cellular growth, development, and disease.
Finally, repetitive DNA poses significant challenges for accurate and efficient genome assembly and sequence alignment. This is particularly true for metagenomic data, where genome dynamics such as horizontal gene transfer, gene duplication, and gene loss/gain complicate accurate genome assembly from microbial communities. Detecting repeats is a crucial first step in overcoming these challenges. Azizpour et al. (2024) present GraSSRep, a novel approach that detects and classifies DNA sequences into repetitive and non-repetitive categories in metagenomic data.
Additional computational methods that advance studies in genomic variation, genome structure, cancer genomics, transcriptomics, gene regulation, data privacy preservation, and metagenomic data analysis are also included in this Special Issue.
Interested reporters may obtain copies of the manuscripts via email from Tara Bonet-Black, Administrative Assistant, Genome Research (bonetbl@cshl.edu).
About the articles:
Sens D, Shilova L, Gräf L, Grebenshchikova M, Eskofier BM, Casale FP. 2024. Genetics-driven risk predictions leveraging the Mendelian randomization framework. Genome Res. doi: 10.1101/gr.279252.124.
Hong MM, Froelicher D, Magner R, Popic V, Berger B, Cho H. 2024. Secure discovery of genetic relatives across large-scale and distributed genomic data sets. Genome Res. doi: 10.1101/gr.279057.124.
Zhu K, Jones MG, Luebeck J, Bu X, Yi H, Hung KL, Wong IT, Zhang S, Mischel PS, Chang HY, Bafna V. 2024. CoRAL accurately resolves extrachromosomal DNA genome structures with long-read sequencing. Genome Res. doi: 10.1101/gr.279131.124.
Giurgiu M, Wittstruck N, Rodriguez-Fos E, Chamorro Gonzalez R, Brueckner L, Krienelke-Szymansky A, Helmsauer K, Hartebrodt A, Euskirchen P, Koche RP, Haase K, Reinert K, Henssen AG. 2024. Reconstructing extrachromosomal DNA structural heterogeneity from long-read sequencing data using Decoil. Genome Res. doi: 10.1101/gr.279123.124.
Park C, Mani S, Beltran-Velez N, Maurer K, Gohil S, Li S, Huang T, Knowles DA, Wu CJ, Azizi E. 2024. A Bayesian framework for inferring dynamic intercellular interactions from time-series single-cell data. Genome Res. doi: 10.1101/gr.279126.124.
Schrod S, Lück N, Lohmayer R, Solbrig S, Völkl D, Wipfler T, Shutta KH, Ben Guebila M, Schäfer A, Beißbarth T, Zacharias HU, Oefner P, Quackenbush J, Altenbuchinger M. 2024. Spatial Cellular Networks from omics data with SpaCeNet. Genome Res. doi: 10.1101/gr.279125.124.
Azizpour A, Balaji A, Treangen TJ, Segarra S. 2024. Graph-based self-supervised learning for repeat detection in metagenomic assembly. Genome Res. doi: 10.1101/gr.279136.124.
In addition to the articles highlighted above, the following will also appear in the issue:
Burch M, Bose A, Dexter G, Parida L, Drineas P. 2024. Matrix sketching framework for linear mixed models in association studies. Genome Res. doi:10.1101/gr.279230.124
Chandra G, Gibney D, Jain C. 2024. Haplotype-aware sequence alignment to pangenome graphs. Genome Res. doi:10.1101/gr.279143.124
Fu B, Anand P, Anand A, Mefford J, Sankararaman S. 2024. A scalable adaptive quadratic kernel method for interpretable epistasis analysis in complex traits. Genome Res. doi:10.1101/gr.279140.124
Goldenberg M, Mualem L, Shahar A, Snir S, Akavia A. 2024. Privacy-preserving biological age prediction over federated human methylation data using fully homomorphic encryption. Genome Res. doi:10.1101/gr.279071.124
Iovino BG, Tang H, Ye Y. 2024. Protein domain embeddings for fast and accurate similarity search. Genome Res. doi:10.1101/gr.279127.124
Jeong M, Pazokitoroudi A, Liu Z, Sankararaman S. 2024. Scalable summary-statistics-based heritability estimation method with individual genotype level accuracy. Genome Res. doi:10.1101/gr.279207.124
Lal A, Garfield D, Biancalani T, Eraslan G. 2024. Designing realistic regulatory DNA with autoregressive language models. Genome Res. doi:10.1101/gr.279142.124
Li L, Dannenfelser R, Cruz C, Yao V. 2024. A best-match approach for gene set analyses in embedding spaces. Genome Res. doi:10.1101/gr.279141.124
Saha E, Fanfani V, Mandros P, Ben-Guebila M, Fischer J, Shutta KH, DeMeo DL, Lopes-Ramos CM, Quackenbush J. 2024. Bayesian inference of sample-specific coexpression networks. Genome Res. doi:10.1101/gr.279117.124
Şapcı AOB, Mirarab S. 2024. Memory-bound k-mer selection for large and evolutionary diverse reference libraries. Genome Res. doi:10.1101/gr.279339.124
Yang J, Yen K, Mahony S. 2024. Size-based expectation maximization for characterizing nucleosome positions and subtypes. Genome Res. doi:10.1101/gr.279138.124
Zahin T, Shi Q, Zang XC, Shao M. 2024. Accurate assembly of circular RNAs with TERRACE. Genome Res. doi:10.1101/gr.279106.124
Zeng S, Wang D, Jiang L, Xu D. 2024. Parameter-efficient fine-tuning on large protein language models improves signal peptide prediction. Genome Res. doi:10.1101/gr.279132.124
###
About Genome Research:
Launched in 1995, Genome Research (www.genome.org) is an international, continuously published, peer-reviewed journal that focuses on research that provides novel insights into the genome biology of all organisms, including advances in genomic medicine. Among the topics considered by the journal are genome structure and function, comparative genomics, molecular evolution, genome-scale quantitative and population genetics, proteomics, epigenomics, and systems biology. The journal also features exciting gene discoveries and reports of cutting-edge computational biology and high-throughput methodologies.
About Cold Spring Harbor Laboratory Press:
Cold Spring Harbor Laboratory Press is an internationally renowned publisher of books, journals, and electronic media, located on Long Island, New York. Since 1933, it has furthered the advance and spread of scientific knowledge in all areas of genetics and molecular biology, including cancer biology, plant science, bioinformatics, and neurobiology. The Press is a division of Cold Spring Harbor Laboratory, an innovator in life science research and the education of scientists, students, and the public. For more information, visit our website at http://cshlpress.org.
Genome Research issues press releases to highlight significant research studies that are published in the journal.
END