One in every 10 people worldwide is impacted by a rare genetic disease but about 50% of them remain undiagnosed despite rapid increases in genetic technology and testing. Even when a person does have access to testing, the process of getting a diagnosis can take about five years or more, which is sometimes too late for patients, who are often children, to start the right treatment.
This is partly because current clinical testing uses a method called short-read sequencing, which cannot access information in certain regions of the genome and so may miss crucial evidence to help make a diagnosis. But UC Santa Cruz researchers are pushing forward research on a cutting-edge alternative method, called long-read sequencing, which can provide a more comprehensive dataset for finding variation, eliminate the need for multiple specialized tests, and streamline the diagnosis of rare diseases.
A new study shows that long-read sequencing has the potential to improve the rate of diagnosis while reducing the time to diagnosis from years to days, in a single test and at a much lower cost. The study was published in The American Journal of Human Genetics and led by core members of the UCSC Genomics Institute, Professor of Biomolecular Engineering (BME) Benedict Paten and Associate Professor of BME Karen Miga, as well as former UCSC postdoctoral scholar Jean Monlong.
“Rare diseases are something that people have been struggling to diagnose for so many years, and if we have a sequencing technology which streamlines diagnostic testing, I think that will be a huge contribution — and that is what we tested as part of this paper,” said Shloka Negi, a UC Santa Cruz BME Ph.D. student who is the paper’s first author.
“Today, the diagnostic yield of genetic sequencing is frustratingly low,” Paten said. “One likely cause is the incomplete sequencing methods used in clinical practice. In this work, we test the hypothesis that new, more comprehensive long-read sequencing can generate additional information useful for genetic diagnosis. We were excited to discover numerous additional potentially interesting genetic variants and epigenetic signals in our cohort. While it is still early days, there is great promise in this information, and it will take time for the community to interpret and fully understand much of this new information.”
Finding rare disease
This study focused on rare monogenic diseases, which are those caused by a disruption to a single gene.
Scientists diagnose genetic diseases by searching a patient’s genetic material for variants, differences in a gene that may prevent it from functioning properly. The typical approach for finding these variants uses a technique called short-read sequencing, which reads the genetic base pairs, made up of adenine (A), cytosine (C), guanine (G), and thymine (T), in stretches of about 150 to 250 base pairs at a time.
The limitation of short-read sequencing, however, is that it can miss crucial information in certain regions of the genome, such as patterns that span much more than 250 base pairs. It also can’t perform “phasing,” the process of determining which variants are inherited from the mother and which from the father. Phasing helps clinicians work out where variants came from: for example, whether two variants were inherited from the same parent, one from each parent, or not inherited at all. This can be very useful for genetic diagnoses, especially when parental data is not available.
In contrast, long-read sequencing can read lengthy stretches of DNA at once, eliminating gaps that may lead scientists and clinicians to miss important information about gene variation. Long-read sequencing also provides direct phasing data as well as information about methylation, a chemical process in DNA that causes genes to be “turned on or off,” and can contribute to disease.
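To make the idea of phasing concrete, here is a minimal illustrative sketch in Python. It is not part of the study's Napu pipeline. It assumes genotypes written in the common VCF style, where "0|1" means a phased genotype with the variant on the second copy of the chromosome and "0/1" means the phase is unknown. It also assumes both variants fall within the same phased block. The function names are hypothetical. Note that phase alone says whether two variants sit on the same copy or on opposite copies; deciding which copy is maternal or paternal still requires additional information.

```python
from typing import FrozenSet, Optional


def carrying_haplotypes(gt: str) -> Optional[FrozenSet[int]]:
    """Return the chromosome copies (0 or 1) that carry a non-reference allele.

    Returns None when the genotype is unphased ('/'), missing ('.'),
    or not diploid, because no phase can be read from it.
    """
    if "|" not in gt:
        return None
    alleles = gt.split("|")
    if len(alleles) != 2 or not all(a.isdigit() for a in alleles):
        return None
    return frozenset(i for i, a in enumerate(alleles) if a != "0")


def phase_relation(gt_a: str, gt_b: str) -> str:
    """Classify two variants as 'cis' (same copy) or 'trans' (opposite copies).

    Assumes both genotypes come from the same phased block (an assumption
    of this sketch; real data must be checked for this).
    """
    haps_a = carrying_haplotypes(gt_a)
    haps_b = carrying_haplotypes(gt_b)
    if haps_a is None or haps_b is None:
        return "unknown"
    if len(haps_a) != 1 or len(haps_b) != 1:
        # The cis/trans question only applies to heterozygous variants.
        return "not heterozygous"
    return "cis" if haps_a == haps_b else "trans"


if __name__ == "__main__":
    print(phase_relation("0|1", "0|1"))  # both variants on the same copy
    print(phase_relation("0|1", "1|0"))  # one variant on each copy
    print(phase_relation("0/1", "1|0"))  # first genotype is unphased
```

For a recessive condition, two damaging variants in the same gene matter most when they are in trans, since neither copy of the gene is then left intact. This is one reason phase information can help when parental samples are missing.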
“Long-read sequencing is going to be a lot better in certain cases, and we are taking steps to prove that,” Negi said.
Leading in methods
UC Santa Cruz Genomics Institute researchers have a rich history of innovation and expertise in long-read sequencing and are actively developing methods to optimize sequencing and analysis for a wide range of health research applications. Many of the techniques that researchers developed to achieve feats such as the first truly complete “telomere-to-telomere” reference genome are now being used to improve patient outcomes.
“Reinforcing earlier findings, we found that the benefits of using long-read sequencing were increased substantially by using a complete, so-called ‘telomere-to-telomere’ reference genome in place of the existing incomplete but widely used genomic reference,” Miga said. “We anticipate that pangenomes — references that represent diverse human variation — will extract even more benefit from new long-read sequencing technologies.”
Paten and Miga’s labs partnered with clinicians to work on the cases of 42 patients with rare diseases — some of whom received a diagnosis via short-read methods or other specialized testing, and some of whom were still undiagnosed. In some cases, the researchers had access to parental genetic information, but in others, they did not.
Long-read sequencing of the patients was led by the Miga Lab using nanopore sequencing, a method for long-read sequencing pioneered at UCSC, to achieve highly accurate, end-to-end reads of the patients’ genomes for about $1,000 per sample.
The genomic data was analyzed using computational methods developed in Paten’s lab to find small and large variants, phasing data, and methylation data, all using one pipeline called the Napu pipeline. The analysis process takes around a day or less, depending on the computer processing speed, and costs $100.
Solving cases
After sequencing and analyzing the patient data, the researchers found that long reads provided a more exhaustive dataset than short-read sequencing can.
Long-read sequencing delivered a conclusive diagnosis for 11 of the 42 patients in the cohort, providing everything that was known from the short-read data as well as additional information, including additional rare candidate variants, long-range phasing, and methylation, all in a single, cost-efficient, and rapid protocol.
The 11 diagnosed cases included four of congenital adrenal hyperplasia (a rare condition where the adrenal glands are enlarged and fail to function properly). The gene responsible for this disease is in a particularly challenging region of the genome: it can't be characterized with short-read sequencing technology, and the current clinical test is cumbersome and incomplete.
“To solve these cases, we developed a new pangenomic tool that integrates new high-quality assemblies like the 'telomere-to-telomere' reference genome,” said Monlong, who began this project as a postdoctoral scholar in Paten’s lab and continued in his current position at INSERM in France. “We were excited to see that we could find and phase the pathogenic variants of all four patients suffering from this disease in our cohort. In the future, it might offer a rapid and comprehensive clinical test. We know many rare diseases involve regions of the human genome that have been historically difficult to study, so our results encourage us to extend our approach to more of those diseases that have been at a standstill for a long time.”
In addition, two cases involved disorders of sex development, and one rare case involved Leydig cell hypoplasia, which affects male sexual development because the Leydig cells in the testes are underdeveloped. Four cases of neurodevelopmental disorders, each representing a long and challenging diagnostic odyssey, were also finally resolved.
“Long read sequencing is likely the next best test for unsolved cases with either compelling variants in a single gene or a clear phenotype,” Negi said. “It can serve as a single diagnostic test, reducing the need for multiple clinical visits and transforming a years-long diagnostic journey into a matter of hours.”
On average, each patient had 280 genes (including some Mendelian disease genes, which are linked to inherited disorders caused by single-gene mutations) with significant protein-coding regions uniquely covered by long reads and undetected by short reads.
“There’s so much more of the genome that the long reads can unlock,” Negi said. “But, it will take some time until we can fully interpret this new information revealed by long reads. This data has been absent from our clinical databases, which were built using short-read analysis and mapping to the standard reference. We showed that long reads are uncovering about 5.8% more of the telomere-to-telomere genome that short reads simply couldn’t access.”
Other UC Santa Cruz researchers involved in this research include Brandy McNulty, Ivo Violich, Joshua Gardner, Todd Hillaker, and Sara O’Rourke.
This research was funded in part by the Chan Zuckerberg Initiative.
Long read sequencing reveals more genetic information while cutting time and cost of rare disease diagnoses
2025-01-24