Bioinformatics tool accurately tracks synthetic DNA
Computer scientists show benefits of bioinformatics with PlasmidHawk
2021-02-26
(Press-News.org) HOUSTON - (Feb. 26, 2021) - Tracking the origin of synthetic genetic code has never been simple, but it can be done through bioinformatic or, increasingly, deep learning computational approaches.
Though the latter gets the lion's share of attention, new research by computer scientist Todd Treangen of Rice University's Brown School of Engineering is focused on whether sequence alignment and pan-genome-based methods can outperform recent deep learning approaches in this area.
"This is, in a sense, against the grain given that deep learning approaches have recently outperformed traditional approaches, such as BLAST," he said. "My goal with this study is to start a conversation about how to combine the expertise of both domains to achieve further improvements for this important computational challenge."
Treangen, who specializes in developing computational solutions for biosecurity and microbial forensics applications, and his team at Rice have introduced PlasmidHawk, a bioinformatics approach that analyzes DNA sequences to help identify the source of engineered plasmids of interest.
"We show that a sequence alignment-based approach can outperform a convolutional neural network (CNN) deep learning method for the specific task of lab-of-origin prediction," he said.
The researchers led by Treangen and lead author Qi Wang, a Rice graduate student, reported their results in an open-access paper in Nature Communications.
The open-source software is available here: https://gitlab.com/treangenlab/plasmidhawk.
The program may be useful not only for tracking potentially harmful engineered sequences but also for protecting intellectual property.
"The goal is either to help protect intellectual property rights of the contributors of the sequences or help trace the origin of a synthetic sequence if something bad does happen," Treangen said.
Treangen noted a recent high-profile paper describing a recurrent neural network (RNN) deep learning technique to trace the originating lab of a sequence. That method achieved 70% accuracy in predicting the single lab of origin. "Despite this important advance over the previous deep learning approach, PlasmidHawk offers improved performance over both methods," he said.
The Rice program directly aligns unknown strings of code from genome data sets and matches them to pan-genomic regions that are common or unique to synthetic biology research labs.
"To predict the lab-of-origin, PlasmidHawk scores each lab based on matching regions between an unclassified sequence and the plasmid pan-genome, and then assigns the unknown sequence to a lab with the minimum score," Wang said.
In the new study, using the same dataset as one of the deep learning experiments, the researchers reported the successful prediction of "unknown sequences' depositing labs" 76% of the time. They found that 85% of the time the correct lab was in the top 10 candidates.
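The scoring and evaluation described above can be illustrated with a minimal Python sketch. It is not the PlasmidHawk implementation: the toy pan-genome, the region and lab names, and the weighting (a matched region counts for more when fewer labs share it) are assumptions made for illustration. Only two ideas come from the description above: the lab with the minimum score is chosen, and accuracy is measured for the top candidate and for the top k candidates.

```python
from collections import defaultdict


def score_labs(matched_regions, pan_genome):
    """Return {lab: score}; a lower score means a better match.

    Hypothetical scoring, not PlasmidHawk's formula: each matched region
    lowers the score of every lab annotated on it by 1 / (number of labs
    sharing that region), so lab-specific regions weigh more.
    """
    scores = defaultdict(float)
    for region in matched_regions:
        labs = pan_genome.get(region, set())
        for lab in labs:
            scores[lab] -= 1.0 / len(labs)
    return dict(scores)


def rank_labs(matched_regions, pan_genome):
    """Order candidate labs from minimum score upward (ties broken by name)."""
    scores = score_labs(matched_regions, pan_genome)
    return sorted(scores, key=lambda lab: (scores[lab], lab))


def top_k_accuracy(ranked_predictions, true_labs, k):
    """Fraction of queries whose true lab is among the first k candidates."""
    if len(ranked_predictions) != len(true_labs):
        raise ValueError("need exactly one ranking per true label")
    if not true_labs:
        return 0.0
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_labs)
    )
    return hits / len(true_labs)


# Toy pan-genome (assumed data): region id -> labs whose plasmids contain it.
PAN_GENOME = {
    "r1": {"lab_A", "lab_B", "lab_C"},
    "r2": {"lab_A", "lab_B"},
    "r3": {"lab_B"},
    "r4": {"lab_C"},
}

# Regions each unknown sequence aligned to, and its true depositing lab.
QUERIES = [["r1", "r2", "r3"], ["r1", "r4"], ["r1", "r2"]]
TRUE_LABS = ["lab_B", "lab_C", "lab_B"]

RANKINGS = [rank_labs(query, PAN_GENOME) for query in QUERIES]
TOP_1 = top_k_accuracy(RANKINGS, TRUE_LABS, 1)
TOP_2 = top_k_accuracy(RANKINGS, TRUE_LABS, 2)
```

In this toy data the third query matches only regions shared by lab_A and lab_B, so the two labs tie and the true lab is recovered only among the top two candidates. That is the kind of case that separates top-1 accuracy from top-10 accuracy.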
The researchers said that, unlike the deep learning approaches, PlasmidHawk requires less pre-processing of data and does not need retraining when new sequences are added to an existing project. It also differs from those approaches by offering a detailed explanation for its lab-of-origin predictions.
"The goal is to fill your computational toolbox with as many tools as possible," said co-author Ryan Leo Elworth, a postdoctoral researcher at Rice. "Ultimately, I believe the best results will combine machine learning, more traditional computational techniques and a deep understanding of the specific biological problem you are tackling."
INFORMATION:
Rice graduate students Bryce Kille and Tian Rui Liu are co-authors of the paper. Treangen is an assistant professor of computer science.
The research was supported by the National Institutes of Health via the National Institute of Neurological Disorders and Stroke, the Office of the Director of National Intelligence and the Army Research Office. Addgene provided access to the DNA sequences of the deposited plasmids.
Read the abstract at http://dx.doi.org/10.1038/s41467-021-21180-w.
This news release can be found online at https://news.rice.edu/2021/02/26/bioinformatics-tool-accurately-tracks-synthetic-dna/
Follow Rice News and Media Relations via Twitter @RiceUNews.
Related materials:
Mitochondrial stress 'ages' astronauts: http://news.rice.edu/2020/12/02/mitochondrial-stress-ages-astronauts/
Flood of genome data hinders efforts to ID bacteria: http://news.rice.edu/2018/10/30/flood-of-genome-data-hinders-efforts-to-id-bacteria-2/
Treangen Lab: https://sites.google.com/view/treangen/home
Rice Department of Computer Science: https://csweb.rice.edu
George R. Brown School of Engineering: https://engineering.rice.edu
Image for download:
https://news-network.rice.edu/news/files/2021/02/0221_PLASMID-1a-WEB.jpg
CAPTION: Todd Treangen. (Credit: Tommy LaVergne/Rice University)
Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation's top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 3,978 undergraduates and 3,192 graduate students, Rice's undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction and No. 1 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger's Personal Finance.
Jeff Falk
713-348-6775
jfalk@rice.edu
Mike Williams
713-348-6728
mikewilliams@rice.edu