
Could all your digital photos be stored as DNA?

A technique for labeling and retrieving DNA data files from a large pool could help make DNA data storage feasible

2021-06-10
(Press-News.org) CAMBRIDGE, MA -- On Earth right now, there are about 10 trillion gigabytes of digital data, and every day, humans produce emails, photos, tweets, and other digital files that add up to another 2.5 million gigabytes of data. Much of this data is stored in enormous facilities known as exabyte data centers (an exabyte is 1 billion gigabytes), which can be the size of several football fields and cost around $1 billion to build and maintain.

Many scientists believe that an alternative solution lies in the molecule that contains our genetic information: DNA, which evolved to store massive quantities of information at very high density. A coffee mug full of DNA could theoretically store all of the world's data, says Mark Bathe, an MIT professor of biological engineering.

"We need new solutions for storing these massive amounts of data that the world is accumulating, especially the archival data," says Bathe, who is also an associate member of the Broad Institute of MIT and Harvard. "DNA is a thousandfold denser than even flash memory, and another property that's interesting is that once you make the DNA polymer, it doesn't consume any energy. You can write the DNA and then store it forever."

Scientists have already demonstrated that they can encode images and pages of text as DNA. However, an easy way to pick out the desired file from a mixture of many pieces of DNA will also be needed. Bathe and his colleagues have now demonstrated one way to do that, by encapsulating each data file into a 6-micrometer particle of silica, which is labeled with short DNA sequences that reveal the contents.

Using this approach, the researchers demonstrated that they could accurately pull out individual images stored as DNA sequences from a set of 20 images. Given the number of possible labels that could be used, this approach could scale up to 10^20 files.

Bathe is the senior author of the study, which appears today in Nature Materials. The lead authors of the paper are MIT senior postdoc James Banal, former MIT research associate Tyson Shepherd, and MIT graduate student Joseph Berleant.

Stable storage

Digital storage systems encode text, photos, or any other kind of information as a series of 0s and 1s. This same information can be encoded in DNA using the four nucleotides that make up the genetic code: A, T, G, and C. For example, G and C could be used to represent 0 while A and T represent 1.
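The mapping in that example can be sketched in a few lines of Python. This is only an illustration of the one-bit-per-nucleotide idea (G or C for 0, A or T for 1), not the encoding the researchers used; the function names and the rule of alternating between the two candidate bases are assumptions made for the sketch.

```python
# Illustrative sketch: one bit per nucleotide, G/C = 0 and A/T = 1.
# The alternation rule below is an assumption, not the study's scheme.
ZERO_BASES = "GC"  # either base stands for a 0 bit
ONE_BASES = "AT"   # either base stands for a 1 bit


def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, most significant bit first."""
    bases = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            candidates = ONE_BASES if bit else ZERO_BASES
            # Alternate between the two candidate bases by position so
            # long runs of one letter are avoided (hypothetical choice).
            bases.append(candidates[len(bases) % 2])
    return "".join(bases)


def decode(strand: str) -> bytes:
    """Recover the original bytes from a strand produced by encode()."""
    if len(strand) % 8 != 0:
        raise ValueError("strand length must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(strand), 8):
        byte = 0
        for base in strand[start:start + 8]:
            if base in ZERO_BASES:
                bit = 0
            elif base in ONE_BASES:
                bit = 1
            else:
                raise ValueError(f"unexpected base: {base!r}")
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

With this sketch, encode(b"\x00") gives "GCGCGCGC", and decode(encode(b"cat")) returns b"cat". Because two bases share each bit value, this scheme stores one bit per nucleotide, half of the two-bit maximum that four letters allow.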

DNA has several other features that make it desirable as a storage medium: It is extremely stable, and it is fairly easy (but expensive) to synthesize and sequence. Also, because of its high density -- each nucleotide, equivalent to up to two bits, is about 1 cubic nanometer -- an exabyte of data stored as DNA could fit in the palm of your hand.
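The density figures above can be checked with a rough back-of-envelope calculation. The sketch below assumes ideal packing at the quoted values (two bits and 1 cubic nanometer per nucleotide) and ignores any overhead from addressing, redundancy, or packaging, so it is a lower bound on volume rather than a practical estimate. The constant and function names are invented for the example.

```python
# Back-of-envelope volume estimate from the figures quoted in the text.
BITS_PER_NUCLEOTIDE = 2        # upper bound quoted in the text
NM3_PER_NUCLEOTIDE = 1         # approximate volume quoted in the text
NM3_PER_CM3 = 10 ** 21         # (10^7 nanometers per centimeter) cubed
BYTES_PER_GIGABYTE = 10 ** 9
BYTES_PER_EXABYTE = 10 ** 18   # 1 billion gigabytes


def dna_volume_cm3(num_bytes: int) -> float:
    """Idealized volume, in cubic centimeters, of DNA holding num_bytes."""
    bits = num_bytes * 8
    # Round up so a partially used nucleotide still counts as one.
    nucleotides = -(-bits // BITS_PER_NUCLEOTIDE)
    return nucleotides * NM3_PER_NUCLEOTIDE / NM3_PER_CM3


one_exabyte = dna_volume_cm3(BYTES_PER_EXABYTE)
world_data = dna_volume_cm3(10 ** 13 * BYTES_PER_GIGABYTE)  # 10 trillion GB
print(f"1 exabyte: {one_exabyte} cm^3")
print(f"10 trillion gigabytes: {world_data} cm^3")
```

Under these assumptions an exabyte occupies about 0.004 cubic centimeters (4 cubic millimeters), and 10 trillion gigabytes occupies about 40 cubic centimeters, which is consistent with the palm-of-your-hand and coffee-mug comparisons.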

One obstacle to this kind of data storage is the cost of synthesizing such large amounts of DNA. Currently it would cost $1 trillion to write one petabyte of data (1 million gigabytes). To become competitive with magnetic tape, which is often used to store archival data, Bathe estimates that the cost of DNA synthesis would need to drop by about six orders of magnitude. Bathe says he anticipates that will happen within a decade or two, similar to how the cost of storing information on flash drives has dropped dramatically over the past couple of decades.

Aside from the cost, the other major bottleneck in using DNA to store data is the difficulty in picking out the file you want from all the others.

"Assuming that the technologies for writing DNA get to a point where it's cost-effective to write an exabyte or zettabyte of data in DNA, then what? You're going to have a pile of DNA, which is a gazillion files, images or movies and other stuff, and you need to find the one picture or movie you're looking for," Bathe says. "It's like trying to find a needle in a haystack."

DNA files are conventionally retrieved using PCR (polymerase chain reaction). Each DNA data file includes a sequence that binds to a particular PCR primer. To pull out a specific file, that primer is added to the sample to find and amplify the desired sequence. However, one drawback to this approach is that there can be crosstalk between the primer and off-target DNA sequences, causing unwanted files to be pulled out. Also, the PCR retrieval process requires enzymes and ends up consuming most of the DNA that was in the pool.

"You're kind of burning the haystack to find the needle, because all the other DNA is not getting amplified and you're basically throwing it away," Bathe says.

File retrieval

As an alternative approach, the MIT team developed a new retrieval technique that involves encapsulating each DNA file into a small silica particle. Each capsule is labeled with single-stranded DNA "barcodes" that correspond to the contents of the file. To demonstrate this approach in a cost-effective manner, the researchers encoded 20 different images into pieces of DNA about 3,000 nucleotides long, which is equivalent to about 100 bytes. (They also showed that the capsules could fit DNA files up to a gigabyte in size.)

Each file was labeled with barcodes corresponding to labels such as "cat" or "airplane." When the researchers want to pull out a specific image, they remove a sample of the DNA and add primers that correspond to the labels they're looking for -- for example, "cat," "orange," and "wild" for an image of a tiger, or "cat," "orange," and "domestic" for a housecat.

The primers are labeled with fluorescent or magnetic particles, making it easy to pull out and identify any matches from the sample. This allows the desired file to be removed while leaving the rest of the DNA intact to be put back into storage. The retrieval process supports Boolean logic statements: a query such as "president AND 18th century" returns George Washington, similar to what is retrieved with a Google image search.
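As a software analogy for this label-based selection, the sketch below models each capsule as a file name with a set of text labels and treats an AND query as a subset test. It says nothing about the chemistry or the sorting hardware; the pool contents, the extra labels on the airplane image, and the function name select_all are all made up for illustration.

```python
# Software analogy only: capsules are modeled as name -> set of labels.
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical pool; the labels echo the examples in the text.
POOL: Dict[str, FrozenSet[str]] = {
    "tiger.jpg": frozenset({"cat", "orange", "wild"}),
    "housecat.jpg": frozenset({"cat", "orange", "domestic"}),
    "airplane.jpg": frozenset({"airplane", "flying"}),
}


def select_all(pool: Dict[str, FrozenSet[str]],
               required: Iterable[str]) -> List[str]:
    """Return the files that carry every required label (logical AND)."""
    wanted = frozenset(required)
    # A file matches when the wanted labels are a subset of its labels.
    return sorted(name for name, labels in pool.items() if wanted <= labels)


print(select_all(POOL, ["cat", "orange", "wild"]))
print(select_all(POOL, ["cat", "orange"]))
```

The first query matches only the tiger image, while the broader second query matches both the housecat and the tiger. Unlike this in-memory lookup, the physical method removes the matching capsules from the pool and leaves the rest intact.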

"At the current state of our proof-of-concept, we're at the 1 kilobyte per second search rate. Our file system's search rate is determined by the data size per capsule, which is currently limited by the prohibitive cost to write even 100 megabytes worth of data on DNA, and the number of sorters we can use in parallel. If DNA synthesis becomes cheap enough, we would be able to maximize the data size we can store per file with our approach," Banal says.

For their barcodes, the researchers used single-stranded DNA sequences from a library of 100,000 sequences, each about 25 nucleotides long, developed by Stephen Elledge, a professor of genetics and medicine at Harvard Medical School. If you put two of these labels on each file, you can uniquely label 10^10 (10 billion) different files, and with four labels on each, you can uniquely label 10^20 files.
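The label counts follow from simple counting, as the short check below shows. The quoted figures match the library size raised to the number of labels per file, which treats each label position as an independent choice. The second function is an added comparison, not something the release describes: if labels had to be distinct and their order did not matter, the count would be smaller. Both function names are invented for the sketch.

```python
# Counting check for the barcode figures quoted in the text.
from math import comb

LIBRARY_SIZE = 100_000  # barcode sequences in the library


def ordered_label_capacity(library_size: int, labels_per_file: int) -> int:
    """Label combinations when each position is chosen independently."""
    return library_size ** labels_per_file


def unordered_label_capacity(library_size: int, labels_per_file: int) -> int:
    """Label sets when labels are distinct and order is ignored."""
    return comb(library_size, labels_per_file)


print(ordered_label_capacity(LIBRARY_SIZE, 2) == 10 ** 10)
print(ordered_label_capacity(LIBRARY_SIZE, 4) == 10 ** 20)
print(unordered_label_capacity(LIBRARY_SIZE, 2))
```

Two labels per file give 10^10 combinations and four give 10^20 under the independent-choice count. Under the stricter unordered count, two labels would give roughly 5 billion distinct sets.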

Bathe envisions that this kind of DNA encapsulation could be useful for storing "cold" data, that is, data that is kept in an archive and not accessed very often. His lab is spinning out a startup, Cache DNA, that is now developing technology for long-term storage of DNA, both for DNA data storage in the long term and for clinical and other preexisting DNA samples in the near term.

"While it may be a while before DNA is viable as a data storage medium, there already exists a pressing need today for low-cost, massive storage solutions for preexisting DNA and RNA samples from Covid-19 testing, human genomic sequencing, and other areas of genomics," Bathe says.

INFORMATION:

The research was funded by the Office of Naval Research, the National Science Foundation, and the U.S. Army Research Office.


