Using DNA To Boost Digital Data Storage And Processing
17:17 minutes
You might be familiar with a gigabyte, one of the most popular units of measure for computer storage. A two-hour movie is 3 gigabytes on average, while your phone can probably store 256 gigabytes.
But did you know that your body also stores information in its own way?
We see this in DNA, which has the instructions needed for an organism to develop, survive, and reproduce. In computing storage terms, each cell of our body contains about 1.5 gigabytes worth of data. And with about 30 trillion cells in our bodies, we could theoretically store about 45 trillion gigabytes—also known as 45 zettabytes—which is equivalent to about one fourth of all the data in the world today.
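The arithmetic behind that estimate can be checked in a few lines. This is a rough sketch that assumes decimal units (1 gigabyte = 10^9 bytes, 1 zettabyte = 10^21 bytes) and takes the per-cell and cell-count figures from the paragraph above at face value. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 ZB = 10**21 bytes).
GIGABYTE = 10**9
ZETTABYTE = 10**21

bytes_per_cell = 15 * GIGABYTE // 10   # about 1.5 GB per cell
cells_in_body = 30 * 10**12            # about 30 trillion cells

total_bytes = bytes_per_cell * cells_in_body
total_zettabytes = total_bytes // ZETTABYTE

# If that total is about one fourth of the world's data, the implied world
# total is four times larger. This is what the claim implies, not an
# independently sourced figure.
implied_world_zettabytes = total_zettabytes * 4

print(f"Whole body: about {total_zettabytes} zettabytes")
print(f"Implied world total: about {implied_world_zettabytes} zettabytes")
```

Integer arithmetic is used throughout so the result is exact rather than subject to floating-point rounding.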
Recently, a group of researchers was able to develop a technology that allows computer storage and processing using DNA’s ability to store information by turning genetic code into binary code. This technology could have a major impact on the way we do computing and digital storage.
To explain more about this technology, SciFri guest host Sophie Bushwick is joined by two professors from North Carolina State University’s Department of Chemical and Biomolecular Engineering, Dr. Albert Keung and Dr. Orlin Velev.
Dr. Albert Keung is an associate professor, University Faculty Scholar & Goodnight Distinguished Scholar in the Department of Chemical and Biomolecular Engineering at NC State University in Raleigh, North Carolina.
Dr. Orlin Velev is the S. Frank and Doris Culberson Distinguished Professor in the Department of Chemical and Biomolecular Engineering at NC State University in Raleigh, North Carolina.
SOPHIE BUSHWICK: This is Science Friday. I’m Sophie Bushwick. Your genetic code contains a ton of data. It has all the instructions your body needs to develop, survive, and reproduce. And DNA is an incredibly compact way to store that information. Each one of our cells contains the equivalent of about 1.5 gigabytes of DNA data. That might not seem like too much. In comparison, it takes about three gigabytes to store a two-hour movie.
But with an estimated 30 trillion cells in our bodies, all that DNA adds up to roughly 45 trillion gigabytes or 45 zettabytes of storage. That’s enough to encode roughly one fourth of all the data in the world today. In recent years, researchers have developed technologies that tap into DNA storage capabilities. By converting genetic code into binary code, they can do things like encoding a book or even all of Wikipedia in the form of DNA base pairs.
And now, researchers are going beyond storage and using DNA as the basis for computers. To explain more about this groundbreaking technology, I’m joined by two professors from North Carolina State University’s Department of Chemical and Biomolecular Engineering, Dr. Albert Keung and Dr. Orlin Velev. Welcome to Science Friday. Thank you so much for being here.
ALBERT KEUNG: Thank you, Sophie, for having us.
ORLIN VELEV: It is a pleasure to be on this very interesting discussion.
SOPHIE BUSHWICK: Thanks. And how does DNA store information?
ALBERT KEUNG: The simplest way to think about it would be, DNA is a string of letters, A, T, C, and G. And so you can have any length string of letters that you want. And you can have many of these strings. The simplest way to convert the letters into binary, or zeros and ones, would be an A could be a 00, a T could be a 01, a G could be a 10, and a C could be a 11. And so you just go letter by letter and convert that into these digits.
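The letter-by-letter conversion described here can be sketched in code. The two-bit mapping (A=00, T=01, G=10, C=11) comes from the interview. The function names and the choice to work on whole bytes are assumptions made for illustration, and practical DNA storage schemes typically add error correction and sequence constraints that this sketch leaves out.

```python
# Two-bit mapping as stated in the interview: A=00, T=01, G=10, C=11.
BASE_TO_BITS = {"A": "00", "T": "01", "G": "10", "C": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as DNA letters, two bits per letter (hypothetical helper)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode DNA letters back into bytes (hypothetical helper)."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 letters (one byte)")
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"Hi")
print(strand)                # TAGATGGT
print(dna_to_bytes(strand))  # b'Hi'
```

Each byte becomes four letters, so a strand of n letters carries 2n bits under this mapping.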
SOPHIE BUSHWICK: And what about if you want to go beyond storage and use DNA to process information as well? How does that work?
ALBERT KEUNG: So this has actually been a relatively active field for over 20 years, since Leonard Adleman created the first computation with DNA. And there are many different flavors of this computation. You could use enzymes that can recognize and chew up certain pieces of DNA that have certain sequences. There are types of computations that use interactions between different DNA molecules to bind or unbind each other, and execute logical operations that way. The past two decades have actually generated many creative versions of computation.
SOPHIE BUSHWICK: What about your work? What did your latest study focus on?
ALBERT KEUNG: There’s been two decades of work in DNA computation. And in parallel, but somewhat disjoint, there’s been also work on storing information in DNA. What we wanted to do was create something that was compatible with both storing and computation, basically, try to create an early full computer, something that we think could help spark the imagination of young scientists out there that might be thinking about getting into research, and engineering, and science.
And so our focus is really on, can we create something that can both store, but it’s also warm enough, kind of flexible enough, to be used dynamically for things like computation?
SOPHIE BUSHWICK: Can you give an example of a computation that it could be used for?
ALBERT KEUNG: Yeah. So one of the computations I found really fun was work over a decade ago from Princeton, where they computed a chess problem. And so this was one type of computation that we emulated and another was sudoku. These puzzles basically have kind of similar rules. So basically it’s asking where can you put different chess pieces on a chess board so that they don’t attack each other or don’t attack a certain piece.
Or in sudoku, where can you put zeros, ones, twos, threes, so that only one of each digit shows up in a row or column at any one time, or every row, every column adds up to six. Things like that. So you have certain puzzles, certain board configurations that you’re searching for.
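A search for valid board configurations of this kind can be sketched as a generate-and-filter loop. The interview does not say which pieces or board size were used, so this example assumes knights on a 3-by-3 board purely for illustration. The function names are hypothetical, and this is ordinary sequential code, not the molecular procedure itself.

```python
from itertools import product

SIZE = 3  # assumed board size; the interview does not specify one

# The eight (row, column) offsets a knight can jump.
KNIGHT_JUMPS = {(1, 2), (2, 1), (-1, 2), (-2, 1),
                (1, -2), (2, -1), (-1, -2), (-2, -1)}


def attacks(a, b):
    """Return True if a knight on square a attacks square b."""
    return (b[0] - a[0], b[1] - a[1]) in KNIGHT_JUMPS


def is_peaceful(board):
    """board is a tuple of 0/1 flags, one per square, in row-major order."""
    knights = [(r, c) for r in range(SIZE) for c in range(SIZE)
               if board[r * SIZE + c]]
    return not any(attacks(a, b)
                   for i, a in enumerate(knights)
                   for b in knights[i + 1:])


def peaceful_boards():
    """Generate every possible board, then keep those with no attacks."""
    return [board for board in product((0, 1), repeat=SIZE * SIZE)
            if is_peaceful(board)]


solutions = peaceful_boards()
print(f"{len(solutions)} of {2 ** (SIZE * SIZE)} boards have no attacking knights")
```

The code checks each of the 512 candidate boards one at a time. The appeal of a molecular approach, as described later in the conversation, is that many candidates can be processed in parallel.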
SOPHIE BUSHWICK: And if we want to use DNA as a computer, it has to be able, like you said, to store this information and also process it. But some of the techniques you talked about for processing information, like using enzymes to chew up the DNA, that doesn’t seem to be possible if you want to also store the DNA. So how did you get around that problem?
ALBERT KEUNG: Exactly. I think that’s kind of the disconnect that we were trying to find a solution for. So we needed a way to preserve and anchor the DNA without giving up its high density, information density, but also make it so that you can access the data and compute upon it without destroying the database. So this is where we linked up with Orlin Velev’s group, who pioneered a nanomaterial that maybe he can tell you about.
The key discovery was that DNA adhered to this material stably, but allowed enzymes to come in, make copies of the DNA into RNA. And then we could use that RNA to do computations without disturbing the original DNA.
SOPHIE BUSHWICK: Got it. So, yes, I’d love to hear more about this nanomaterial.
ORLIN VELEV: It has really been a pleasure to participate in this project, because this is a really multidisciplinary investigation. Some time ago, we got together with my colleague Albert, and we were discussing how we can basically use the innovative nanomaterials that we make and study in my group in order to manipulate and process DNA.
And we had just come across this new material, which we called soft dendritic colloids. It is a fibrillar material, which is made out of biopolymer and it is branched. It has this hierarchical structure. So you have a thicker branch in the middle, and then thinner and thinner branches, which come to be nanofibers all around. And nanofibers tend to be very sticky in physical perspectives.
The reason gecko legs can run on any surfaces– that is, gecko lizards can run on any surfaces with their legs, is that they have these sticky mats of nanofibers. So it turned out that our nanofibers in the new materials that we are making can be very sticky to DNA. Basically, we have a particle of fibrillar nature that can bind DNA. And in this way, we can immobilize the molecule, we can protect it in a physical and mechanical sense, and we can even manipulate the whole cluster of DNA that has been collected by using magnetic particles which are also included in the structure.
So basically, while Albert has been providing the software, in a sense, we have been trying to provide a hardware that is going to allow the whole thing to be protected and manipulated.
SOPHIE BUSHWICK: When you describe the polymers as sort of like the tree branch with thinner and thinner pieces coming off it, it makes me picture the DNA sort of tangled up in a forest of trees, but on a very, very tiny scale. Is that an accurate way to think about it?
ORLIN VELEV: Well, that is an interesting analogy. Well, if you think of DNA as, let’s say, a delicate biological object, such as a bird. A tree is an ideal way to protect a bird in the sense that it can fly in. The branches would protect it, but then, it can still go out. So basically, we have access to the inside, but we have protection from the outside when the molecule is hosted within this hierarchical structure.
And the other thing that’s important about hierarchical structures of this type is kind of a little bit more scientifically said, they have very high surface to volume ratio. So we use a small amount of material, but we create lots of surface area that is then available for DNA molecules to bind. So we do not use too much material, but we can bind lots of DNA on those particles. And as I mentioned, we can also add magnetic nanoparticles during the formation. So the whole cluster at the end is going to be magnetic.
SOPHIE BUSHWICK: Got it. I mean, does that mean that I could sort of– I could plug a computer monitor into the DNA computer and it would theoretically run?
ALBERT KEUNG: No, that– well, actually, yes. You would need an electronic interface. And the time scales of the operations would be very slow, compared to what you’re used to.
SOPHIE BUSHWICK: How slow?
ALBERT KEUNG: On the order of probably a few hours to enter in a command and then get the result.
SOPHIE BUSHWICK: OK. So what does this whole setup look like? If I’ve got– I’ve got my computer monitor, I’m waiting on the results, but what does the DNA computer part of this setup look like?
ALBERT KEUNG: You probably will always need an electronic computer as an interface. And what it would do is basically act as an intermediary between a very high-density DNA setup and database. It would basically process whatever data that you want from it and then display it so that a human could see it. The setup would look something like microfluidics. So you have either tiny, very thin kind of capillary-like tubing or it could be microfluidics.
That’s what we used in this work. But you could also create microfluidic devices that are like little chips that have very small, narrow channels inside. And you could put different databases within these channels, and flow basically different solutions containing your enzymes, or just water, through these channels in order to make copies or execute computations, access the data that you want.
You would then flow, say, the RNA copies of the data that you want out of the nanomaterial that’s linked to the DNA. The RNA would come out of that and flow into what we call a sequencer. There are several different technologies. The one that we use is called the Oxford Nanopore. So those RNA molecules would then flow through the nanopore and give off different electrical signals as they pass through that pore, and those signals would correspond to the different letters. And you would get that readout that would get sent to your computer.
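The last step of that readout, turning sequenced letters back into digital data, might look something like the sketch below. It assumes the RNA read carries the same letters as the stored DNA with U in place of T, and it reuses the simple two-bit mapping mentioned earlier in the conversation (A=00, T=01, G=10, C=11). The function name is hypothetical, and converting raw electrical signals into letters, along with any error correction, is left out entirely.

```python
# Two-bit mapping from earlier in the conversation: A=00, T=01, G=10, C=11.
BASE_TO_BITS = {"A": "00", "T": "01", "G": "10", "C": "11"}


def rna_read_to_bytes(read: str) -> bytes:
    """Decode a sequenced RNA read into bytes (hypothetical helper).

    Assumption: the read matches the stored DNA letter for letter,
    except that RNA uses U where DNA uses T.
    """
    dna_letters = read.upper().replace("U", "T")
    if len(dna_letters) % 4 != 0:
        raise ValueError("read length must be a multiple of 4 letters (one byte)")
    bits = "".join(BASE_TO_BITS[letter] for letter in dna_letters)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(rna_read_to_bytes("UAGAUGGU"))  # b'Hi'
```

Because only the RNA copy is read, the original DNA stays in place and can be copied again later.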
SOPHIE BUSHWICK: And I know that DNA is a very compact way of storing information. I mean, each one of our cells has about two meters of it, and then it’s compressed into just about six microns. So how much smaller could we make our computers if they use DNA for some of this data storage and processing?
ALBERT KEUNG: Yeah. So theoretically, if you do what’s called freeze drying of the DNA, meaning you basically evaporate all of the water away and you’re left only with the DNA, it can be very, very compact. You could literally store all of the world’s information, square foot.
ORLIN VELEV: If I can go back to the material aspects of this work, DNA can store lots of information, but it is also a delicate molecule. And it is easy to– it is easy to encode the material, to encode information in DNA, but not that easy to then find the right molecule and pull it out of the rest. So that’s why you also need the materials component, which is how do we protect, immobilize, move around, sort out the DNA.
So that’s what makes this research, hopefully, interesting is that it really has all those informatics aspects, and molecular, and materials, and electronics even.
SOPHIE BUSHWICK: That’s a really good point because we think about data storage– I mean, I know that it doesn’t last forever. A USB drive might only work reliably for a decade or less. So as a computer and as a storage method, how does DNA compare to other forms of data storage? How long can it preserve itself?
ALBERT KEUNG: One of the really main drivers of the DNA storage field has been not only the incredible information density, but the longevity. There’s been fossils that have been discovered that are a million years old, and people have been able to extract DNA from it. I think that that– there’s caveats to that in that. The DNA is degraded, and you aren’t able to access all of the DNA in a pristine condition.
However, very simple storage methods can preserve the DNA for a million years. This is one of the key advantages of molecular storage: you can theoretically store DNA for thousands to millions of years at near room temperature or in, like, a household freezer without having to expend very much energy to do that. In comparison, a lot of electronic media, like you mentioned, a USB stick or a lot of the long-term storage media, like tape, magnetic tape, these are actually not very stable.
I think we often think about inorganic materials as very robust and hardy, right? But they actually have a lot of defects actually just coming out of the manufacturing plants. And a lot of these defects are engineered around in your devices. And over a few years, radiation that naturally comes from space can degrade your devices, or just wear and tear from heat, oxidation.
And so even the tape storage that’s used for long-term, archival storage of data, those often you need to copy the material every 5 to 10 years onto a new tape reel. And you have to repeat that every 5 to 10 years.
SOPHIE BUSHWICK: And you’ve said that DNA could revolutionize computing. What do you see this tech being used for?
ALBERT KEUNG: So, I do not really see it as, you know, replacing your laptops or personal computing. But I think that there’s a lot of things that are important for our everyday economy and society that rely on computing in the background that we don’t know about, we don’t see. So things that are happening at data centers, these really large buildings with massive energy and land footprints that are executing computations for us.
So even when we do like a Google search flight, where is that computation happening? It’s not actually happening on your computer. It’s off somewhere else. And there’s a lot of these large-scale computations that are really important for industry, for academic research. And I think DNA could be very powerful for those types of processes, where you need to make very, very complicated and demanding calculations that require a lot of storage, but also parallelized computation.
ORLIN VELEV: Yes. If I may add a little bit of a different angle away from informatics, the ability to store and manipulate and deliver DNA and RNA can also find applications in other areas, such as drug delivery, vaccines, plant treatments. So really, kind of combining materials and informatics, in this case, may really have lots of potential future implications which are still to be understood, probably.
SOPHIE BUSHWICK: And what about the two of you? Where do you see your research heading now?
ALBERT KEUNG: In so many directions. We actually have a couple projects that are still ongoing related to just the properties of DNA as a material itself, as well as other types of materials that the Velev group has been pioneering and how that interacts with DNA, whether we can protect it for millennia at room temperature, for example. There’s a lot of fundamental questions as well, just about how these materials work at the nanoscale that we’re also interested in.
ORLIN VELEV: I can say that we have been really very inspired by what we have learned from Albert’s group, in the sense that there are different methods for manipulation of particles on the nanoscale, and sorting out in microfluidic devices. That was mentioned, but we have been interested also in external fields, electrical fields especially. So this has been, really, a very productive collaboration in terms of combining ideas from different areas and finding out interesting new applications for both DNA and nanoscience.
SOPHIE BUSHWICK: Thank you so much for joining us.
ALBERT KEUNG: Thank you so much, Sophie, for having us.
ORLIN VELEV: Thank you.
SOPHIE BUSHWICK: Those were North Carolina State University’s Department of Chemical and Biomolecular Engineering professors, Dr. Albert Keung and Dr. Orlin Velev.
Copyright © 2024 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/
Sophie Bushwick is senior news editor at New Scientist in New York, New York. Previously, she was a senior editor at Popular Science and technology editor at Scientific American.
Andrea Valeria Diaz Tolivia was a radio production fellow at Science Friday. Her topics of interest include the environment, engineering projects, science policy and any science topic that could make for a great sci-fi plot.