Zettabytes (that’s 10²¹ bytes) of data are currently generated every year. All of those cat videos have to be stored somewhere, and DNA is a great storage medium; it has amazing data density and is stable over millennia.
To date, people have encoded information into DNA the same way nature has, by linking the four nucleotide bases that make up DNA (A, T, C, and G) into a particular genetic sequence. Making these sequences is time-consuming and expensive, though, and the longer your sequence, the higher the chance that errors will creep in.
But DNA has an added layer of information encoded on top of the nucleotide sequence, known as epigenetics. These are chemical modifications to the nucleotides, specifically altering a C when it comes before a G. In cells, these modifications function kind of like stage directions; they can tell the cell when to use a particular DNA sequence without altering the “text” of the sequence itself. A new paper in Nature describes using epigenetics to store information in DNA without needing to synthesize new DNA sequences every time.
Typesetting with DNA
The technique uses a long strand of DNA with a set sequence—the template—and a bunch of shorter, premade DNA molecules that can base pair with specific spots on the template—the bricks. Some of the bricks contain epigenetically modified Cs, and some don’t. When a modified brick base pairs with its designated spots on the template, it acts as a signal for an enzyme to modify that spot on the template DNA strand as well, “printing” the epigenetic information onto it without any new DNA synthesis—kind of like setting movable type.
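To make the movable-type analogy concrete, here is a toy sketch in Python. It is not the paper's actual encoding scheme. It assumes, purely for illustration, that each template spot stores one bit and that a modified brick means 1 while an unmodified brick means 0. The function names (`choose_bricks`, `print_onto_template`, `read_back`) are hypothetical.

```python
from typing import List


def choose_bricks(bits: str) -> List[str]:
    """Pick one premade brick per template spot (assumed: one bit per spot)."""
    if any(b not in "01" for b in bits):
        raise ValueError("bits must contain only '0' and '1'")
    # Assumption: a modified brick stands for 1, an unmodified brick for 0.
    return ["modified" if b == "1" else "unmodified" for b in bits]


def print_onto_template(bricks: List[str]) -> List[bool]:
    """Model the enzyme: it marks a template spot only where the paired brick is modified."""
    return [brick == "modified" for brick in bricks]


def read_back(template_marks: List[bool]) -> str:
    """Recover the bit string from which template spots carry a modification."""
    return "".join("1" if marked else "0" for marked in template_marks)


message = "01000011"  # arbitrary example bits
bricks = choose_bricks(message)
template_marks = print_onto_template(bricks)
recovered = read_back(template_marks)
```

The point of the sketch is that the template sequence never changes; only the pattern of marks on it does, and that pattern is set by which premade bricks are chosen.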
This works because the modified site (a CG) will base pair with a GC on its opposite strand. But since that opposite strand runs in the other orientation, it will also look like a CG to the enzymes that make the modification.
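The symmetry described here can be checked in a few lines of Python. This is a minimal sketch: the example sequence is made up, and the helper names (`reverse_complement`, `cg_positions`) are illustrative rather than taken from any library. Reading the opposite strand in its own direction means taking the reverse complement, and CG is its own reverse complement.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand as read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def cg_positions(seq: str) -> list:
    """Return the start index of every CG pair in the sequence."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


template = "ATCGGACGTT"  # made-up example sequence
opposite = reverse_complement(template)

# Every CG on the template sits across from a CG on the opposite strand.
# A site at index i on one strand appears at index len(seq) - 2 - i on the other.
mirrored = sorted(len(template) - 2 - i for i in cg_positions(template))
same_sites = mirrored == cg_positions(opposite)
```

Because each CG site faces another CG site, an enzyme that recognizes a modified CG on one strand finds a matching CG to modify on the other.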