Scientists Use CRISPR to Store Data in DNA

In a paper published in Nature Chemical Biology, researchers described how they used CRISPR sequences to store data in DNA in a way that could preserve the integrity of the data over generations,

DNA has been the predominant information storage medium for biology and holds great promise as a next-generation high-density data medium in the digital era. Currently, the vast majority of DNA-based data storage approaches rely on in vitro DNA synthesis. As such, there are limited methods to encode digital data into the chromosomes of living cells in a single step. Here, we describe a new electrogenetic framework for direct storage of digital data in living cells. Using an engineered redox-responsive CRISPR adaptation system, we encoded binary data in 3-bit units into CRISPR arrays of bacterial cells by electrical stimulation. We demonstrate multiplex data encoding into barcoded cell populations to yield meaningful information storage and capacity up to 72?bits, which can be maintained over many generations in natural open environments. This work establishes a direct digital-to-biological data storage framework and advances our capacity for information exchange between silicon- and carbon-based entities.

Storing Data in DNA

Interesting article in Science looking at the current state of using DNA to for large-scale data storage. The articles notes that DNA storage would be,

Capable of storing 215 petabytes (215 million gigabytes) in a single gram of DNA, the system could, in principle, store every bit of datum ever recorded by humans in a container about the size and weight of a couple of pickup trucks.

The article goes on to profile Yaniv Erlich and Dina Zielinski who have worked to increase the density at which data can be stored in DNA. After converting a couple data files into binary and then splitting that into digital DNA strands,

They sent these as text files to Twist Bioscience, a San Francisco, California–based startup, which then synthesized the DNA strands. Two weeks later, Erlich and Zielinski received in the mail a vial with a speck of DNA encoding their files. To decode them, the pair used modern DNA sequencing technology. The sequences were fed into a computer, which translated the genetic code back into binary and used the tags to reassemble the six original files. The approach worked so well that the new files contained no errors, they report today in Science. They were also able to make a virtually unlimited number of error-free copies of their files through polymerase chain reaction, a standard DNA copying technique. What’s more, Erlich says, they were able to encode 1.6 bits of data per nucleotide, 60% better than any group had done before and 85% the theoretical limit.

The cost, however, is still prohibitive. According to Science, “it cost $7000 to synthesize the 2 megabytes of data in the files, and another $2000 to read it.”