Jonah Edwards, Infrastructure and Operations Manager for the Internet Archive, put together a fascinating presentation on the Internet Archive’s infrastructure.
It’s amazing to think of a 200 petabyte archive that is growing 20-24 petabytes each year, and what it takes to keep that all running.
In a paper published in Nature Chemical Biology, researchers described how they used CRISPR sequences to store data in DNA in a way that could preserve the integrity of the data over generations:
DNA has been the predominant information storage medium for biology and holds great promise as a next-generation high-density data medium in the digital era. Currently, the vast majority of DNA-based data storage approaches rely on in vitro DNA synthesis. As such, there are limited methods to encode digital data into the chromosomes of living cells in a single step. Here, we describe a new electrogenetic framework for direct storage of digital data in living cells. Using an engineered redox-responsive CRISPR adaptation system, we encoded binary data in 3-bit units into CRISPR arrays of bacterial cells by electrical stimulation. We demonstrate multiplex data encoding into barcoded cell populations to yield meaningful information storage and capacity up to 72 bits, which can be maintained over many generations in natural open environments. This work establishes a direct digital-to-biological data storage framework and advances our capacity for information exchange between silicon- and carbon-based entities.
Samsung is planning to create SSDs equipped with what it calls “fail-in-place technology” that will protect the drives from traditional failure modes.
Samsung’s FIP technology marks a new milestone in the 60-year history of storage by ensuring that SSDs maintain normal operation even when errors occur at the chip level, enabling a never-dying SSD for the first time in the industry. In the past, failure in just one out of several hundred NAND chips meant having to replace an entire SSD, causing system downtime and additional drive replacement cost. SSDs integrated with Samsung’s FIP software can detect a faulty chip, scan for any damage in data, then relocate the data into working chips. For instance, if a fault is identified in any of the 512 NAND chips inside a 30.72TB SSD, the FIP software would automatically activate error-handling algorithms at the chip level while maintaining the drive’s high, stable performance.
This technology will initially be available primarily on SSDs intended for data centers, but hopefully it will eventually find its way into consumer-level drives.
Back in 2013, Amazon announced its Amazon Glacier storage solution: cloud-based storage that was cheap, but designed for data that would need to be accessed very infrequently.
But even Glacier is expensive for some purposes. For example, I’ve got about 100 terabytes I need to back up, and even at Glacier’s low cost of $4-5/terabyte/month, that would still be ~$500/month. At that price, I might be better off buying a tape drive.
Now, Amazon has announced its Glacier Deep Archive storage solution, which is designed to go after use cases like this. At a little over $1/terabyte/month, the cost of storing 100 terabytes in the cloud is approaching the cost of tape backup.
There are a few caveats, however. First, it appears that the data stored in Glacier Deep Archive cannot be deleted. I assume that’s Amazon reducing costs by simply not making that feature available.
Second, as with the regular Glacier storage solution, getting data back out of Glacier Deep Archive is likely to be slow and more expensive than storing it. Standard retrieval for data in Glacier is around $12/terabyte. If you need faster retrieval, you can get it by paying more.
I do plan to look closely at Glacier Deep Archive and will likely use it as a sort of backup of last resort. I already have a backup system and process, but $100/month for the volume of data I have is very reasonable for an “if everything else gets screwed up” peace of mind.
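For anyone who wants to run the same back-of-the-envelope numbers on their own archive, here is a minimal sketch. It assumes the approximate prices quoted above ($4-5/terabyte/month for Glacier, roughly $1/terabyte/month for Deep Archive, about $12/terabyte for standard Glacier retrieval); the function names are mine, and real bills will include request fees and other charges this ignores.

```python
# Rough cloud archive cost estimate using the approximate prices
# quoted in this post. These are not current AWS list prices.

def monthly_storage_cost(terabytes, usd_per_tb_month):
    """Monthly storage bill in USD for a given archive size."""
    return terabytes * usd_per_tb_month


def retrieval_cost(terabytes, usd_per_tb):
    """One-time cost in USD to pull the given amount of data back out."""
    return terabytes * usd_per_tb


ARCHIVE_TB = 100  # size of the archive discussed above

glacier_low = monthly_storage_cost(ARCHIVE_TB, 4.0)
glacier_high = monthly_storage_cost(ARCHIVE_TB, 5.0)
# Assumption: "a little over $1" is rounded down to $1.00 here.
deep_archive = monthly_storage_cost(ARCHIVE_TB, 1.0)
# Assumption: the $12/TB standard Glacier retrieval rate, full restore.
full_restore = retrieval_cost(ARCHIVE_TB, 12.0)

print(f"Glacier:       ${glacier_low:,.0f}-${glacier_high:,.0f}/month")
print(f"Deep Archive:  ~${deep_archive:,.0f}/month")
print(f"Full restore:  ~${full_restore:,.0f} at the standard Glacier rate")
```

The restore line is the one to keep an eye on: at these rates, pulling everything back once costs about as much as a year of Deep Archive storage.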
It’s interesting to see how quickly the per-gigabyte price for SSDs continues to fall as companies begin introducing bigger and cheaper models.
Back in February 2018, I bought a couple of 2TB SSDs for some new laptops for about $500 each. Today, ten months later, those SSDs can be had on Amazon for $290, a 42 percent price drop in less than a year.
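The 42 percent figure is easy to check. A quick sketch, using the two prices above and a helper name of my own choosing:

```python
def percent_drop(old_price, new_price):
    """Percentage decrease from old_price to new_price."""
    return (old_price - new_price) * 100 / old_price


drop = percent_drop(500, 290)           # price per 2TB drive, USD
old_per_gb = 500 / 2000                 # assuming 1TB = 1000GB
new_per_gb = 290 / 2000

print(f"Price drop: {drop:.0f}%")
print(f"Per gigabyte: ${old_per_gb:.3f} -> ${new_per_gb:.3f}")
```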
Meanwhile, Samsung recently announced consumer-level QLC SSDs in 1TB/2TB/4TB capacities that will initially retail for $149.99, $299.99, and $599.99 respectively.
Aside from the relatively low prices, one of the interesting things about the QLC drives is their write endurance:
The 860 QVO, from the box, is given a write endurance rating equivalent to 0.3 Drive Writes Per Day (DWPD), which even for the 1TB means 300GB a day, every day, which goes above and beyond most consumer workloads.
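To see what a DWPD rating works out to in practice, multiply the drive capacity by the rating. The sketch below does that for the three announced capacities; it assumes the 0.3 DWPD figure applies to all three models and that 1TB = 1000GB, and the function name is just for illustration.

```python
def daily_write_budget_gb(capacity_tb, dwpd):
    """Gigabytes that can be written per day at the rated DWPD."""
    return capacity_tb * 1000 * dwpd


DWPD = 0.3  # rating quoted above for the 860 QVO

# Assumption: the same rating holds for the 2TB and 4TB models.
budgets = {tb: daily_write_budget_gb(tb, DWPD) for tb in (1, 2, 4)}

for tb, gb_per_day in budgets.items():
    print(f"{tb}TB drive: about {gb_per_day:.0f}GB of writes per day")
```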
Better drives, larger capacities and cheaper storage prices. What’s not to love?