Colin Percival on Flaws in Jungle Disk’s Security

Colin Percival has an in-depth look at some security issues with Jungle Disk.

A lot of people, including me, have recommended Jungle Disk because the cloud-based backup system encrypts your files before it uploads them to Amazon’s S3 service (as opposed to something like Dropbox, which encrypts files after they are uploaded, meaning Dropbox holds the key and can decrypt your files if it wants to or is ordered to by a legal authority).

The problem Percival points out is that the way Jungle Disk handles that client-side encryption is weak, to the point where attackers have a clear path to corrupting data stored on the system, replacing data stored on the system, or, ultimately, cracking your password and having free rein.

I used Jungle Disk for a couple of years as a cloud-based backup of my main data hard drive. I used a 20-character passphrase. As Percival notes in a handy chart he provides, a strong 10-character password would take about 95,000 years to crack using current techniques and a $1,000 off-the-shelf laptop. A $10,000 GPU-based password cracking box is going to reduce that to 95 years, and the CIA’s going to be able to put together a system that could rip through that in 2 years. Given that, why not emulate Alfred E. Neuman: what, me worry? But, as Percival writes,

Now, maybe you don’t have any data stored which Joe Cracker would be willing to spend 10 hours decrypting. Maybe you trust Amazon and Rackspace’s internal procedures and security measures to ensure that nobody — either breaking in from outside, or working for those companies — will have access to your “encrypted” data. Depending on who you are and what data you have stored (your credit card numbers? bank statements? how about last year’s income tax return, complete with your national tax ID number?) you might be justified in such trust. But I would say that this is profoundly missing the point: With good cryptography, you wouldn’t need to trust them.

Anyone who doubts the level of computing resources that an individual or small group of people would be willing to throw at a complex problem based entirely on speculation about the value of doing so need only take a look at some of the crazy rigs and setups being built to do nothing but mine Bitcoins. And, of course, the cost of cracking passwords is only declining with every day that passes.
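To make the scaling behind figures like those in Percival's chart concrete, here is a minimal sketch of a worst-case brute-force estimate. Everything in it is an assumption for illustration: the 95-character alphabet (printable ASCII), the exhaustive search, and especially the guess rate of 20 million per second, which I picked only because it roughly reproduces the 95,000-year laptop figure quoted above. It is not a number taken from Percival's post, and real rates depend heavily on how the software derives keys from passwords.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_years(length, guesses_per_second, alphabet_size=95):
    """Worst-case years needed to try every password of the given length.

    alphabet_size=95 assumes printable ASCII; guesses_per_second is a
    hypothetical attacker speed, not a measured figure.
    """
    keyspace = alphabet_size ** length
    return keyspace / guesses_per_second / SECONDS_PER_YEAR

# Assumed laptop rate, chosen to land near the quoted 95,000 years.
laptop_rate = 2e7
print(f"10 characters: about {brute_force_years(10, laptop_rate):,.0f} years")
print(f"11 characters: about {brute_force_years(11, laptop_rate):,.0f} years")
```

Under these assumptions each extra character multiplies the work by 95, while faster hardware only divides it by a constant factor. That is why password length matters so much, and also why cheaper hardware steadily erodes the margin of a short password.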

Again, the best solution with cloud backup or sync in general is for the user to encrypt the files before uploading them. Unfortunately, this increases security but at the cost of convenience (and maximizing convenience is apparently why Jungle Disk has these potential issues in the first place).

I Canceled My Jungle Disk Account

The other day I finally canceled my Jungle Disk account. I had canceled my Amazon S3 account a while ago, but forgot to also cancel the Jungle Disk account.

The main reason was price. The backup set of my personal documents is more than 500 GB, so using Amazon S3 was costing about $75/month. I can buy a 750 GB hard drive on Amazon for $70, and a 1 TB one for $85 if my data starts growing even faster.

So my new backup procedure is this: each month I buy a new hard drive and do a complete backup of my data drive, then do daily incremental backups for the rest of the month. At the end of the month I store that drive offsite, buy a new drive and repeat.

On a quarterly basis I also take the latest backup and make a copy on BD-R so I’m not completely dependent on magnetic media for backup.

Jungle Disk to Support Cloud Files

Jungle Disk started life as a way to back up local files to Amazon’s S3. I use it to back up the 300 GB or so of personal data on my local system, and it works great. The company behind the software was recently acquired by Rackspace, so in its next release Jungle Disk will add support for Rackspace Cloud Files,

Since Jungle Disk was started our plan has been to support multiple online storage providers and give users a choice of where they want to store their data. However until recently there haven’t been any viable alternatives to Amazon S3. We’re pleased to announce that in the next release of Jungle Disk (2.6), we’ll be adding support for Rackspace Cloud Files as an option alongside Amazon S3. Like Amazon S3, Cloud Files is a distributed, replicated, Internet-scale storage service. Rackspace is the world leader in hosting and operates data centers across the US as well as in Europe and Asia.

We’re also excited to announce the pricing for Cloud Files with Jungle Disk. Cloud Files storage will cost $0.15 per gigabyte per month with no additional charges for requests or bandwidth in either direction. You only pay for the storage you use. We expect that this simplified pricing along with Rackspace’s reputation for quality and service will make Cloud Files a great option for many users.

They’re quick to point out this means they’re supporting Cloud Files in addition to Amazon S3. In the comments section, one of the developers notes that with the upcoming release, users will be able to back up to both services simultaneously. At some point, however, Jungle Disk will support some sort of replication: back up to S3 or Cloud Files, and the data then gets replicated between the two without doubling bandwidth usage. Nice.

As for the pricing, it makes sense for Cloud Files to drop the bandwidth charges that Amazon S3 imposes since, for most backup situations, the typical user isn’t going to have a lot of bandwidth usage after the initial upload. Making it cheap to get things stored and those monthly charges going is smart business. (S3, however, gets used for a lot of purposes beyond backup, so dropping bandwidth charges might not make sense for Amazon.)

Jungle Disk 2.0

Jungle Disk recently announced its 2.0 release. If you’re looking for a secure, online data backup system, this is hands down the best consumer-level one. It is also the best for price/reliability, since it uses Amazon’s S3 storage which is 10 cents/gigabyte for data uploads and 15 cents/gigabyte/month for storage. There are cheaper options out there that will promise unlimited data backup for very small monthly charges, but their reliability and long-term fortunes are, at best, suspect.

With Amazon and Jungle Disk, the only thing limiting online backups is the anemic upstream bandwidth that most of us have (I feel lucky to have 2 megabits/second up). Even at 2 Mbit/s, it takes a long time to upload the hundreds of gigabytes of data I need to back up.

Jungle Disk Is An Excellent Off-Site Backup Solution

Since data loss is never a good thing, I have a fairly robust backup system in place that involves backing up my most important data daily to external hard drives and then weekly to optical media. But being a bit paranoid, I’ve also always wanted some sort of way to back up my data to an online service so I could have a more robust, off-site backup solution as well.

There are a lot of companies in the online backup business these days, such as Mozy. But I have two sets of concerns about these services. First, as some users have complained, sites like Mozy are often not so great when it comes time to actually retrieve the data you’ve backed up. Second, Mozy and others use the sort of model that web hosts use: charging very low monthly prices and betting that most users will only use a fraction of the service, which will offset the folks who have lots of data to back up. Mozy, for example, advertises unlimited backups for $4.95/month.

I can see using those sorts of services for small backup sets, but I have about 300 GB of data to back up.

Since October, I’ve been using Jungle Disk for on-line backup. Jungle Disk is basically just a backup front end for Amazon’s S3 storage utility. So all of my files are actually residing on Amazon’s servers, and the Jungle Disk application takes care of uploading all of my files and keeping my backup up-to-date.

The application itself costs $20. There is an optional Jungle Disk Plus service, which uploads your files to a separate server before transferring them to Amazon; this allows things that S3 doesn’t support yet, such as large file resumes or block-level updates. That’s $1/month.

Storage charges, bandwidth and other charges are set and billed by Amazon. Amazon currently charges 15 cents/month/gigabyte for storage and 10 cents per gigabyte of bandwidth used in transferring files. So that 300gb of data I’ve currently got sitting on their servers is going to cost me $45/month. I consider that cheap given the peace of mind it gives me, but your mileage may vary.

All files are encrypted using 256-bit AES. I created a mind-bogglingly long passphrase. The Jungle Disk Plus service does offer some optional features, such as web-based access to your files, that, IMO, reduce the level of security too much, so I simply don’t use them.

One of the big issues, at least in the United States, is upload speed, since most of us do not have anywhere near the bandwidth leaving our homes that we do for downloads. My cable company recently began offering 10 Mbit/s down and 1 Mbit/s up, and I signed up for that before starting to back up all my files using Jungle Disk.

Connecting directly to Amazon’s S3, I typically got 720 kbit/s up, or about 324 megabytes per hour. After Jungle Disk launched their Jungle Disk Plus offering, most of the time my connection maxed out, so I was getting 990-1024 kbit/s up, or about 461 megabytes per hour. When you’ve got 300 GB of data to upload, that’s still a very long upload. In all, it took about two months for my backup job to finally upload everything. Now, Jungle Disk simply checks at 2 a.m. to see what new/changed files have been added and takes care of those automatically.
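For anyone who wants to run the same numbers for their own connection, here is a rough sketch of the conversion. It assumes the speeds above are kilobits per second and uses decimal units (1 MB = 1,000 KB, 1 GB = 1,000 MB); the function names are my own, and the result is a best case that assumes the link is saturated around the clock.

```python
def kbit_per_s_to_mb_per_hour(kbit_per_s):
    """Convert kilobits/second to megabytes/hour using decimal units."""
    kilobytes_per_s = kbit_per_s / 8
    return kilobytes_per_s * 3600 / 1000

def upload_days(total_gb, kbit_per_s):
    """Best-case days to upload total_gb at a constant, saturated speed."""
    hours = total_gb * 1000 / kbit_per_s_to_mb_per_hour(kbit_per_s)
    return hours / 24

for speed in (720, 1024):
    print(f"{speed} kbit/s: {kbit_per_s_to_mb_per_hour(speed):.0f} MB/hour, "
          f"{upload_days(300, speed):.1f} days for 300 GB")
```

The best case works out to roughly four to six weeks for 300 GB, so a real-world total of about two months is plausible once you allow for the connection not being busy uploading every hour of every day.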

There are a lot of things I really like about Jungle Disk, but probably its biggest advantage over all of the other services I’ve seen out there is how easy it is to retrieve files. Forget using some special software or going through some convoluted process. Jungle Disk simply maps your S3 storage as a network drive. From Windows Explorer I can navigate through my stored files, find the one I’m looking for, and simply copy and paste to retrieve it. The software does come with a more robust feature for retrieving large amounts of data, say in the case of a catastrophic local data loss, but it is nice to simply be able to retrieve data through a standard file manager interface.

Jungle Disk also does a nice job of archiving previous versions of changed files and deleted files. Archiving older versions of files is completely optional, however, and I suspect many people will disable it given the potential costs.

Overall, my experience with Jungle Disk/Amazon S3 has been outstanding. The only things keeping me from uploading all 20 TB of data I’ve got archived in one form or another are a) bandwidth and b) price. But over time, the bandwidth available to the home is only going to increase, while storage costs are only going to decline. For now, this is a great solution for off-site backup of data that would, at least in my case, be impossible to recreate.

Jungle Disk — Online Backups through Amazon’s S3

I am very meticulous about the way I back up my data. I run daily diff backups, followed by weekly complete backups of the 200 GB or so of data I consider absolutely critical (e.g., e-mail, pictures, World of Warcraft combat logs, etc.). I then burn that data to writable DVDs, which I neatly file in my basement, with a couple of copies kept at various locations other than my house in case of a catastrophe.

Still, I’d like even more redundancy and have recently been playing around with the Jungle Disk betas.

Jungle Disk is software for backing up important data online. It is a front-end program for Amazon’s S3 online storage service.

There are a couple of advantages that Jungle Disk has over other online backup systems.

The first is that it is very cheap. Amazon S3 charges just 20 cents per gigabyte to upload and then just 15 cents per gigabyte per month for storage. So if I upload my 200gb of data this month, it will cost me $40 for the bandwidth to transfer it and $30 to store it. Assuming I don’t make any changes, it will only cost me $30 per month to store the 200gb. If my hard drive crashed and I needed to download everything, again I’d be looking at $40 to transfer it all back to my computer.
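The arithmetic above is easy to redo for other data sizes. Here is a small sketch, with the per-gigabyte rates taken from the figures quoted in this post (20 cents to transfer, 15 cents per month to store) and passed as parameters since Amazon's prices change. The function name and dictionary keys are just my own labels.

```python
def s3_backup_costs(data_gb, transfer_per_gb=0.20, storage_per_gb_month=0.15):
    """Estimate S3 charges in dollars for a backup set of data_gb gigabytes.

    The default rates are the ones quoted in the post and will go out of date.
    """
    transfer = round(data_gb * transfer_per_gb, 2)
    return {
        "initial_upload": transfer,
        "storage_per_month": round(data_gb * storage_per_gb_month, 2),
        "full_restore": transfer,
    }

for item, dollars in s3_backup_costs(200).items():
    print(f"{item}: ${dollars:.2f}")
```

This ignores per-request charges and the extra transfer from ongoing changes, so treat it as a floor rather than a bill.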

The second thing is that while, as others have noted, Amazon pretty much makes no promises about availability in order to limit its liability, data on S3 is stored across multiple servers at multiple locations. Nothing is certain, but I’m betting Amazon S3’s reliability is significantly better than what I can achieve with my Zip backups burned to DVD.

Also, the Jungle Disk application promises to have a lot of key features necessary to make an online backup application useful, such as preserving modification and other file data on S3 to make syncing data from the local drive to the online system viable.

The latest beta also adds an option to encrypt files with 256 bit AES as they are uploaded. For me, encryption is the single most important element to any online backup system — if I can’t be reasonably certain that my files are going to be safe from prying eyes in the event a server is compromised, count me out. 256 bit AES combined with a strong password more than satisfies my concerns.

So what downsides, if any, does something like Jungle Disk have? Hmm... ever try to upload 200 GB worth of data over even a fast cable connection? Even at a consistent upload speed of 100 kilobytes per second, I’d be looking at roughly 555 hours to upload it all. Ugh. Once Jungle Disk is out of beta and has had a few months for the more security conscious to critique it, I’ll probably upload my data through the high speed connection at the university I attend rather than spend months doing so through the cable connection I’ve got.