Data at rest is free, data in motion costs money

14 Mar 2014

Update (14 Mar 2014): Andrew Weir pointed out that the price is per month, not per year – corrected accordingly.

The big news of the last day is that Google dropped its price for Drive storage to $9.99 per TB per month. Ex-Googler Sam Johnston says ‘So the price of storage is now basically free…’:

[Image: samj_storage_free]

It’s a good point. Buying a TB of storage as a good old-fashioned hard disk will cost me about $43 at today’s prices, before I consider putting it in a server, powering it or adding any redundancy.

My colleague Ryan points out that the real costs in the data centre come from memory and bandwidth, and I follow up with a point about IOPS:

[Image: samj_storage_free_]

If I buy a TB on Google Compute Engine (GCE) then I’ll pay $40/month – 4x as much. The reason it’s more expensive is that the GCE storage comes with a reasonably generous 300 IOPS allowance.

  • Having storage is approximately free – lots of TB/$.
  • Using storage costs money – limited IOPS/$.
  • Getting stuff onto and off of the service hosting storage also costs money – limited GB/$.
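The first two bullets can be put into rough numbers using only the prices quoted above. This is a back-of-the-envelope sketch, not a pricing model. Attributing the whole price gap between the two services to the IOPS allowance is an assumption made purely for illustration, and the constant and function names are invented for the example.

```python
# Prices as quoted in this post (March 2014), not current pricing.
DRIVE_USD_PER_TB_MONTH = 9.99   # Google Drive, 1TB tier
GCE_USD_PER_TB_MONTH = 40.00    # GCE storage, per TB
GCE_IOPS_ALLOWANCE = 300        # IOPS that come with that GCE TB


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive option costs."""
    return expensive / cheap


def implied_usd_per_iops_month(active: float, at_rest: float, iops: int) -> float:
    """Monthly price gap spread evenly over the IOPS allowance.

    Assumption: the entire gap pays for IOPS, which ignores every other
    difference between the two services.
    """
    return (active - at_rest) / iops


ratio = price_ratio(GCE_USD_PER_TB_MONTH, DRIVE_USD_PER_TB_MONTH)
per_iops = implied_usd_per_iops_month(
    GCE_USD_PER_TB_MONTH, DRIVE_USD_PER_TB_MONTH, GCE_IOPS_ALLOWANCE
)
print(f"GCE costs {ratio:.1f}x as much per TB per month")
print(f"Implied price of an IOPS: ${per_iops:.2f} per month")
```

Under that assumption the TB itself is the cheap part, and each IOPS works out at roughly ten cents a month.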

The reasons why cloud storage providers can offer large chunks of space for comparatively little money are twofold:

  1. People don’t use it – the moment I pass 100GB of stuff across all of my Google services I need to pay $9.99 rather than $1.99. The vast majority of people on the 1TB subscription tier will have just a bit more than 100GB of stuff on the service, so for most of them the cost is closer to $99.90 per TB per month.
  2. People don’t use it very much – most of the stuff on cloud storage services is there for archive purposes – photos you want to keep forever, files you might just need one day. It’s not worth our time to clean stuff up, so we keep everything just in case. The active subset of files in use on a daily basis is tiny – something that the peddlers of hierarchical storage management (HSM) have known for years.
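The $99.90 figure in the first point is simply the tier price divided by the space actually used. A minimal sketch of that arithmetic, assuming decimal units (1TB = 1000GB) and a hypothetical helper name:

```python
GB_PER_TB = 1000  # assumption: decimal units, which matches the $99.90 figure


def effective_usd_per_tb_month(tier_price_usd: float, stored_gb: float) -> float:
    """Monthly price paid per TB actually stored, not per TB offered."""
    if stored_gb <= 0:
        raise ValueError("stored_gb must be positive")
    return tier_price_usd / (stored_gb / GB_PER_TB)


# Someone on the $9.99 tier who has only just outgrown the 100GB tier.
print(f"${effective_usd_per_tb_month(9.99, 100):.2f} per TB per month")

# The same tier when it is completely full.
print(f"${effective_usd_per_tb_month(9.99, 1000):.2f} per TB per month")
```

The headline price only applies to a full tier; the less of it somebody uses, the more they are effectively paying per TB.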

There’s also a practical consideration in terms of using those storage services – it takes a very long time to upload 1TB even over a modern connection. A quick calculation suggests it would take me 7 months driving my (fibre to the cabinet) broadband connection 24×7 to upload a TB of stuff.
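For anybody who wants to redo that quick calculation for their own connection, here is a rough sketch. The post doesn’t state the upload speed, so the 1 Mbps used below is a placeholder rather than the author’s line speed. The sketch also assumes decimal units (1TB = 10^12 bytes), a 30-day month, and no protocol overhead or throttling. The function names are hypothetical.

```python
BYTES_PER_TB = 10**12      # assumption: decimal terabyte
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30        # assumption: a rough month


def upload_days(terabytes: float, sustained_mbps: float) -> float:
    """Days needed to upload the data at a constant rate, running 24x7."""
    bits = terabytes * BYTES_PER_TB * 8
    seconds = bits / (sustained_mbps * 1_000_000)
    return seconds / SECONDS_PER_DAY


def implied_mbps(terabytes: float, months: float) -> float:
    """Sustained rate implied by an upload that takes this many months."""
    bits = terabytes * BYTES_PER_TB * 8
    seconds = months * DAYS_PER_MONTH * SECONDS_PER_DAY
    return bits / seconds / 1_000_000


# Placeholder speed: 1 Mbps sustained upload.
print(f"1TB at 1 Mbps: {upload_days(1, 1.0):.0f} days")

# Working backwards from the seven months quoted above.
print(f"7 months for 1TB implies {implied_mbps(1, 7):.2f} Mbps sustained")
```

Under these assumptions a TB takes about three months at a sustained 1 Mbps, and a seven-month upload corresponds to a bit under half a megabit per second.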

The key word in the points above is People. Our capacity as individuals to use large quantities of storage is pretty limited.

Of course it’s different with machines, because my server in the cloud might be used by thousands of people, or it might be moving tons of files around just munging them from one data representation to another, or it might be harvesting data from all over the place. Servers can consume huge quantities of IOPS (without necessarily consuming huge quantities of storage), as I’ve proved to myself a number of times by breaking cloud servers through IO starvation.

For anybody thinking that they can just mount their Google Drive onto their cloud server: just try… It kinda works – in that you can copy stuff backwards and forwards – but it kinda doesn’t, because the performance sucks.

My cloud servers need their IOPS, but my cloud storage service really doesn’t – and that’s why data at rest is free, and data in motion costs money.



One Response to “Data at rest is free, data in motion costs money”

  1. Comparing the capital cost of a hard disk with the subscription cost of a Google TB is subtle. If I had a TB of archive I wanted to keep at rest then the expected life of the disk on the shelf would be at least 5 years. When you multiply the monthly cost out you have plenty of budget left for extras such as having that disk online (powered, bandwidth, USB caddy).

    Of course, you still probably have to factor your time in for free.

    The largest “motion cost” for average consumers will probably be in their broadband data allowance: the locality of this stuff matters almost as much as the advantage received by letting Google manage the actual space.

