Original URL: http://www.theregister.co.uk/2013/03/27/storagebod_petascale_archives/

I've got a super free multi-petabyte storage box for you: /dev/null

Tape archives can't grow forever, so try our solution*

By StorageBod

Posted in Storage, 27th March 2013 11:04 GMT

As data volumes increase in all industries and the challenges of information management continue to grow, we look for places to store our hoarded bytes. Inevitably the subject of archiving and tape comes up.

Tape is the cheapest place to archive data by some way; my calculations put its four-year cost at something in the region of five to six times lower than the cheapest commercial disk alternative. However, tape’s biggest advantage is almost its biggest problem: it is considered to be cheap, and hence for some reason no one factors in the long-term costs.

Archives by their nature live for a long time; more and more companies are talking about archives that will grow and exist forever. Firms no longer seem to be able to categorise data into separate "keep" and "delete" piles. IT bods face exponential data growth. The industry is forced to tackle generally bad big-data management. And so multi-year, multi-petabyte archives will eventually become the norm for many.

This could spell death for the tape archive as it stands, or it will necessitate some significant changes in both user and vendor behaviour. A ten-year archive will see at least four refreshes of the LTO standard on average; this means that your latest tape technology will not be able to read your oldest tapes. It is also likely that you are looking at some kind of extended maintenance, and the associated costs, for your oldest tape drives; they will certainly be past End of Support Life. Media may be certified for 30 years; drives aren’t.

Migration will become a way of life for these archives, and it is migration that will be the major challenge for storage teams and anyone maintaining an archive at scale.

It currently takes 88 days to migrate a petabyte of data from LTO5 to LTO6; this assumes 24-7 operation, no drive issues, no media issues and a pair of drives to migrate the data. You will also be loading about 500 tapes and unloading about 500 tapes. You can cut this time by putting in more drives, but your costs will soon start to escalate as SAN ports, servers and peripheral infrastructure mount up.

And then all you need is for someone to recall the data whilst you are trying to migrate it; 88 days is extremely optimistic.
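
For anyone who wants to redo the sums for their own archive, here is a back-of-the-envelope sketch of the arithmetic behind a figure like 88 days. The 140 MB/s rate is an assumption (roughly LTO5's published native speed, with the reading drive taken as the bottleneck); the article does not say which rate was used, and the function names are purely illustrative. It ignores tape loads, unloads, recalls and failures, so treat the result as a best case.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 10**15  # decimal petabyte, in bytes


def migration_days(data_bytes, bytes_per_second, drive_pairs=1):
    """Best-case days to stream data_bytes, running 24-7.

    Assumes each reader/writer pair sustains bytes_per_second and that
    pairs scale linearly, which real SAN and server limits may prevent.
    """
    seconds = data_bytes / (bytes_per_second * drive_pairs)
    return seconds / SECONDS_PER_DAY


def implied_rate(data_bytes, days, drive_pairs=1):
    """Sustained bytes/second per drive pair implied by a quoted duration."""
    return data_bytes / (days * SECONDS_PER_DAY * drive_pairs)


if __name__ == "__main__":
    assumed_rate = 140 * 10**6  # assumption: ~140 MB/s sustained read

    for pairs in (1, 2, 4):
        days = migration_days(PETABYTE, assumed_rate, pairs)
        print(f"{pairs} drive pair(s): {days:.1f} days")

    # Work backwards from the article's 88 days to a sustained rate.
    rate = implied_rate(PETABYTE, 88)
    print(f"88 days for 1 PB implies about {rate / 1e6:.1f} MB/s per pair")
```

At the assumed rate a single pair comes out at roughly 83 days, and working backwards from 88 days gives a sustained rate of about 132 MB/s, so the quoted figure is in the right region for drives that never stop streaming. Anything that interrupts the stream only pushes the number up.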

Of course a petabyte seems an awful lot of data, but archives of a petabyte-plus size are becoming less uncommon. The vendors are pushing the value of data, so no one wants to delete what is potentially a valuable asset. In fact, working out the value of an individual datum is extremely hard, and hence we tend to place the same value on every byte archived.

So although tape might be the only economical place to store data today, as data volumes grow it becomes less viable as a long-term archive, unless it is a write-once, read-never (and I mean never) archive… and if that is the case, perhaps, in Unix parlance, /dev/null is the only sensible place for your data.

But if you think your data has value, or more importantly your C-level management thinks that your data has value, there’s a serious discussion to be had - before the situation gets out of hand. Just remember, any data migration which takes longer than a year will most likely fail. ®

* Subject to terms and conditions. Your mileage may vary. This is a joke. If you file your bytes into the void, don't come crying our way.