How Google taped up its email outage wounds
Is there still a role for the reels ...
Comment: Does tape have a role in cloud computing?
Ask cloud evangelists that question and they sit back, purse their lips, and say, "No, of course not ... but ..." The thing is, they tend to come from disk-biased storage suppliers or consultancies and are in love with virtualisation: the placing of abstraction layers between server apps and server hardware, and between server apps and disk storage hardware.
Yet they have to admit tape storage is cheap, long-lasting and reliable, more so than deduplicated disk drive arrays pretending to be tape libraries. The financial numbers go in tape's favour but the emotional attachment to disk products works against it.
Arch cloud IT supplier Google found it necessary to rely on tape archival backup when there was a Gmail outage in late February. Google's Ben Treynor, an engineering VP, blogged about this and talked about software bugs affecting "several copies of the data". In other words, trashed data was "snapshotted", replicated and so on, propagating the original fault.
Fortunately for users: "To protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs."
Well yes, of course, we all know that. There was also a dig at the speed of restoring from tape: "But restoring data from them also takes longer than transferring your requests to another data centre, which is why it's taken us hours to get the email back instead of milliseconds."
The basic story here is that tape archives got cloud evangelist Google out of a hole. Without tapes, never mind the restore speed issue, there would have been nothing to restore, and users would have lost email data.
Tape, with its 30-year lifespan, is the archive backstop for lost or duff data on disk. It's also a lot more cost-effective than disk for such use, a point made again and again by tape automation vendors.
What do tape automation vendors HP, Quantum and Spectra Logic say about cloud use of tape?
Spectra Logic's VP of product management and marketing, Molly Rector, says: "Spectra Logic views cloud providers as customers, not competitors. Tape will be the strong, silent partner to the cloud – very much present and in use, just completely transparent to the end-user."
What markets in the cloud does she see for tape?
"Public Clouds are most likely to be utilised by SMBs (small and medium businesses), primarily for economic reasons. Because of this imperative on public cloud providers to keep costs to a minimum, tape is likely to be the largest storage repository in these offerings because of the significantly lower cost compared to disk.
"Hybrid Clouds are an interesting proposition [but] we don’t expect to see mid-market and enterprise customers adopting it. This approach may catch on at the lower end of the market where the financial benefits may outweigh concerns about regulation or security.
"Private clouds are just another name for modern internal data centres. Regardless of the effect that server virtualisation, virtualised tier 1 storage and even network virtualisation has on the makeup of a data centre, backup and archiving are still major imperatives, and ones in which tape has an integral role to play."
Rector also provided some colour on the cost advantages of tape versus disk:
"At the exabyte level, data deduplication may provide a 90 per cent reduction in total storage, but the annual costs of running 100TB of deduplicated storage is still going to be in the tens to hundreds of thousands of dollars just for heating and cooling annually.
"That same exabyte’s worth of data can sit on idle tape cartridges and consume absolutely no power for the tape itself unless the data needs to be accessed, and very little power to maintain, monitor and cool the tape system as a whole."
We then asked Rector if tape should be used for backup or archive. She replied:
"Tape and disk have different strengths in terms of speeds, capacities, and access methods, so depending on the configuration, disk or tape may be faster. Restoring a system is typically associated with backup and not archive, and tape is very fast at streaming that data back to the system. Spectra’s view is that tape is the right choice for archiving; and that disk is typically better suited for backup."
With a nod to the Gmail outage, she continued:
"Tape also provides a cloud service provider with an added layer of security, as an offline copy of data is the only copy that is 100 per cent safe from a malicious attack [or self-imposed software update data corruption]."
In the light of the Gmail outage, that point should resonate strongly. One last point from Spectra: if customers' cloud data is on standard-format tape, they can have the tapes shipped to them if they want to change cloud service provider or exit the cloud. That's just not feasible with disk arrays.