Archive storage comes to Google Cloud: Will it give AWS and Azure the cold shoulder?
Fast retrieval and 'Bucket lock' security, but not the cheapest for cloud storage
Google has opened the freezer on general availability of its Archive class cloud storage, designed for data that is stored for more than a year and accessed less than once every 12 months.
Archive Storage, which was previewed in April last year, is the fourth tier of Google cloud storage options. Note that this is programmatic business storage, not general-purpose cloud storage like Google Drive. The other three tiers are Standard, Nearline and Coldline, differentiated by price, performance and frequency of access.
A distinctive feature of Archive Storage, compared to what is on offer elsewhere, is that it has instant access – Google says in the announcement that it has "millisecond latency". You will not want to use it for day-to-day storage though, since operations cost 10 times as much as with Standard storage, data retrieval is five times more expensive, and the minimum storage duration is one year. The API for using Archive Storage is the same as for other storage classes.
Archive Storage is for data stored for reasons such as compliance, research, media backup and video archives (surveillance data, for example), or as an additional backup. The data is encrypted, has optional geo redundancy, and can be protected against accidental or malicious deletion with a feature called Bucket Lock. Bucket Lock combined with a retention period makes files in effect read-only, which could be handy in this era of ransomware.
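For the curious, here is a minimal sketch of what that might look like in Python. It assumes the `client` argument is a `google.cloud.storage.Client` from the google-cloud-storage package; the function name, bucket name and default location are hypothetical. Note that locking a retention policy cannot be undone, so this is something to try on a throwaway bucket first.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def create_locked_archive_bucket(client, bucket_name, retention_days, location="US"):
    """Create an Archive-class bucket and lock a retention policy on it.

    `client` is assumed to be a google.cloud.storage.Client.
    Locking is irreversible: objects cannot be deleted or overwritten
    until they are older than the retention period.
    """
    # The storage class is just a bucket property; the calls are the same
    # ones used for STANDARD, NEARLINE and COLDLINE buckets.
    bucket = client.bucket(bucket_name)
    bucket.storage_class = "ARCHIVE"
    bucket = client.create_bucket(bucket, location=location)

    # The retention period is expressed in seconds.
    bucket.retention_period = retention_days * SECONDS_PER_DAY
    bucket.patch()

    # Reload the bucket so the lock request carries current metadata.
    bucket = client.get_bucket(bucket_name)
    bucket.lock_retention_policy()
    return bucket


# Hypothetical usage (needs the google-cloud-storage package and credentials):
# from google.cloud import storage
# create_locked_archive_bucket(storage.Client(), "example-archive-bucket", 365)
```

Passing the client in, rather than creating it inside the function, keeps the sketch easy to try against a stub before pointing it at a real project.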
How does Google's product compare with similar ones from Amazon Web Services and Microsoft Azure? It is complicated. AWS has six storage tiers, from S3 Standard to S3 Glacier Deep Archive. The two Glacier options have retrieval times ranging from "one minute to 12 hours", so they are not an exact equivalent, though the difference may not matter in many scenarios. Azure has just three tiers – Hot, Cool and Archive – though there is also a "Premium" option for Standard which need not concern us here. All three providers vary their costs according to whether or not you need redundancy across multiple regions. Prices can also vary according to region. Early deletion charges mean you pay for the full minimum storage duration even if you delete the data sooner.
|Cold storage|Prices in US$ / month|
|10,000 write operations|0.05|
|10,000 read operations|0.004|
|Early deletion charge:|180 days|
|10,000 write operations|0.1|
|10,000 read operations|5|
|Early deletion charge:|180 days|
|Early deletion charge:|365 days|
This table, note, is an over-simplification. The pricing is complex; operations are broken down more precisely than read and write; the exact features vary; and there may be discounts for reserved storage. Costs for data transfer within your cloud infrastructure may be less. The only way to get a true comparison is to specify your exact requirements, check whether the cloud provider can meet them, and work out the price for your particular case.
This table suggests, though, that GCP is not trying to be the cheapest. The long early deletion period and high data retrieval cost make it more expensive, but only by a little if you consider that with archive storage it is the monthly storage fee that matters most, since data retrieval should be rare.
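To see why the minimum duration and the monthly fee dominate, the arithmetic can be sketched in a few lines of Python. The function name, its parameters and every price in the example are hypothetical placeholders, not figures from any provider's price list, and the sketch ignores operation charges and network egress.

```python
def archive_cost(size_gb, months_kept, storage_price_gb_month, min_months,
                 retrieved_gb=0.0, retrieval_price_gb=0.0):
    """Rough total cost (US$) of keeping data in an archive tier.

    Early deletion is modelled as paying for the full minimum duration:
    data removed after 3 months from a 12-month tier is billed for 12.
    """
    billed_months = max(months_kept, min_months)
    storage = size_gb * storage_price_gb_month * billed_months
    retrieval = retrieved_gb * retrieval_price_gb
    return storage + retrieval


# Hypothetical prices, for illustration only.
PRICE_PER_GB_MONTH = 0.001
RETRIEVAL_PER_GB = 0.05

# 1 TB deleted after 3 months costs the same as 1 TB kept for the full year.
deleted_early = archive_cost(1000, 3, PRICE_PER_GB_MONTH, min_months=12)
kept_full_year = archive_cost(1000, 12, PRICE_PER_GB_MONTH, min_months=12)

# Two years of storage plus one 100 GB retrieval.
with_retrieval = archive_cost(1000, 24, PRICE_PER_GB_MONTH, min_months=12,
                              retrieved_gb=100, retrieval_price_gb=RETRIEVAL_PER_GB)

print(f"{deleted_early:.2f} {kept_full_year:.2f} {with_retrieval:.2f}")
```

Swapping in real per-GB prices and a 6-month minimum (roughly 180 days) against a 12-month one (365 days) gives a quick feel for how much the longer commitment matters for a given workload.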
What about storage on-premises? This is potentially much cheaper, but the cloud providers offer both resilience and off-site storage, which take effort to replicate with your own systems. All storage is vulnerable to physical decay, fire or other calamity, or misconfigured systems that mean the data stored is not what you hoped it was. ®