Did you bet the farm on Amazon's cloud? Time to wean yourself off
Dual-provider strategies are the future
Comment
Oracle is making hay over last weekend's mega six-hour Amazon Web Services (AWS) cloud outage. "You get what you pay for," tweeted Oracle's Phil Dunn, with the caveat that all views are his and don't necessarily reflect those of Oracle. But you get the point.
Yes, Amazon's been left with egg on its face and rivals will be exploiting the giant's stumble. There's oeuf, too, on the plate of some of the web's most hallowed – in Wall St and Silicon Valley circles at least – names.
Netflix, Tinder, Airbnb, and IMDb were all down or sputtering.
If anybody is venerated it's Netflix, arguably AWS's most high-profile customer – a reference customer, no less, served up on the AWS site as an example of just how far you can go if you stake absolutely everything on AWS.
Without AWS, it's doubtful Netflix would exist as we know it. Netflix would have had to continue investing in its own data center build program and the rocket scientist brains to build the elastic magic. If it had, perhaps Netflix would now be Amazon, selling its spare capacity and expertise to others. Instead, it outsourced the job to Amazon.
In another time, people would have cautioned strongly against relying on a single supplier for your critical IT needs. On the web, that caution has been thrown out of the window.
But doesn't Netflix learn? It moved to AWS after a staggering outage at its own data center in 2008, at a time when it also faced the prospect of huge growth in its business. Netflix decided it was best to place its faith in the pros. And yet Netflix still went down with AWS in a big way in April 2011. Now Netflix is reported to be closing its last self-operated data center in favor of AWS.
But hang on. Amazon isn't the only one that should hang its head in shame here. There's plenty of egg for other faces in this game – Microsoft with Azure and Office 365, Salesforce, Google – all of which have gone down over the last few years for periods ranging from a few hours to entire days. To them it's nothing – a mere statistical rounding error in their pledge to maintain 99.999 per cent uptime. But to those on the sharp end, it's lost business.
Netflix garners incredulous headlines just because couch potatoes must do the unthinkable and change channels or go outside. But to the broader mass, to thousands of businesses, it means literally not being able to do business: no ERP to manage production or suppliers, no CRM to run sales or talk to customers, no email to talk to colleagues. Your only option is to twiddle your thumbs between hitting refresh on the status page. That statistical rounding error suddenly looks big when you're up close to it.
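Just how big? A back-of-the-envelope sketch helps. It takes the 99.999 per cent figure above at face value and assumes a 365-day year with one six-hour outage and nothing else going wrong. The function names are ours, for illustration only, and real service agreements define and measure uptime in their own ways.

```python
# Rough uptime arithmetic. Assumptions (ours, not any provider's):
# a 365-day year and a single six-hour outage with no other downtime.
HOURS_PER_YEAR = 365 * 24
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year that an uptime target leaves room for."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


def achieved_uptime_percent(outage_hours: float) -> float:
    """Uptime for the year if the only downtime is one outage of this length."""
    return 100 * (1 - outage_hours / HOURS_PER_YEAR)


budget = allowed_downtime_minutes(99.999)   # yearly allowance at five nines
actual = achieved_uptime_percent(6)         # the year after one six-hour outage
years_of_budget = (6 * 60) / budget         # how many years of allowance were spent

print(f"Five-nines allowance: {budget:.2f} minutes a year")
print(f"Uptime after a six-hour outage: {actual:.2f} per cent")
print(f"Years of allowance used up: {years_of_budget:.0f}")
```

On those assumptions, five nines leaves a little over five minutes of slack a year. One six-hour outage burns through well over half a century's worth of it.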
Customers have been handing this infrastructure over to public cloud providers, having convinced themselves that the providers are the ones who know best. Uptime and servers are what the providers do, so they can run this stuff better than you.
Which makes it more confounding when the experts manage to screw up planned maintenance.
Or when, as in the case of AWS at the weekend, Amazon fails to read the growing evidence of the popularity of its own DynamoDB NoSQL database service. Demand for Global Secondary Indexes put too much strain on the metadata servers, forcing systems to stall. Worse, Amazon hadn't anticipated that this could be a problem, so its monitoring service wasn't set up to fully register it as a failure.