Hosts with the mosts: Getting to grips with SLAs for the cloud

Hey baby, I’m your telephone man

When email is down, businesses cease to function. If the email goes down due to a mishandling of the Exchange server, the appropriate sysadmin is found and duly berated.

Finger-pointing exercises are less well defined when email stops working because Gmail is down. Again. In this case the sysadmin in question bears no direct responsibility for the issue. His burden is the indirect responsibility of having recommended Google’s business-class email services.

A sysadmin caught in this particular trap can do little. He does not control the servers in question, nor is there any way to ensure that an appropriate Google sysadmin is on the job. Google quite famously doesn’t take phone calls. A calm reminder about how to make use of whatever backups and contingencies exist is all a sysadmin in such a situation can muster. One can only trust that Google will live up to its Service Level Agreement (SLA).

It is perhaps unfair to single out Google for this theoretical exercise; it has proven able to live up to its SLA. It offers a massive array of services with outages so short and infrequent that each one is news. Google has become the poster child for upholding a punishing SLA.

It is also the poster child for “not getting it” regarding customer service. Microsoft earns some points over Google here; though limited, it offers phone and live chat and even Twitter support for many of its services.

Amazon offers yet another approach; you may pay for whichever level of support you feel appropriate. One-on-one online support is available starting with the basic support package. Phone support starts at $400 and goes up from there. Still other hosted providers seem to treat support as nothing more than a public relations requirement.

Trust in me

Regardless of how well the technical requirements of an SLA are met, there is a sense of helplessness experienced by those asked to trust in that SLA. People aren’t very good at bearing statistical uptime in mind when a critical service goes down at an inconvenient time. The quality and type of customer service are an important – though often neglected – consideration in any hosted service SLA.

Such feelings may not be entirely rational, but they are human. People need to feel in control. When something goes wrong, it is simply not enough to fix it quickly. We require reassurance that the problem is acknowledged and being worked on. A timeframe for repairs is vital; downtime costs money and past a certain point backup plans need to be engaged.
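To make the statistical side concrete, here is a minimal sketch of the arithmetic behind an uptime promise. The uptime targets, the 30-day month and the hourly cost figure are illustrative assumptions, not numbers taken from any provider's SLA; the function names are likewise hypothetical.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime an uptime target permits over the given period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def downtime_cost(minutes_down: float, cost_per_hour: float) -> float:
    """Money lost during an outage, given an assumed hourly cost of downtime."""
    return minutes_down / 60 * cost_per_hour


if __name__ == "__main__":
    assumed_cost_per_hour = 5000.0  # illustrative figure only
    for target in (99.0, 99.9, 99.99):  # illustrative uptime targets
        budget = allowed_downtime_minutes(target)
        exposure = downtime_cost(budget, assumed_cost_per_hour)
        print(f"{target}% uptime: {budget:.1f} min/month allowed, "
              f"about {exposure:.0f} at risk")
```

Running the numbers ahead of time gives a rough threshold for when to stop waiting on the provider and engage the backup plan.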

Some support issues can legitimately be solved through automation. Service dashboards let customers know that an outage is known and being worked on, even in cases where live support is not offered. Google and Microsoft both offer serviceable examples. Google Apps has a status page for select applications. Microsoft’s Windows Live services are similarly monitored. Microsoft’s Azure cloud also has a comprehensive offering.

How these status pages are handled is critical. Consider both Google’s approach to an incident on 2011-03-09 and Microsoft’s approach to an incident on 2011-03-16. In both cases, incidents were handled with professionalism. As soon as the support desk became aware of the incident, it was reflected on the status page. Users who knew about the status pages – and checked them – were kept in the loop throughout both outages.

As professional as this approach is, its real-world serviceability has limits. Automated support is completely inadequate when downtime is costing your business thousands – or millions – of dollars an hour.

Hosted cloud services are risky. The right SLA is critical to the success of hosted services in your organisation. Selecting a provider with the right mix of support options is as vital as selecting one that can deliver on its promises of high uptime.

Trevor Pott is a sysadmin for a small-ish company based in Edmonton, Canada.
