AWS stops some EC2 servers without warning

‘Retirement’ notifications not entirely accurate or reliable

If you’re thinking about heading to the cloud for über-reliability and an environment in which anything that happens to hardware is someone else’s problem, think again: Amazon Web Services sometimes replaces the hardware virtual servers run on and switches those servers off without elegant or accurate notifications of what’s about to happen.

AWS calls this ‘Instance retirement’ and makes it happen when the physical server an Elastic Compute Cloud (EC2) instance runs on ‘degrades’ and is in danger of experiencing hardware failure. That is very useful indeed, but also a little worrying, as the cloud company does not always retire instances elegantly.

Social marketing analysis firm awe.sm recently blogged about the problem here, describing the symptoms as follows:

“Virtual hardware doesn’t last as long as real hardware. Our average observed lifetime for a virtual machine on EC2 over the last 3 years has been about 200 days. After that, the chances of it being ‘retired’ rise hugely. And Amazon’s ‘retirement’ process is unpredictable: sometimes they’ll notify you ten days in advance that a box is going to be shut down; sometimes the retirement notification email arrives 2 hours after the box has already failed.”

A trawl through AWS’ support forums suggests that the company isn’t switching off servers without notifications every day, but threads pop up quite regularly in which users complain about servers disappearing.

One such thread (which we shan’t link to because it includes some personal details), complains that “One of our EC2 instance[s] hung and retired an hour before receiving notification from AWS.”

Such an incident need not be especially painful if the instance in question is backed by Elastic Block Store (EBS): users with that arrangement need only stop and start the instance and it will resume operations on new hardware (a simple reboot keeps it on the same host). Performing that action takes mere minutes, so if the hang didn't interrupt important operations the disruption would be slight.

But users whose instances boot from an instance store-backed Amazon Machine Image (AMI) have a harder task before them. AWS emails on the topic say “If your instance's root device is an instance store, it will be terminated after the retirement date. We recommend that you launch a replacement instance from your most recent AMI and migrate all necessary data to the replacement instance before this time.” It’s also possible to convert instance store-backed instances to EBS-backed ones, but that’s a bit of a chore, as detailed here.
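
The split between the two cases can be sketched in a few lines of Python. This is an illustration only: it assumes an instance description shaped like an entry returned by EC2's DescribeInstances call, whose `RootDeviceType` field is either `ebs` or `instance-store`. The function name and the action labels are our own inventions, not anything AWS provides.

```python
def retirement_action(instance: dict) -> str:
    """Suggest a response to a retirement notice from the root device type.

    `instance` is assumed to look like one instance entry from EC2's
    DescribeInstances output. The returned labels are hypothetical.
    """
    root = instance.get("RootDeviceType")
    if root == "ebs":
        # EBS-backed: stopping and then starting moves it to new hardware.
        return "stop-and-start"
    if root == "instance-store":
        # Instance store-backed: the instance will be terminated, so a
        # replacement must be launched from an AMI and the data migrated.
        return "relaunch-from-ami"
    return "unknown"


print(retirement_action({"InstanceId": "i-0example", "RootDeviceType": "ebs"}))
```

The point of keeping the decision separate from any API call is that it can be checked against saved instance descriptions before anyone lets it near a production account.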

The need to revert to backups or convert to EBS instances is taking some users by surprise as they don’t have backups, as this thread shows.

Of course those without backups have only themselves to blame. EC2 users are also notified of imminent retirement by the AWS console, so again there’s an element of personal responsibility that needs to be considered here ... although the fact that some of AWS’ retirement notifications seem not to be timely is bothersome.
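
Those who would rather not rely on email or the console can poll for scheduled events themselves. The sketch below assumes a response shaped like the output of EC2's DescribeInstanceStatus call, in which each instance may carry an `Events` list and a retirement shows up with the code `instance-retirement`. The helper name and the sample data are ours; with boto3 installed, the real response would come from something like `boto3.client("ec2").describe_instance_status(IncludeAllInstances=True)`. Given awe.sm's experience, an empty result should not be read as a guarantee.

```python
from datetime import datetime, timezone

RETIREMENT_CODE = "instance-retirement"


def find_retirements(status_response: dict) -> list:
    """Return (instance_id, not_before) pairs for scheduled retirements.

    `status_response` is assumed to follow the DescribeInstanceStatus
    response layout; instances with no events are simply skipped.
    """
    found = []
    for status in status_response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            if event.get("Code") == RETIREMENT_CODE:
                found.append((status["InstanceId"], event.get("NotBefore")))
    return found


# Hypothetical sample data, not taken from a real account.
sample = {
    "InstanceStatuses": [
        {
            "InstanceId": "i-0aaa",
            "Events": [
                {
                    "Code": RETIREMENT_CODE,
                    "NotBefore": datetime(2014, 8, 1, tzinfo=timezone.utc),
                }
            ],
        },
        {"InstanceId": "i-0bbb", "Events": [{"Code": "system-reboot"}]},
        {"InstanceId": "i-0ccc"},
    ]
}

for instance_id, not_before in find_retirements(sample):
    print(instance_id, not_before)
```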

It’s also worth noting that the retirement process isn’t well-documented – we could find only seven mentions of it in the AWS support database and it doesn’t get a mention in the EC2 FAQ, although the terms and conditions for AWS go out of their way to point out the service can experience interruptions.

None of which means that AWS is placing users in peril. But the fact that cloud servers can be halted without prior notification and in ways that require a fair bit of work to repair is surely something to take into account when considering just what the cloud means for your operations. ®
