Microsoft's own code should prevent an Azure SSL fail: So what went wrong?

Cloud service fell over despite cert automation in Server 2012

Sysadmin blog

Server 2012 is the Microsoft operating system that, in my opinion, makes cloud computing a reality. As far as I am concerned it is as big a leap over Server 2008 R2 as that OS was over Server 2003. With it you can build anything from a small cluster to a service as big as Microsoft's own Azure platform.

Which is why I am completely baffled as to how it is possible that Azure was knocked offline by last week's SSL cock-up.

Let me start out by saying that I have the utmost sympathy – and respect – for the poor bastards working behind the scenes to fix this particular embarrassing incident. I'm not too proud to admit that I have done the exact same thing; like Microsoft, I've accidentally let an HTTPS certificate lapse more than once.

I could throw up excuses such as the ever infamous "I was too busy". I could even hand-wave at Apache's maddening certificate management (which makes it easy to miss a node) or RapidSSL's long delays in verifying the certs.

I could make those excuses, but I won't; none of them are valid. I screwed up because I was lazy, and any users trying to access an Outlook Web App late at night last Christmas (and the one before) were terribly inconvenienced for nearly six hours. The bit that bothers me about this snafu is that Microsoft doesn't even get to try those excuses. Not only can Microsoft sign its own damned certs, Server 2012 makes this whole process so simple web administrators will weep.

Microsoft has code to save itself from this sort of blunder

One of the features buried inside the release notes for Server 2012 is Centralized SSL Certificate (CSC) management. You can run a farm of up to 10,000 IIS web server nodes off a single CSC server: each node can be directed to contact that server automatically to receive its certs, and the server gives you a reasonably simple interface from which to direct a symphony of re-validation.

Considering everything in Microsoft's new cloudy world is PowerShell scriptable, you can even stagger renewals so that no one certificate expiration can tank everything. Microsoft doesn't have to worry about licensing its own kit, so how exactly did this happen?
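To make the staggering idea concrete, here is a minimal sketch in Python rather than PowerShell. The host names, the `stagger_renewals` helper and the 28-day window are all made up for illustration; none of this is Microsoft's tooling or a CSC API. It only shows the arithmetic of spreading renewal dates out so they don't all land on the same day.

```python
from datetime import date, timedelta


def stagger_renewals(hosts, start, window_days):
    """Assign each host a renewal date spread evenly across the window.

    Hypothetical helper: hosts is a list of names, start is the first
    renewal date, window_days is how many days to spread renewals over.
    """
    if not hosts:
        return {}
    step = window_days / len(hosts)  # days between consecutive renewals
    ordered = sorted(hosts)  # fixed order so reruns give the same plan
    return {
        host: start + timedelta(days=int(i * step))
        for i, host in enumerate(ordered)
    }


if __name__ == "__main__":
    # Made-up farm of four nodes, renewals spread over four weeks.
    plan = stagger_renewals(["web1", "web2", "web3", "web4"], date(2013, 3, 1), 28)
    for host, when in plan.items():
        print(host, when.isoformat())
```

With four hosts and a 28-day window the renewals fall a week apart, so a botched renewal takes out one node's certificate rather than the whole farm's.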

Even if it was the cryptographic certificate upstream from the end nodes that expired, why wasn't the CSC server auto-renewing from elsewhere? Since Redmond can sign its own certs, then between CSC and Server 2012's more traditional certificate manager you could have a great big circle jerk with servers auto-renewing in an endless frolic of crypto-hedonism.
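The point about upstream certificates can be sketched in a few lines: a chain is only good until the first certificate in it expires, wherever that certificate sits. The chain below is invented data with invented dates, and `chain_valid_until` and `is_chain_valid` are hypothetical names, not part of any real certificate library. A real check would read the expiry dates from the certificates themselves.

```python
from datetime import datetime


def chain_valid_until(chain):
    """Return the (name, expiry) pair of the certificate that expires first.

    chain is a list of (name, not_after) pairs, hypothetical data standing
    in for a leaf certificate and the certificates above it.
    """
    if not chain:
        raise ValueError("empty certificate chain")
    return min(chain, key=lambda entry: entry[1])


def is_chain_valid(chain, now):
    """A chain is usable only while every certificate in it is unexpired."""
    _, earliest = chain_valid_until(chain)
    return now < earliest


if __name__ == "__main__":
    # Invented example: the leaf is fine, the intermediate is the weak link.
    chain = [
        ("leaf", datetime(2014, 1, 1)),
        ("intermediate", datetime(2013, 6, 1)),
        ("root", datetime(2020, 1, 1)),
    ]
    name, expiry = chain_valid_until(chain)
    print("first to expire:", name, expiry.isoformat())
```

Monitoring only the end-node certs would miss the intermediate in this example, which is the scenario the paragraph above describes.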

So let's set this aside for the moment and assume that for whatever reason someone somewhere decided that it was vitally important to manually update a certificate along the chain. What could have prevented them from doing so? Maybe it was the data centre edge blacklist that Office 365 users can't control. Nah; you'd think that the cert guy would have an internal staff list that would tell him where to send the bottle of scotch to make sure that the people who try to send him email actually can.

Still working on the assumption that an expired cert was at fault, last I checked, Microsoft had some money lying around, so if it was getting the certificate verified by an external entity it should have been possible to pay the bill. Laziness? I doubt it. Surely Microsoft pays its systems administrators enough to actually care about their job. It is highly unlikely to be the fault of any one person not pulling the trigger on the update.

That leaves me with two remaining possibilities. The first: Microsoft isn't using its own rather excellent technology to handle these certs. I'm not fully sure of the underpinnings of Azure; does it run on Server 2012? Bing.com does. Even if Azure isn't using off-the-shelf Windows Server, there would be a delicious irony if Microsoft – enthusiastic player of the constant, cacophonous drumbeat of "upgrade for your own good" – had failed to take advantage of technology it invented to solve this exact problem.

I find it hard to buy that Microsoft doesn't have a version of CSC for its Azure infrastructure, leaving me with only one solid hypothesis about Azure's outage. I believe Microsoft is coming face to face with the fact that when pretty much all automation relies on scripting - using PowerShell or otherwise - a simple change to one line of code in one script can topple the mightiest cloud. Even one built on a foundation as solid as Server 2012.

I have a lot of respect for the systems administrators running Azure. That's a big, complicated job with an enormous amount of pressure. Right now, they are probably getting emotionally flayed alive - I won't envy them for the next few weeks. I would, however, like to offer a suggestion to Microsoft - especially the script-all-the-things happy server division. Pick up the phone and call Luke Kanies over at PuppetLabs.

Ask him nicely for an education on why enforced states are better than scripts. Learn from those who have solved the problem of leaving the reputation of their flagship cloud service hanging on a single forgotten semicolon. ®
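For anyone wondering what "enforced states" means in practice, here is a toy sketch. It is not how Puppet is implemented, and every name in it (`plan_changes`, `converge`, the node names and certificate IDs) is hypothetical. The idea it illustrates: you declare what each node should look like, and a loop compares that against reality and repairs only the drift, so running it again changes nothing.

```python
def plan_changes(desired, actual):
    """Compare desired and actual state and list what must change.

    Both arguments map a node name to the certificate ID it should have or
    currently has. All names and IDs here are hypothetical.
    """
    return {
        node: cert
        for node, cert in desired.items()
        if actual.get(node) != cert
    }


def converge(desired, actual):
    """Bring actual in line with desired and return the changes applied."""
    changes = plan_changes(desired, actual)
    actual.update(changes)  # stand-in for really installing a certificate
    return changes


if __name__ == "__main__":
    desired = {"web1": "cert-new", "web2": "cert-new"}
    actual = {"web1": "cert-old"}  # web2 has no certificate at all
    print("first run:", converge(desired, actual))
    print("second run:", converge(desired, actual))
```

A one-shot script describes steps and trusts that they ran; a loop like this describes the end result and keeps checking it, so a missed node gets picked up on the next pass.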
