Microsoft's own code should prevent an Azure SSL fail: So what went wrong?

Cloud service fell over despite cert automation in Server 2012

Sysadmin blog

Server 2012 is the Microsoft operating system that, in my opinion, makes cloud computing a reality. As far as I am concerned it is as big a leap over Server 2008 R2 as that OS was over Server 2003. With it you can build anything from a small cluster to a service as big as Microsoft's own Azure platform.

Which is why I am completely baffled as to how it is possible that Azure was knocked offline by last week's SSL cock-up.

Let me start out by saying that I have the utmost sympathy – and respect – for the poor bastards working behind the scenes to fix this particular embarrassing incident. I'm not too proud to admit that I have done the exact same thing; like Microsoft, I've accidentally let an HTTPS certificate lapse more than once.

I could throw up excuses such as the ever-infamous "I was too busy". I could even hand-wave at Apache's maddening certificate management (which makes it easy to miss a node) or RapidSSL's long delays in verifying the certs.

I could make those excuses, but I won't; none of them are valid. I screwed up because I was lazy, and any users trying to access an Outlook Web App late at night last Christmas (and the one before) were terribly inconvenienced for nearly six hours. The bit that bothers me about this snafu is that Microsoft doesn't even get to try those excuses. Not only can Microsoft sign its own damned certs, Server 2012 makes this whole process so simple that web administrators will weep.

Microsoft has code to save itself from this sort of blunder

One of the features buried inside the release notes for Server 2012 is Centralized SSL Certificate (CSC) management. You can run a farm of up to 10,000 IIS web server nodes off a single CSC server; each node can be directed to fetch its certs automatically from that one server, which gives you a reasonably simple interface to direct a symphony of re-validation.

Considering everything in Microsoft's new cloudy world is PowerShell scriptable, you can even stagger renewals so that no one certificate expiration can tank everything. Microsoft doesn't have to worry about licensing its own kit, so how exactly did this happen?
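To make the staggering idea concrete, here is a minimal sketch in Python rather than PowerShell. Everything in it is an assumption made for illustration: the certificate names, the dates, the 30-day lead time and both function names are invented, and none of it reflects how Microsoft or CSC actually schedules anything. It simply gives each certificate a renewal date comfortably ahead of its expiry and nudges any clashes a day earlier, so no two renewals land on the same day.

```python
from datetime import date, timedelta


def staggered_renewal_dates(expiries, lead_days=30):
    """Map each certificate name to a renewal date.

    Every renewal falls at least `lead_days` before that certificate's
    expiry, and no two certificates are renewed on the same day.
    """
    schedule = {}
    taken = set()
    # Handle the soonest-expiring certificates first.
    for name, expires in sorted(expiries.items(), key=lambda item: item[1]):
        renew_on = expires - timedelta(days=lead_days)
        # On a clash, move one day earlier; this never eats into the lead time.
        while renew_on in taken:
            renew_on -= timedelta(days=1)
        taken.add(renew_on)
        schedule[name] = renew_on
    return schedule


def due_for_renewal(schedule, today):
    """Return, in name order, the certificates whose renewal date has arrived."""
    return [name for name, day in sorted(schedule.items()) if day <= today]


# Hypothetical certificates and expiry dates, invented for this example.
expiries = {
    "www": date(2030, 2, 22),
    "api": date(2030, 2, 22),
    "storage": date(2030, 3, 1),
}
schedule = staggered_renewal_dates(expiries)
due_now = due_for_renewal(schedule, date(2030, 1, 23))
```

With these made-up inputs, "www" and "api" expire on the same day but are renewed a day apart, which is the whole point: a single bad renewal run cannot take every certificate down at once.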

Even if it was the cryptographic certificate upstream from the end nodes that expired, why wasn't the CSC server auto-renewing from elsewhere? Since Redmond can sign its own certs, then between CSC and Server 2012's more traditional certificate manager you could have a great big circle jerk with servers auto-renewing in an endless frolic of crypto-hedonism.

So let's set this aside for the moment and assume that for whatever reason someone somewhere decided that it was vitally important to manually update a certificate along the chain. What could have prevented them from doing so? Maybe it was the data centre edge blacklist that Office 365 users can't control. Nah; you'd think that the cert guy would have an internal staff list that would tell him where to send the bottle of scotch to make sure that the people who try to send him email actually can.

Still working on the assumption that an expired cert was at fault, last I checked, Microsoft had some money lying around, so if it was getting the certificate verified by an external entity it should have been possible to pay the bill. Laziness? I doubt it. Surely Microsoft pays its systems administrators enough to actually care about their jobs. It is highly unlikely to be the fault of any one person not pulling the trigger on the update.

That leaves me with two remaining possibilities. The first: Microsoft isn't using its own rather excellent technology to handle these certs. I'm not fully sure of the underpinnings of Azure; does it run on Server 2012? Bing.com does. Even if Azure isn't using off-the-shelf Windows Server, there would be a delicious irony if Microsoft – enthusiastic player of the constant, cacophonous drumbeat of "upgrade for your own good" – had failed to take advantage of technology it invented to solve this exact problem.

I find it hard to buy that Microsoft doesn't have a version of CSC for its Azure infrastructure, leaving me with only one solid hypothesis about Azure's outage. I believe Microsoft is coming face to face with the fact that when pretty much all automation relies on scripting – using PowerShell or otherwise – a simple change to one line of code in one script can topple the mightiest cloud. Even one built on a foundation as solid as Server 2012.

I have a lot of respect for the systems administrators running Azure. That's a big, complicated job with an enormous amount of pressure. Right now, they are probably getting emotionally flayed alive – I won't envy them for the next few weeks. I would, however, like to offer a suggestion to Microsoft – especially the script-all-the-things happy server division. Pick up the phone and call Luke Kanies over at Puppet Labs.

Ask him nicely for an education on why enforced states are better than scripts. Learn from those who have solved the problem of leaving the reputation of their flagship cloud service hanging on a single forgotten semicolon. ®
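For anyone wondering what "enforced state" means in practice, here is a toy sketch in Python. It is not how Puppet is implemented, and the setting names and values are hypothetical. The idea it illustrates is that you declare what the node should look like, compare that with what it actually looks like, and change only what differs, so running it a second time does nothing.

```python
def reconcile(desired, actual):
    """Return only the settings that must change for `actual` to match `desired`."""
    return {
        key: want
        for key, want in desired.items()
        if actual.get(key) != want
    }


def enforce(desired, actual):
    """Apply the missing changes to `actual` in place and report what changed."""
    changes = reconcile(desired, actual)
    actual.update(changes)
    return changes


# Hypothetical declared state for one web node.
desired = {"cert_serial": "NEW-2030", "https_binding": "enabled"}

# Hypothetical current state: the node still carries the old certificate.
node = {"cert_serial": "OLD-2029", "https_binding": "enabled", "hostname": "web01"}

first_run = enforce(desired, node)   # only the stale certificate is replaced
second_run = enforce(desired, node)  # nothing left to do
```

A script says "do these steps"; a declared state says "end up like this". The second form can be re-run every few minutes without harm, which is what lets it catch a node that has drifted.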