We all fall together. Azure MFA takes a tumble for the second week running

Microsoft and the Terrible, Horrible, No Good, Very Bad Day

Updated In a touching show of solidarity with its Exchange Online cousin, Microsoft’s Azure Multi-Factor Authentication (MFA) service has fallen over and is struggling to get back up. Again.

If Microsoft hasn’t developed an AI bot capable of filling its social media orifices with apologies yet, then it is surely only a matter of time before it does so.

A Microsoft engineer, fingers doubtless weary from writing up last week’s fiasco, took to the Azure status page to admit that, yes, as of 14:25 UTC today, MFA was having problems. But it's ok – it's only a “subset” of customers. The Windows giant went on to warn that those who had MFA required by policy might experience intermittent issues signing in to Azure resources.

These resources include Azure Active Directory. Can you hear the admins wailing?

MFA is undoubtedly a good thing, since it requires users to present two or more forms of authentication rather than just a password. A phone, dongle or biometrics can come into play as well. Assuming the MFA service is actually running, of course.

The issue, which is worldwide, comes hot on the heels of the publication of a root cause analysis into the incident last week, which saw a trio of failures that led to users being unable to access their beloved Office 365 services.

At the time, Microsoft said it would endeavour to prevent a recurrence of the problem by looking at how it handled testing and updates, and by reviewing ways of containing failures before they kick off.

Hopefully that review didn’t take long, because there is a failure happening right now that sure needs some containment.

In the meantime, some unkind customers have suggested applying the solution that worked last time. You know: turn it off and turn it on again.

We contacted Microsoft to find out what had become of the service and the lessons learned from last week, but have yet to receive a response. ®

Updated to add

According to Microsoft, "After a preliminary investigation, engineers found that an earlier DNS issue triggered a large number of sign-in requests to fail, which resulted in backend infrastructure becoming unhealthy."

And yes, the outage was tackled, and systems restored, after switching equipment off and on again: "After the DNS issue was resolved, engineers then focused on cycling the relevant backend services to resolve the congestion issue. They observed a decrease in the failure rate after the reboot cycles."

A full postmortem will be released in the next few days.




Biting the hand that feeds IT © 1998–2019