Thunderstruck: Azure Back in Black(out) after High Voltage causes Flick of the Switch
Lightning storm Shook Texas facility All Night Long
Microsoft is blaming bad weather for the massive outage that knocked a number of Azure cloud services and Visual Studio Team Services offline on Tuesday.
The Windows giant revealed its South Central US facility in Texas was crippled after severe storms and lightning strikes overloaded its cooling equipment, forcing its servers and other machines to shut down. In short, the facility was hit with a power surge and, faced with a choice between powered-down computers and melted ones, its automated safeguards pulled the plug and took the site offline amid the emergency.
“A severe weather event, including lightning strikes, occurred near one of the South Central US data centers,” Microsoft said.
“This resulted in a power voltage increase that impacted cooling systems. Automated datacenter procedures to ensure data and hardware integrity went into effect and critical hardware entered a structured power down process.”
We note that heavy storms are rolling into Texas as category-one Hurricane Gordon menaces America's Gulf coast. A real cloud striking a fake cloud, as it were.
The shutdown not only knocked out services, databases, and virtual machines hosted at the Texas facility, but also took out the Azure Active Directory service starting at around 09:30 UTC. This, in turn, led to the problems that plagued Azure service users in Europe, the US, and elsewhere in the world, particularly those running Visual Studio Team Services, for much of the day.
“Customers with organizations outside of the South Central US region may also be experiencing impact with their CI/CD workflows, dashboards due to some internal infrastructure dependencies,” the Visual Studio Team Services group said of its part in the outage.
Microsoft now says it is working to get the facility back online and recover as many of the machines as possible, though a number of services remain offline at time of writing.
“Engineers have successfully restored power to the datacenter. Additionally, engineers have recovered a majority of the impacted network devices,” Microsoft said.
“While some services are starting to see signs of recovery, mitigation efforts are still ongoing.”
Redmond plans to provide further updates on the recovery this afternoon, US Pacific time. ®