OVH goes TITSUP again while trying to fix its last TITSUP

Attempt to harden network failed, badly, so the call's gone out to Cisco for help

By Simon Sharwood, APAC Editor

Posted in Cloud, 7th December 2017 00:24 GMT

European web hosting outfit OVH has reported its second major outage and Total Inability To Support Usual Performance* in a month and admitted the new outage was caused by its attempts to fix the cause of the last one.

OVH attributed its November outages to power problems and cable cuts.

But this incident notice filed by CEO and founder Octave Klaba on Wednesday 6 December stated “the problem was related to a software bug on the equipment we use which caused the deletion of the configuration.”

The notice continued: “Since then we have updated the equipment on everything our network. Also to prevent this type of bug from never [sic] again causes a worry [sic] about our DCs, we have decided to divide equipment clusters into 3 on the RBX website. So, if we ever have again this bug, the configuration would only impact 30% traffic.”

The company planned to move to that new regime late on 6 December 2017, European time. But the changeover failed, causing connectivity problems and outages in Europe and beyond.

“During the preparation of the maintenance that was to start at 23:00, the configuration disappeared again at 8:20 pm and all the links were down again!!!!!” Klaba's notice said.

Those are Klaba's exclamation marks, by the way. It's understandable he used so many because the next sentence is: “The database has been deleted while we are using the latest software version. So there is another bug!”

Next step? “We look with Cisco to understand why all the links are not UP while the configuration was delivery to RBX.”

The outage has made for an ugly Status page at OVH, as depicted below from the time of writing.

OVH's Status Page: red lights everywhere.

If there's a small piece of upside in this incident, it's that it struck late in the European evening and continued into the small hours, times when traffic is low and some customers may not notice massive impact on their operations.

But there will also be plenty who were impacted, and irritated, and wondering why they give their business to a company that has also experienced flood damage and can't configure routers well enough to avoid this sort of thing.

OVH has promised to send The Register a statement about the incident. ®

* Total Inability To Support Usual Performance = TITSUP
