Expired cert... Really? #O2down meltdown shows we should fear bungles and bugs more than hackers

Incompetence is a kind of malware

Pulling the plug

Comment It's a bit of a cliche that "everything's connected", but O2's stunning outage yesterday – chalked up by Swedish kitmaker Ericsson to an expired software certificate – is a reminder of how true that is.

Payment terminals croaked, bus displays went blank. Strangers blinked at each other in the street, like Robinson Crusoe waking up at Westfields. Fearful drivers cancelled their journeys, and took the train instead. We only realise how pervasive machine-to-machine (M2M) mobile data connections are in our lives when they stop working, and they’re only going to become more pervasive.

It would be prudent of the tech and networking worlds to realise how stupid it is to design systems that don't have an offline fallback, but then their customers tend to believe the "always-on" promises until they go off.

Yes, the data outage is easy to mock (so easy, I've just done it myself in fact). But the next such outage may cause power or utility outages, all for lack of a fallback mode. And it could be even worse. We've been told that the low-latency modes of 5G are required for V2X (vehicle-to-everything), which the car of the near future will need to help it down the road. The question of why we need V2X isn’t asked very often. We must have it.


"The very idea that your vehicle will trust the data that (ostensibly) came from someone else's and act on the data in a way that could result in injury is mind-boggling," says Ken Tindell CTO of car security startup Canis Automotive Labs. And when that goes down?

Tindell blames the "MVP" (minimum viable product) mentality that Silicon Valley has introduced. It's harmless enough when Bikini apps fail, but deadly serious now that the "fail often" culture is being brought into our homes and cars.

Lessons to be learned. Then forgotten...

I actually think the outage is very well timed, provided it gives people pause for thought. Systems and appliances cannot take the cloud or the connection for granted, and must have a fallback mode. Perhaps that fallback mode needs to be mandated. Regulations aren't only for bananas.
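What a fallback mode means in practice can be sketched in a few lines. The Python below is a minimal illustration, not anybody's real product: the `FallbackDisplay` class, the `fetch_live` callable and the one-hour cache limit are all hypothetical names and values chosen for the example. It shows a bus-stop display that serves the last known data, clearly labelled as stale, when the connection drops, and degrades to a static message once that data is too old to trust.

```python
import time
from typing import Callable, Optional


class FallbackDisplay:
    """Hypothetical bus-stop display that degrades gracefully when offline."""

    def __init__(self, fetch_live: Callable[[], str], max_cache_age_s: float = 3600.0):
        # fetch_live is an assumed callable that raises ConnectionError or
        # TimeoutError when the mobile data link is down.
        self._fetch_live = fetch_live
        self._max_cache_age_s = max_cache_age_s
        self._cached: Optional[str] = None
        self._cached_at: float = 0.0

    def render(self, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        try:
            value = self._fetch_live()
        except (ConnectionError, TimeoutError):
            # Fallback 1: show the last good value, labelled as stale.
            age = now - self._cached_at
            if self._cached is not None and age <= self._max_cache_age_s:
                return f"{self._cached} (offline, last updated {int(age)}s ago)"
            # Fallback 2: nothing trustworthy left, so say so plainly.
            return "Live times unavailable - see printed timetable"
        # Online path: remember the value for the next outage.
        self._cached, self._cached_at = value, now
        return value
```

The design choice worth noting is that the offline path is ordinary, tested code rather than an unhandled exception: the display never goes blank, it only becomes less precise.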

Then there's the commercial side. Leaving aside the M2M, the number of people affected surprised some: it's 31.2m (at O2's last count). That's a very big number. Several of the most popular mobile virtual network operators (Sky, Tesco, Lyca) rent O2's network, and Giffgaff, which is the part of Telefonica that likes to pretend it isn't, uses it too.

Giffgaff gag


Now you can't magic a nationwide network out of thin air, and with four mobile network operators, the UK is more fortunate from this perspective than most nations, which have three. So you could mandate that critical services fail over from the dysfunctional network onto one which is still working. Which is a great idea - until you think about it. It risks creating a domino effect, with the likelihood that in areas of patchy coverage, each network fails in turn, until nobody at all has a signal. This had to be patiently explained to the brains trust around David Cameron four years ago, when he demanded action because his calls in the Cotswolds kept failing. Once the fuss died down, National Roaming was quietly buried.

The fact is that building and operating a nationwide network requires huge capital expenditure, and we have to trust that it doesn't go down. But at least the operators have got a vast range of network equipment suppliers to choose from, right?

Actually, no. There are only three major network equipment suppliers who matter, and the ever-thoughtful and reflective POTUS would rather there were two. As Donald Trump was conducting trade talks with China, the US was having the daughter of the founder of one of those three arrested. I am not convinced that a choice of two is a great thing for any market.

While the "Spook Industrial Complex" frets about cybersecurity - rightly, of course, for that's what we pay it to do - it draws attention away from the everyday bugs and bungles that cause the outages that inconvenience millions of people. One is hypothetical, but the other is real and an increasingly regular occurrence. And this applies across our national infrastructure. I wonder how much end-of-life kit is open to years-old vulnerabilities? Or how many routers have 123 as a password? More than you think.
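An expired certificate is about as everyday as bungles get, and the check that catches one is not exotic. The sketch below is an assumption-laden illustration rather than a description of how Ericsson or O2 manage certificates: the `expiring_soon` function, the 30-day warning window and the inventory entries are all made up for the example. Given a mapping of certificate names to expiry times, it lists those that have already expired or will do so within the window, soonest first.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple


def expiring_soon(
    not_after: Dict[str, datetime], now: datetime, warn_days: int = 30
) -> List[Tuple[str, int]]:
    """Return (name, days_left) for certificates expiring within warn_days.

    Already-expired certificates are included with a negative days_left.
    """
    threshold = now + timedelta(days=warn_days)
    flagged = [
        (name, (expiry - now).days)
        for name, expiry in not_after.items()
        if expiry <= threshold
    ]
    # Most urgent (most negative or smallest days_left) first.
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical inventory: names and dates are invented for illustration.
inventory = {
    "core-node": datetime(2018, 12, 6, tzinfo=timezone.utc),
    "billing": datetime(2019, 6, 1, tzinfo=timezone.utc),
    "old-kit": datetime(2018, 11, 20, tzinfo=timezone.utc),
}
report = expiring_soon(inventory, now=datetime(2018, 12, 1, tzinfo=timezone.utc))
```

Run daily against a complete inventory, a report like this turns an outage into a ticket. The hard part in practice is the inventory, not the arithmetic.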

It just isn't as sexy. Journalists prefer the hacking angle because they love the thrill of a briefing from a spook. Papers and TV get to use the stock photo of the guy in the hoodie at a computer. (It's always a hoodie). And what could be more dull than a failover regulation or (real) ISO standards compliance? Of course we need to do both, but I bet the next outage will be a bungle, not a backdoor. ®




Biting the hand that feeds IT © 1998–2018