Original URL: http://www.theregister.co.uk/2012/07/13/o2_outage_cause/
O2 outage outrage blamed on new Ericsson database
Single point of access became single point of failure
O2 has fixed its poorly mobile network, so now everyone can start asking what went wrong and what the company is going to do about it.
O2's press office isn't responding to queries. However, our understanding from various sources is that the day-long outage was caused by the transition of subscribers' details to Ericsson's Centralized User Database, which disappeared during the process, leaving handsets unable to authenticate their users.
The CUDB is supposed to consolidate user records, support additional applications, and provide a single point of access – but on this occasion it appears that it also provided a single point of failure. With the database unavailable, mobiles and other devices were gradually booted off the network.
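For readers who want to see how a single point of access turns into a single point of failure, here is a minimal sketch. It is a toy model, not Ericsson's CUDB or its API: the class `CentralUserDatabase`, the function `attached_after_outage` and the idea that each device re-authenticates at a scheduled hour are all assumptions made for illustration. Under those assumptions, devices drop off one at a time as their re-authentication comes due and the lookup fails, which is one plausible reading of "gradually booted off".

```python
from dataclasses import dataclass, field


class DatabaseUnavailable(Exception):
    """Raised when the (hypothetical) central subscriber store cannot be reached."""


@dataclass
class CentralUserDatabase:
    # Hypothetical stand-in for a consolidated subscriber store.
    records: dict = field(default_factory=dict)
    available: bool = True

    def lookup(self, subscriber_id: str) -> str:
        # Every authentication goes through this one call: the single point of access.
        if not self.available:
            raise DatabaseUnavailable("central store unreachable")
        return self.records[subscriber_id]


def attached_after_outage(db: CentralUserDatabase, reauth_due: dict, hours: int) -> list:
    """Return, hour by hour, the sorted list of subscribers still attached.

    reauth_due maps a subscriber id to the hour at which that device must
    re-authenticate (an assumed mechanism, not taken from the article).
    """
    attached = set(reauth_due)
    timeline = []
    for hour in range(hours):
        for subscriber, due in reauth_due.items():
            if subscriber in attached and due == hour:
                try:
                    db.lookup(subscriber)
                except DatabaseUnavailable:
                    # No fallback store exists, so the device is dropped.
                    attached.discard(subscriber)
        timeline.append(sorted(attached))
    return timeline
```

Calling `attached_after_outage` with `available=False` and staggered re-authentication hours shows the attached set shrinking hour by hour, while the same call with `available=True` leaves everyone connected.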
O2 started outsourcing its radio network to Ericsson in 2009, with the Swedish tech giant taking responsibility for field maintenance and switching sites. The relationship between the two companies extends into all areas of the business.
The telecoms-kit industry is all but an oligopoly, with NSN, Alcatel-Lucent and Ericsson dividing the bigger contracts between them. Huawei is the newcomer to the party, and recently signed a deal with O2 for next-generation kit, but it wasn't involved in this particular screw-up. Other parts of the network come from Cisco and the usual players, but core functionality is specialist enough to keep the industry small.
Now that Ericsson has damaged O2 so badly, one should expect to see these telecoms rivals hanging around the O2 offices in Slough for the next few weeks, publicly saying the network failure could happen to anyone while privately briefing that it wouldn't have happened with their kit.
When it comes to users, it's the biggest customers that O2 will be worried about now. In the aftermath of a nationwide breakdown, most network operators would be terrified that the virtual operators who piggyback on their infrastructure will switch networks, taking hundreds of thousands of customers away instantly.
But O2 owns half of its biggest virtual mobile operator, Tesco Mobile, and all of its next largest, GiffGaff, so they won't be going anywhere.
Ordinary customers will demand compensation, but O2 has no obligation to provide any. Corporates with O2 contracts should negotiate hard when it comes to renewal: 21 hours of downtime is a big stick to beat the salesman with.
GiffGaff chucked £10,000 to charity after an (unrelated) outage took it down for eight hours last month – this time GiffGaff can legitimately blame O2, but whether the parent will feel similarly obligated to cough up some cash, we still don't know.
Ericsson told El Reg in a statement: "As a key supplier we have been working closely with O2 to restore the service to their customers and to identify the cause of the fault." ®