Excuses, excuses: Furious MPs probe banking TITSUPs*
*Terrible IT Threatens Services, Users, Pound
MPs have stuck a probe in banking IT crises after an "astonishing" number of failures, saying "measly apologies and hollow words" aren't good enough.
The influential Treasury Committee has launched the inquiry off the back of a spate of outages, most notably the TSB meltdown that lasted for almost a week in April.
The committee has repeatedly pointed out that it is vital banks provide reliable, resilient online services, since high street branch closures mean customers are forced to rely on internet banking.
"Do you realise the reputational damage this has done, not just to TSB, but to online banking in this country?" chair Nicky Morgan asked TSB's former CEO, Paul Pester, back in May.
"Please don't underestimate the scale and the concern with which bank customers generally are watching this."
Announcing the inquiry today, Morgan said the number of IT failures in recent years is "astonishing" and noted that millions of customers have been affected by the disruptions.
"Measly apologies and hollow words from financial services institutions will not suffice when consumers aren't able to access their own money and face delays in paying bills."
The inquiry – which is taking submissions until 18 January – will assess common causes of operational incidents, the prevalence of single points of failure and legacy tech, and risks associated with integrating banks' IT systems.
Other issues include how outsourcing affects resilience, what an appropriate level of tolerance for operational disruptions should be, and how outages harm customers. In June, TSB admitted that 1,300 customers had been defrauded as a result of its botched IT migration.
The Committee is also likely to consider whether regulators are able to ensure firms are properly protecting against service disruption or have the skills to hold the various actors to account after a serious incident. Such questions come as the UK's Financial Conduct Authority and the Bank of England's Prudential Regulation Authority have made plain that regulation is a possibility.
The committee's inquiry will be of little surprise to those who have watched the MPs become increasingly frustrated as bank after bank suffered outages that prevented customers from using services.
After an outage at Visa in June, and then a spate of outages at Barclays, RBS, TSB, HSBC and Cashplus in September, the committee began writing furious letters to the institutions demanding answers beyond the mealy-mouthed excuses handed out on Twitter.
Responses from Barclays, Cashplus, RBS and Visa have been published today, offering up details on who knew what, when; which systems were affected and why; and how they are improving.
Some are more contrite than others, but all issued apologies and promises to do better next time – we've summarised their responses below.
Barclays: Corrupted messages a 'rare and unexpected' incident
CEO Jes Staley kicked off (PDF) by downplaying the incident as a "partial system disruption" affecting "some" customers. He apologised for the glitch, but leaned on the fact that "no hardware or software can be 100 per cent fail safe".
The incident – during which the bank would have expected 89,000 logins, or 1.3 per cent of daily digital traffic – was caused by an "extremely rare compatibility issue between two software systems", he said in a more detailed document (PDF).
The compatibility issue corrupted messages passing between a cheque imaging technology platform and other financial servicing systems, affecting "critical central messaging infrastructure" that communicates with a range of applications and services.
Due to the "unique nature" of the incident, corrupt messages were sent across all four of the separate copies of mission-critical data that should have offered resilience.
Barclays said it was the result of a change implemented the day before, which it claimed had run successfully in pre-testing. The bank added it was working with vendors to figure out why the issue arose after "rigorous testing".
Cashplus: It was FIS wot did it
CEO Rich Wagner was today upfront in admitting that the outage affected 14,375 customers and that Cashplus has already paid out more than £64,000 in compensation.
He noted that during the three-day incident, 95 per cent of transactions were processed normally – but said this was "not trying to justify the poor service".
The borkage – which saw intermittent problems get worse over several days – was blamed on third-party payment processor FIS changing a database configuration during planned maintenance on one of its UK data centres.
FIS reportedly deployed the change without issue in test and, initially, in production environments, but increased load then caused performance problems.
NatWest and RBS: Network firewall fail
The shortest of the letters, from Ross McEwan, opened with the standard apology, an acknowledgment of the crucial role banks play in the economy and a promise no customers would be left out of pocket.
The incident was caused by the incorrect implementation of a network firewall rule update. After this was identified, the change was reversed and services restored.
The bank said both its primary and secondary peer review control for all network firewall changes "incorrectly concluded" that the change was valid.
In the short term, the bank is adding an additional check to the process, and in the long term it said migration to a new generation of network infrastructure would increase resilience.
Visa: We've already migrated off Euro processing system!
Visa explained the reasons for the failure that took out payment services across Europe back in June – fingering a "very rare" partial network switch failure in one of its two data centres.
At the time, it said it was planning to migrate its European processing activities to its global processing system, VisaNet. The letter published today claims that migration was completed in September without a hitch.
It also provided the committee with the findings (PDF) of a review by Ernst & Young on the June incident, saying it had accepted the recommendations in full.
The recommendations include changes to management processes and crisis management protocols, as well as technical work to assess the resilience of VisaNet. ®