Original URL: http://www.theregister.co.uk/2012/08/01/how_can_banks_stop_it_crashes_happening_again/

RBS must realise it's just an IT biz with a banking licence

Expert drills into what it'll take to prevent another bank technology fiasco

By Anna Leach

Posted in CIO, 1st August 2012 10:02 GMT

Analysis Banks need to start thinking of themselves as IT companies, said David Chan of City University London.

"A senior banking technologist has said to me: 'A retail bank is nothing but an IT company with a banking licence'," Chan told The Reg. "While this may seem extreme, when one looks at the economics of any retail bank, it is clear that this is the case."

But it was IT that tripped up financial institutions this summer: a high-profile system crash left 17 million customers of Royal Bank of Scotland (RBS), including its Natwest and Ulster Bank components, without access to their money for a week in July - three weeks in some cases. Two debit card-related screw-ups happened last week at building society Nationwide and at RBS-owned Natwest.

While Chan considers it unlikely that we'll see another catastrophe on the scale of the RBS meltdown, he does see the recent spate of technical cock-ups as evidence that the cut and cut-some-more approach to costs over the past two years, added to underinvestment during the past 20, is starting to hit customer-facing services. It's a pattern we may be familiar with, he explained:

When Sony got hacked, I imagine the guys in Playstation security had been saying for years [that] 'We need to invest more in this'. Or at Microsoft in the 1990s, I'm sure there were guys saying, 'We have got to spend more in patching holes in Windows', and senior management were telling them to add new features.

And with people shifting banks post screw-up, IT performance is having an impact on customer loyalty for almost the first time in decades.

What RBS really got wrong: its mentality

When RBS CEO Stephen Hester faced a committee of MPs in the aftermath of the banking disaster, he said what The Reg had pieced together at the time - that the root cause of the problem was "a routine software upgrade managed and operated by our team in Edinburgh, which caused the automated batch processing software [CA-7] to malfunction".

The Royal Transaction Processing Company of Scotland

But Chan, who is the director of his university's Centre for Information Leadership, suggested the bank's catastrophe goes beyond a piece of software or one butter-fingered operative: he zeroed in on organisational mentality as the long-term systemic cause of RBS's woes.

A routine procedure shouldn't kill a bank's core operation for three weeks, and the actions of one employee or team of employees shouldn't affect 17 million customers, he said:

I think the fact that batch scheduling uses CA-7 is irrelevant to the root cause of this particular failure. People do make mistakes but the processes that were in place and the expertise in place had been diluted. The RBS problem is ultimately symptomatic of a culture within many large organisations where the senior team believe that they can rule by “dictat” without having to understand the consequences of their decisions.

And this is what makes offshoring to Chennai, in south-east India, a problem: although there is nothing intrinsically wrong with it, outsourcing can exacerbate bad management. "There's more of a bureaucratic culture in India," said Chan:

This isn't true of everyone but in general engineers over there are more likely to do what the boss says: "Ok, the boss wants something cut, so we'll cut it." This is opposed to coming back to him and arguing against a decision.

That means that when management makes a bad call, there is less chance the executives will be challenged and more chance their boardroom edicts will be carried out.

"If you hire IT people in a support service function, you get a support service mentality," Chan said.

The cost of the cuts: deskill your workforce at your peril

And it's not just a culture issue: there's a trend towards pigeonholing folk into particular roles within a process, ensuring that they do one or two things well and need not worry about the rest of the machine. It's an approach that's at odds with old-school IT pros who were expected to know the tricks and every turning cog of an entire system. Chan said:

RBS staff had always had a fair number of people working in all functions who had a holistic view of the processing. This not only resided in the architectural teams but also in operational teams. Traditionally, development teams were responsible for delivering a transition plan with roll-back procedures included. Operational staff had enough experience to challenge such plans and ensured that these plans were robust.

One well-placed source close to the banking industry told us: "Systems programming has become a little de-skilled in recent years, since software from IBM and third-party vendors (for example, CA) usually just works. They are black boxes now, so there is little ability to make a real-risk assessment, unless you do expensive things like parallel runs for a few nights to compare and contrast."
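For readers wondering what the "parallel runs" mentioned by the source might look like in practice, here is a minimal sketch in Python. It assumes, purely for illustration, that the existing batch job and the upgraded one each produce a table of end-of-night balances keyed by account; the function name `compare_runs` and the sample account numbers are hypothetical and are not drawn from RBS or CA-7.

```python
from decimal import Decimal


def compare_runs(production, candidate):
    """Return the accounts whose end-of-batch balance differs between two runs.

    Both arguments are dicts mapping an account ID to a Decimal balance.
    An account missing from one run is reported with None on that side.
    """
    diffs = {}
    for account in sorted(production.keys() | candidate.keys()):
        old = production.get(account)
        new = candidate.get(account)
        if old != new:
            diffs[account] = (old, new)
    return diffs


# Hypothetical output of tonight's batch on the current software...
production = {
    "A001": Decimal("120.50"),
    "A002": Decimal("75.00"),
    "A003": Decimal("10.00"),
}
# ...and of the same batch replayed on the upgraded software.
candidate = {
    "A001": Decimal("120.50"),
    "A002": Decimal("57.00"),
}

mismatches = compare_runs(production, candidate)
for account, (old, new) in mismatches.items():
    print(f"{account}: production={old} candidate={new}")
```

An empty result for several consecutive nights would be the evidence that lets an operations team sign off an upgrade; any mismatch is a reason to stop before the change reaches live accounts. The expense the source refers to comes from running the whole batch twice on separate infrastructure, not from the comparison itself.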

Our mole reiterated Chan's point: deskilled bods working to a set of written procedures can work fine 99 per cent of the time, but during that one per cent when things go wrong, they won't know what to do and will exacerbate any delays. If the guy who spots a problem doesn't know who to approach, recovery is stalled, which causes more problems in a moment of crisis. Our source said:

This is where institutional knowledge comes in. No matter how good your written procedures, something which escalates up and down hierarchies of people to be notified is not going to work quickly.

So. How can they stop that meltdown from happening again?

An ex-RBS worker told us that he feared that during the clear-up, the banking giant would plaster on more processes and more "risk management". And that seems to be what Stephen Hester has in mind - that's what he told the MPs, anyway:

The investigation will address the effectiveness of our risk management systems, including the identification of low probability/high impact events and their mitigation. It will also assess contingency planning and business resilience i.e. whether other systems within RBS Group are at risk of similar incidents.

But risk management at RBS is already unwieldy, and may in fact have contributed to the crisis, the former employee argued: "If you want to focus on the real problem that caused this it is the overly restrictive change control, as counter-intuitive as that may seem."

Our contact said that, in a move unrelated to the massive cock-up in July, RBS "management added another two layers of change control" - effectively another rulebook of procedures to ensure updates to a system are formally agreed and rolled out as planned.

"I like to think I've been around the block, and I can safely say that the change management in RBS is the most complex I have ever come across - so much so I believe it is counter-productive," the source said.

That dissatisfaction with bundles of red tape was echoed by two other RBS workers who spoke to The Reg. Of course change management is an essential part of developing and maintaining an IT system. But unwieldy rules, designed by people who were neither fully aware of the technical hurdles nor appreciative that the system is based on "quirky" mainframes of yore, combined with lower-skilled employees and bosses who weren't listening, seem to have created a situation where a problem, when it occurred, ballooned out of control.

Can they spare some change?

A spokesman for RBS told The Reg in response to our findings: "We're carrying out a full and detailed investigation into the causes of the incident. We plan to share the findings of this and will be happy to comment further once this is published."

Banks will need to rethink the position of IT in their businesses, said Chan: "They can get it right, it's when they apply the squeeze in the wrong place that we get these problems."

And in the case of overhauling banking systems it may take someone with bottle, a lot of money and some serious long-term thinking: as a banking expert suggested, it would be like changing all four engines on an airplane mid-flight. ®