
How UK air traffic control system was caught asleep on the job

We reveal the touchy culprit behind major NATS glitch


A big outage that struck Britain's air traffic control system on Saturday was due to a technical fault with a touch screen interface provided by Frequentis, The Register has learned.

On Saturday 7 December, during the run-up to one of the busiest times of the year for the UK's airports, controllers at the NATS (National Air Traffic Services) operations room in Swanwick noticed that their system had suddenly stopped working.

It quickly became clear that a major problem was unfolding, one that went on to delay thousands of passengers on flights into and out of Blighty's airspace over the weekend.

By midday on a typical Saturday, NATS would expect to be handling around 2,000 flights. But on the Saturday just gone, it was forced to reduce that load by 20 per cent while its engineers rushed to resolve the technical cockup.

NATS - which bills itself as a "public private partnership" between its own staff (holding 5 per cent), seven major airlines (holding 42 per cent), operator LHR Airports Ltd (4 per cent) and the UK government (holding a 49 per cent "golden share") - initially, and rather vaguely, said the flaw was connected to an internal telephone system that is used by air traffic controllers.

Naturally, El Reg sought more technical details about what had gone wrong.

"The outage on Saturday was caused by a problem with a Frequentis system that enables our controllers to talk to other parts of the operation," a spokesman at NATS said.

"It uses a touch screen interface that automatically loads all the contacts - around NATS and in other agencies involved in the air navigation network - that a controller will need for the particular piece of airspace that they’re controlling at that time.

"It therefore ensures they can always immediately reach the person they need to speak to and will reconfigure itself with settings specific to the sector that the controller is responsible for when they log in for their shift."
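The behaviour the spokesman describes, a panel that loads a sector-specific contact list when a controller logs in, can be sketched roughly as follows. This is a minimal illustration only and not how the Frequentis system is implemented: the sector names, the contact entries, the `SECTOR_CONTACTS` table and the `Position` class are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical data: sector names and contacts are invented for illustration.
SECTOR_CONTACTS = {
    "sector-a": ["Adjacent sector B", "Airport tower X"],
    "sector-b": ["Adjacent sector A", "Neighbouring ATC agency Y"],
}


@dataclass
class Position:
    """One controller working position with a touch-screen contact panel."""

    controller: str
    sector: Optional[str] = None
    contacts: List[str] = field(default_factory=list)

    def log_in(self, sector: str) -> None:
        """Reconfigure the panel for the sector the controller is taking over."""
        # Validate before changing anything, so a failed reconfiguration
        # leaves the previously loaded contact list in place.
        if sector not in SECTOR_CONTACTS:
            raise KeyError(f"no contact configuration for {sector!r}")
        self.sector = sector
        self.contacts = list(SECTOR_CONTACTS[sector])


position = Position(controller="controller-1")
position.log_in("sector-a")
print(position.contacts)
```

In this sketch the lookup is checked before the position is modified, so a sector with no configuration raises an error rather than leaving the panel half-configured.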

But during Saturday's routine shift change, the system – which has been used by NATS for 11 years – collapsed, forcing the controllers to ground aircraft while engineers attempted to fix the error.

It's understood that the touchscreen telephone system failed to reconfigure correctly, which prevented new controller positions from being opened to split out the extra sectors needed for daytime airspace control.

Delays were reported at airports in cities including London, Cardiff, Edinburgh, Glasgow and Dublin. NATS said at the time that the glitch had not compromised passenger safety, but some questioned why contingency measures didn't fully kick in when the system failed.

NATS said on Saturday:

The technical and operational contingency measures we have had in place all day have enabled us to deliver more than 80 per cent of our normal operation. The reduction in capacity has had a disproportionate effect on southern England because it is extremely complex and busy airspace and we sincerely regret inconvenience to our airline customers and their passengers.

To be clear, this is a very complex and sophisticated system with more than a million lines of software. This is not simply internal telephones, it is the system that controllers use to speak to other ATC agencies both in the UK and Europe and is the biggest system of its kind in Europe.

It added that it had worked closely with Frequentis to get the system up and running. But by Monday morning, following a weekend of political pressure about the outage, NATS boss Richard Deakin admitted that an inquiry into the resilience of the UK airspace was needed.

"We are keen to do all we can at NATS to ensure the aviation industry has a full understanding of the capability that is in place in the UK and to take any further steps our customers and regulators decide are necessary to help avoid a repeat of last Saturday’s problems," he said.

Deakin added that the error took 14 hours to resolve and claimed that NATS eventually "delivered over 90 per cent of an extremely busy schedule of flights during the day".

It was the first time such a serious technical flaw had occurred since the system was installed in 2002, he said.

But we can't help but agree with exasperated folk stranded at airports over the weekend who - quite reasonably - asked how such a failure could have happened in a critical system in the first place. Redundancy, much? ®
