Software bug contributed to blackout

No, nothing to do with Blaster, or Microsoft

A previously-unknown software flaw in a widely-deployed General Electric energy management system contributed to the devastating scope of the August 14th northeastern U.S. blackout, industry officials revealed this week.

The bug in GE Energy's XA/21 system was discovered in an intensive code audit conducted by GE and a contractor in the weeks following the blackout, according to FirstEnergy Corp., the Ohio utility where investigators say the blackout began. "It had never evidenced itself until that day," said spokesman Ralph DiNicola. "This fault was so deeply embedded, it took them weeks of poring through millions of lines of code and data to find it."

The flaw was responsible for the alarm system failure at FirstEnergy's Akron, Ohio control center that was noted in a November report from the U.S.-Canadian task force investigating the blackout. The report blamed the then-unexplained computer failure for retarding FirstEnergy's ability to respond to events that led to the outage, when quick action might have limited the blackout's spread.

"Power system operators rely heavily on audible and on-screen alarms, plus alarm logs, to reveal any significant changes in their system's conditions," the report noted. FirstEnergy's operators "were working under a significant handicap without these tools. However, they were in further jeopardy because they did not know that they were operating without alarms, so that they did not realize that system conditions were changing."

The cascading blackout eventually cut off electricity to 50 million people in eight states and Canada.

The blackout occurred at a time when the Blaster computer worm was wreaking havoc across the Internet. The timing triggered some speculation that the worm may have played a role in the outage -- a theory that gained credence after SecurityFocus reported that two systems at a nuclear power plant operated by FirstEnergy had been impacted by the Slammer worm earlier in the year.

Instead, the XA/21 bug was triggered by a unique combination of events and alarm conditions on the equipment it was monitoring, DiNicola said. When a backup server kicked in, it also failed, unable to handle the accumulation of unprocessed events that had queued up since the main system's failure. Because the system failed silently, FirstEnergy's operators were unaware for over an hour that they were looking at outdated information on the status of their portion of the power grid, according to the November report.
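
To make "failed silently" concrete, the sketch below shows one generic way a monitoring feed that has stopped updating can be flagged as stale: a heartbeat timestamp checked against a time limit. It illustrates the general idea only. It is not the XA/21 design or GE's fix, and the class name, method names, and 30-second threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlarmFeedWatchdog:
    """Hypothetical watchdog: flags an alarm feed as stale when it goes quiet."""

    max_silence_s: float           # illustrative threshold, not from any real system
    last_heartbeat_s: float = 0.0  # time of the most recent heartbeat, in seconds

    def heartbeat(self, now_s: float) -> None:
        # Called by the alarm process each time it completes a processing cycle.
        self.last_heartbeat_s = now_s

    def is_stale(self, now_s: float) -> bool:
        # True when no heartbeat has arrived within the allowed silence window.
        return (now_s - self.last_heartbeat_s) > self.max_silence_s


watchdog = AlarmFeedWatchdog(max_silence_s=30.0)
watchdog.heartbeat(now_s=100.0)

print(watchdog.is_stale(now_s=120.0))  # 20 s of silence, within the limit
print(watchdog.is_stale(now_s=200.0))  # 100 s of silence, past the limit
```

The check has to run outside the process it watches. A process that has stalled cannot be relied on to report its own failure.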

The root cause of the outage was linked to a variety of factors, including FirstEnergy's failure to trim back trees encroaching on high-voltage power lines. FirstEnergy says its problems were some of many issues destabilizing power flow in the northeast that day, and that its role in the outage is overstated in the interim report.

On Tuesday, the North American Electric Reliability Council (NERC), the industry group responsible for preventing blackouts in the U.S. and Canada, approved a raft of directives to utility companies aimed at preventing a recurrence of the outage. One of them gives FirstEnergy a June 30th deadline to install any known patches for its XA/21 system.

FirstEnergy says it already patched the blackout bug last fall, when GE made a fix available, and is in the process of replacing the XA/21 with a competing system -- a changeover that was planned before the blackout.

NERC spokesperson Ellen Vancko said the organization would release a more comprehensive list of recommendations next month that would likely instruct all U.S. and Canadian electric companies using GE's XA/21 system to install the patch.

"That blackout report will go into much greater detail and will more broadly address the entire industry, whereas this particular report addressed the specific actors involved in the blackout, as well as some specific actions NERC had to take," Vancko said.

GE Energy declined repeated requests for comment on the bug.

Copyright © 2004
