Software bug contributed to blackout

No, nothing to do with Blaster, or Microsoft

A previously-unknown software flaw in a widely-deployed General Electric energy management system contributed to the devastating scope of the August 14th northeastern U.S. blackout, industry officials revealed this week.

The bug in GE Energy's XA/21 system was discovered in an intensive code audit conducted by GE and a contractor in the weeks following the blackout, according to FirstEnergy Corp., the Ohio utility where investigators say the blackout began. "It had never evidenced itself until that day," said spokesman Ralph DiNicola. "This fault was so deeply embedded, it took them weeks of poring through millions of lines of code and data to find it."

The flaw was responsible for the alarm system failure at FirstEnergy's Akron, Ohio control center that was noted in a November report from the U.S.-Canadian task force investigating the blackout. The report blamed the then-unexplained computer failure for retarding FirstEnergy's ability to respond to events that led to the outage, when quick action might have limited the blackout's spread.

"Power system operators rely heavily on audible and on-screen alarms, plus alarm logs, to reveal any significant changes in their system's conditions," the report noted. FirstEnergy's operators "were working under a significant handicap without these tools. However, they were in further jeopardy because they did not know that they were operating without alarms, so that they did not realize that system conditions were changing."

The cascading blackout eventually cut off electricity to 50 million people in eight states and Canada.

The blackout occurred at a time when the Blaster computer worm was wreaking havoc across the Internet. The timing triggered some speculation that the worm may have played a role in the outage -- a theory that gained credence after SecurityFocus reported that two systems at a nuclear power plant operated by FirstEnergy had been impacted by the Slammer worm earlier in the year.

Instead, the XA/21 bug was triggered by a unique combination of events and alarm conditions on the equipment it was monitoring, DiNicola said. When a backup server kicked in, it also failed, unable to handle the accumulation of unprocessed events that had queued up since the main system's failure. Because the system failed silently, FirstEnergy's operators were unaware for over an hour that they were looking at outdated information on the status of their portion of the power grid, according to the November report.
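The danger of a silent failure of this kind can be illustrated with a small sketch. It is not GE's code and has no connection to the XA/21 system; the class name `AlarmWatchdog`, the `max_silence_s` threshold and the injected clock are all assumptions made for illustration. The idea is that an alarm processor reports a heartbeat at regular intervals, and a separate check flags the display as stale once the heartbeats stop, so that silence itself becomes a visible condition.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AlarmWatchdog:
    """Hypothetical staleness check for an alarm processor (illustrative only)."""

    max_silence_s: float                        # assumed tolerance before data counts as stale
    clock: Callable[[], float] = time.monotonic  # injectable so the check can be tested
    last_heartbeat: Optional[float] = None

    def heartbeat(self) -> None:
        """Called by the alarm processor each time it finishes handling events."""
        self.last_heartbeat = self.clock()

    def is_stale(self) -> bool:
        """True if no heartbeat has ever arrived, or the last one is too old."""
        if self.last_heartbeat is None:
            return True
        return self.clock() - self.last_heartbeat > self.max_silence_s


if __name__ == "__main__":
    # Simulated clock so the example runs instantly.
    now = [0.0]
    watchdog = AlarmWatchdog(max_silence_s=5.0, clock=lambda: now[0])

    watchdog.heartbeat()          # alarm processor is alive at t=0
    now[0] = 3.0
    print(watchdog.is_stale())    # False: last heartbeat was 3 seconds ago

    now[0] = 60.0                 # processor has hung; no further heartbeats
    print(watchdog.is_stale())    # True: operators should be told the data is old
```

In a sketch like this the check would have to run outside the process it monitors, since a watchdog that shares the failed component's fate would go quiet along with it.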

The root cause of the outage was linked to a variety of factors, including FirstEnergy's failure to trim back trees encroaching on high-voltage power lines. FirstEnergy says its problems were some of many issues destabilizing power flow in the northeast that day, and that its role in the outage is overstated in the interim report.

On Tuesday, the North American Electric Reliability Council (NERC), the industry group responsible for preventing blackouts in the U.S. and Canada, approved a raft of directives to utility companies aimed at preventing a recurrence of the outage. One of them gives FirstEnergy a June 30th deadline to install any known patches for its XA/21 system.

FirstEnergy says it already patched the blackout bug last fall, when GE made a fix available, and is in the process of replacing the XA/21 with a competing system -- a changeover that was planned before the blackout.

NERC spokesperson Ellen Vancko said the organization would release a more comprehensive list of recommendations next month that would likely instruct all U.S. and Canadian electric companies using GE's XA/21 system to install the patch.

"That blackout report will go into much greater detail and will more broadly address the entire industry, whereas this particular report addressed the specific actors involved in the blackout, as well as some specific actions NERC had to take," Vancko said.

GE Energy declined repeated requests for comment on the bug.

Copyright © 2004
