IT Failures In The Great US Blackout
A report issued by a panel of US government and power industry officials has placed the blame for the largest power outage in North American history primarily on computer and human failures, writes Robin Bloor of Bloor Research. FirstEnergy of Ohio and the Midwest Independent Transmission System Operator, the regional agency responsible for overseeing FirstEnergy, are roundly criticized.
According to the report, the cascade of failures was caused in part by FirstEnergy's failure to maintain its transmission lines by trimming nearby trees. The problem began when a live power line was shorted to earth by a tree. When power lines carry heavy loads, the wires heat up and sag, which may be what happened. However, this should not have cascaded into a wider crisis.
As the crisis unfolded, a computer program that should have set off alarms in FirstEnergy's control room failed. Subsequently the computer system itself failed, and then the back-up/disaster recovery system failed. As a result, operators in the control room had no clear idea of what was happening. According to the report, FirstEnergy's computer maintenance staff failed to tell the control room of the failure for over an hour. FirstEnergy disputes this, saying that the control room staff actually informed the computer staff of the failure. Meanwhile, the Midwest I.S.O. was having trouble with its "state estimator", a computer program that reports on whether the electricity grid is in trouble. According to the report, a technician turned the program off, tried to fix it, forgot to turn it back on and then went to lunch.
The report does not mention whether the MSBlast worm/virus, which was active at the time of the blackout, caused or contributed to any of these failures. However, it is unlikely that it would be mentioned. Computer systems within the North American energy grid are supposed to be secure, and a failure of security is, after all, just another failure of a system. There has been some suggestion on the Web that MSBlast was a contributory factor or even the prime cause of the blackout. The report does say that conditions on the grid were not abnormal when the cascade of failures began.
In any event, it is clear that computer system failure was at the heart of the problem and, given the estimated billions of dollars in costs to businesses from the blackout, it seems certain that IT audit and compliance will now be strongly enforced. Quite rightly so.