Issue
At 12:20 p.m. today, December 10, 2010, power to the Bio Ag Engineering Building equipment room was interrupted and then quickly restored. The interruption affected all systems and services supported in that facility, including three email servers, UMCal, the U of M homepage, telephone service for the Saint Paul and printing services area, the data network, storage systems, X.500 servers, authentication, and other backup systems. Some services were not affected because redundant systems were in use.
Cause
During a semiannual test of the fire control system in the OIT equipment room in the Bio Ag Engineering Building, the system failed: the emergency power-off circuit was inadvertently energized as part of the test, causing a six-minute loss of power. The root cause identified at this point is human factors.
Timeline of Events
- 12:20 p.m. Power was lost
- 12:26 p.m. Power was restored
- 12:35 p.m. PBX and data network services were restored
- 12:45 p.m. Remaining services and applications were being recovered by system administrators
After power was restored, affected systems and services returned to normal operation one at a time as their system administrators brought them back online.
Follow-Up Actions
- Review procedures for testing the fire control system
- Review change control process
- Evaluate the fire control system for failures
- Review efficiency of service restoration
- Review outage communication issues