This Damn War
I was the senior systems administrator (in fact the only IT person, though not the IT "manager", as that would have entailed a whole new level of paperwork for the client) and had been tasked with moving our central office to a new site one kilometre away.
As a joint venture, we were moving into the majority owner's existing office in West Perth, and I'd been liaising with the corporate integration team based in Melbourne. To start planning, I listed all the services coming into our existing office and which of them would need to be transferred, taking particular care over the internet service.
Installation of the internet service was scheduled well before the move so that we could be sure we would not be left offline. Phone services were identified and passed on to the team handling the phones, as these would no longer run on our existing PBX but on the Telstra TIPT service.
Power for the rack of servers (a pair of Hewlett-Packard blade servers and storage arrays, along with numerous switches and routers) was supplied via two 5kVA UPSes with external battery packs. The need for these to be on separate power circuits was discussed, and the conclusion reached was that the new office's existing UPS had sufficient capacity for the UPSes in our rack to cascade off it.
This turned out to be the fatal error in our planning. After alerting staff that we would be taking down the head office infrastructure from 4pm on the Friday, we documented everything, shut it down and started unracking equipment. As we were moving both the rack and the equipment, we decided that emptying the rack first would be the easiest way to move it.
Professional tip: never have carpet in a comms room - it makes moving racks impossible and is a potential fire risk. We put all the equipment in one car, and a standard APC rack just fitted in the back of a 150 series Landcruiser Prado. Mind you, the driver needs to be very skinny and must not have to stop in a hurry. We arrived at the new office, installed the rack and re-racked all the servers.
By mid-afternoon on the Saturday, everything was up and running and tested OK. We congratulated ourselves on a successful move and had a day off.
Come Monday morning, all hell broke loose. As staff came in and started logging in, the load on the servers, and with it the electrical draw, increased until the electrical breaker for the UPS tripped. Once the batteries ran down, all the equipment stopped, and I started receiving phone calls from staff as I travelled to work, complaining that nothing was working.
And so began the most frustrating week of work ever. When the breaker was reset, all the servers tried to start up at once, causing another instant outage. This time, the Exchange data store also shut down (crashed) in a dirty state.
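One common way to avoid that kind of start-up surge is to bring equipment back in waves rather than all at once. The sketch below is only an illustration, not what was done on site: the `Device` class, the `plan_startup` function and every wattage figure are hypothetical, and it assumes each device's power-on draw is at least as high as its steady-state draw.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    inrush_watts: float   # hypothetical peak draw while powering on
    running_watts: float  # hypothetical steady-state draw afterwards


def plan_startup(devices, limit_watts):
    """Group devices into power-on waves.

    Within each wave, the steady draw of everything started in earlier
    waves plus the inrush of the current wave stays at or below
    limit_watts. Devices are taken in the order given.
    """
    waves = []
    settled = 0.0          # steady draw of devices from earlier waves
    wave, wave_inrush = [], 0.0
    for dev in devices:
        # Close the current wave if this device would push it over the limit.
        if wave and settled + wave_inrush + dev.inrush_watts > limit_watts:
            waves.append(wave)
            settled += sum(d.running_watts for d in wave)
            wave, wave_inrush = [], 0.0
        # Even alone in a fresh wave, the device must fit.
        if settled + dev.inrush_watts > limit_watts:
            raise ValueError(f"{dev.name} cannot start within {limit_watts} W")
        wave.append(dev)
        wave_inrush += dev.inrush_watts
    if wave:
        waves.append(wave)
    return waves


# Illustrative figures only, not measurements from the rack in this story.
rack = [
    Device("storage", 1500, 800),
    Device("blade1", 2000, 1200),
    Device("blade2", 2000, 1200),
    Device("switch", 300, 150),
]
for number, wave in enumerate(plan_startup(rack, limit_watts=4000), start=1):
    print(number, [d.name for d in wave])
```

In practice the waves would be enforced with power-on delays in the servers' firmware or on a switched power distribution unit, with enough of a pause between waves for the inrush to settle.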
To resolve the issue, the UPS units in our rack were moved to separate power circuits instead of cascading off the main UPS.
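The capacity question that was waved through in planning can be roughed out in a few lines. This is a minimal sketch under stated assumptions: the 32A breaker, the 230V supply and the 80 per cent continuous-load derating are illustrative guesses rather than figures from the site, and the only number taken from the story is the pair of 5kVA UPSes. The function name `breaker_headroom_va` is likewise made up for the example.

```python
def breaker_headroom_va(breaker_amps, volts, loads_va, derate=0.8):
    """Return spare capacity in VA on one circuit (negative means overloaded).

    derate is the fraction of the breaker rating treated as usable for
    continuous load; 0.8 is a common rule of thumb, assumed here.
    """
    usable_va = breaker_amps * volts * derate
    return usable_va - sum(loads_va)


# Worst case: both rack UPSes fully loaded and fed from a single circuit.
# Breaker size and voltage are hypothetical.
both_on_one = breaker_headroom_va(32, 230, [5000, 5000])

# After the fix: each UPS on its own circuit.
one_each = breaker_headroom_va(32, 230, [5000])

print(f"shared circuit headroom: {both_on_one:.0f} VA")
print(f"separate circuit headroom: {one_each:.0f} VA")
```

A negative result for the shared circuit is the paper version of what happened on the Monday morning; a real assessment would use measured draw and include whatever else already hangs off the upstream UPS.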
The takeaway: always provision extra UPS capacity when moving gear, and make sure each UPS is on its own power circuit. Also, if you can, use redundant UPSes and dual power supplies on as much equipment as possible, so that should one UPS drop out, another can take the load. ®
Got war stories of your own you’d like to share? Drop us an email to email@example.com with the subject This Damn War.