BOFH: Uninterruptible patsy supply
'We are having a special this week on proton charging and storage of the beast'
Episode 9
"What the fuck just happened?" the Boss garbles, crashing around Mission Control like a madman after dashing down two flights of stairs from the 4th floor boardroom.
"Uh.... UPS failure," the PFY says calmly, glancing up from his monitor briefly.
"Well aren't you going to do anything about it?"
"I am," he responds. "I have to shut down everything on the working UPS units so that I can synchronise and start everything up again from scratch. If I just bypass the broken UPS there'll be all sorts of problems."
Which, on the Pinocchio scale of lies, is a sixteen-foot piece of dowel with some nostrils cut in the end. The only true thing the PFY said is that there is a UPS failure...
"What sort of problems?" the Boss snaps back.
"Well the server-client systems will be all out of whack for a start but the replication of data is bound to fall over - then there's the electrical problems - phase imbalances, hysteresis loops, the possibility of brown-outs and spikes, not to mention the power factor correction that we'd need to put in place."
Utter bollocks. None of that crap is going to happen and the PFY knows it. In actual fact, all that's happened is a cheap UPS in a comms cupboard two floors above us has failed, and instead of failing into bypass – like any reputable UPS that you paid more than three shiny beads for – it just turned off.
All because the Boss, in his wisdom, decided that instead of "wasting" all that money upgrading the UPS feed from the server room to the comms risers for the increased POE switch demands, he'd simply order some standalone UPS units for every comms room in the building.
And so it was that a box of 20 3KVA UPS units arrived late last week.
That's right, a box, not a pallet. That's the first concern. Your average 3KVA UPS weighs enough to have the courier driver dousing himself in Brut 33 before attempting to haul it out of the back of his van, but the whole boxlot of these babies was easily wheeled into Mission Control on a sack barrow. They've probably got the life expectancy of a box of chocolates at a Weightwatchers weigh-in.
One of my personal warning signs on gear is when I get a brand name the same as the device – ie, with this UPS brand UPS from UPS Ltd, I'm fairly sure I'm in trouble. Perusing the copious documentation (a sheet of badly photocopied paper in poorly worded English warning me about supplying the connector of the utility in the reversal format), I note the complete lack of company name, model number or service information.
This should be good.
Where your average UPS would have bar graphs to indicate battery capacity and load, this one appears to have a bulb behind a piece of green plastic with boxes crudely stencilled on it.
The supply transiting operator (power switch) had just the sort of tactile response you'd expect from a thin piece of plastic breaking under finger pressure as the device fired up (without the fire, surprisingly).
And they were installed.
It was only later that the PFY discovered that if you liven up the network monitoring port from a PoE switch, the thing would overheat and crap out in about five minutes...
It's probably a design feature.
"Well what are you going to do meantime? We're trying to give a presentation to the board! It's very important!"
"I know it is," the PFY says. "You told me that three days ago and so I've made all the infrastructure you need super-reliable by putting UPS units inline for all the gear. There's one in the ceiling powering the datashow, one at the distribution switch, one on the fibre transceiver and even one on the fridge that's keeping the drinks cool - although according to Nagios that one stopped responding late last night."
"Which one is broken now?"
"When you say now do you mean now or 10 minutes ago when you were in the boardroom? >clickety<" the PFY asks.
"I... 10 minutes ago."
"That would be the UPS on the distribution switch."
"And... what.. about now?" the Boss asks.
"Ah, well, the Datashow one stopped responding about a minute ago."
"Damn it!" the Boss snaps.
"Everything on the 6th floor is working OK if you wanted to move everyone up there?" the PFY suggests. "You could get them to meet you up there while I put your presentation on a USB stick."
"Yes! Yes, let's do that."
The Boss makes a quick call to the 4th floor boardroom while the PFY drags some mundane PowerPoint files onto a USB drive.
"Right, 6th floor!" the Boss says, dashing for the door.
"Oh there's no hurry," the PFY says.
"No hurry? Why?"
"It looks like the UPS unit on the lift controls has stopped responding so it will have stopped. Looks like it's in between 4 and 5."
"WHY THE F*** DID YOU PUT A UPS UNIT ON THE BLOODY LIFT CONTROLS!!!?!?"
"Because the lift is critical infrastructure. You wouldn't want the lift stopping between floors because of a panel failure and trapping someone in there. Especially between 4&5, because the 5th floor door has the broken emergency access lever so they'd have to put a ladder down from 6 to get the board members out."
"Can't you put it in bypass?"
"Yes, well, I tried that on the one that failed yesterday but the unit just started smoking and an orange Ghostbusters light came on. Besides, it's on the roof of the lift car and that's stuck between 4&5 and I'd need a ladder to get to it..."
"Ffffff...!" the Boss blurts again, dashing to the door of Mission Control.
"So the lift really is stuck?" I ask.
"Oh yeah, it's as stuck as the automatic door release on our d... >crash< ..oor," the PFY says, and the Boss slumps to the floor.
"Yeeeeeessss," the PFY continues, firing up the console port of the basement distribution switch. "We're about 12 power inline static max commands away from a complete building shutdown."
"Time is awasting!" I yell, noticing the impending approach of lager o'clock.