BOFH: One double espresso from meltdown
Total Component Fatigue
Episode 16 BOFH 2004
A man could go mad in this business.
One minute you’re hanging off the front of a mainframe shouting about how you’re king of the world - the wave of technology heralding an install which would make even the most hardened geek weep just from the ingenuity, the next moment you’ve got a SCSI card in your hand, not able to understand how, when you put it into a specific machine, it fails to see the devices connected to it.
You plug the card into another box, no problems, the devices make themselves known as they should. You plug it back into the machine it’s supposed to work on, nothing.
When hardware decides to misbehave, it really decides to misbehave.
Unlike the precocious child who will taunt you mercilessly, knowing just how to report the beating they deserve to their school teacher in a manner that will have you in Police custody before lunchtime, hardware is sneaky.
When hardware decides to misbehave, it starts out small. One tiny device doesn’t work properly, but everything else is working exactly as it should. You wander in completely unprepared, believing it’s a simple dud disk or a loose cable – because let’s face it, that’s what the statistics would suggest.
In today’s case the cable is fine, and replacing the disk has no effect.
I mentally toss up the possibility that it’s an addressing problem – something to do with that particular address - and change the address of the drive and probe the SCSI bus.
Another disk disappears.
I take a quick break to clear my rage and grab another coffee while I'm at it.
While I’m there, I realize that the new test address I chose must have conflicted with the address of the newly missing drive.
Then again, maybe not…
“What’s the matter?” the PFY asks, blundering into a situation that could escalate out of control at a moment’s notice if my temper doesn’t remain in check…
I fill him in on the sordid details, the Boss asking me to get the server up, me lightly saying that it would be up in half an hour, tops, and then the myriad of hardware upsets till now.
“So where are you at?”
“A hardware wizard has popped up on the desktop asking if maybe I want to remove some SCSI devices. But it only has a ‘Yes’ and a ‘No’ button.”
“What other buttons were you expecting?” he asks, voice laced liberally with sarcasm.
“The button saying ‘F-ck off. If I’d wanted a f-ing hardware wizard to read my f-ing mind, I would have configured the f-ing thing in the first f-ing place. Only I didn’t get an option to NOT install the f-ing hardware wizard, did I? NO, because someone at Operating System Central thought that every f-ing one would want a f-ing hardware wizard to make inane suggestions’,” I say.
“Ah THAT button,” the PFY says. “Say, how many double espressos have you had?”
“Three or Four. Why?”
“Well, I just noticed that you were a little – just a little, mind - testy, and maybe it’s time to take a break…”
“Yes, only I’m on a bit of a time budget with this box. Besides, the Espresso figure was only from this afternoon.”
“And how many this morning?”
“About ten I guess.”
“So you’re just taking your caffeine level past the medical definition of ‘stimulant’ into the ‘poisons’ category?”
“Whatever,” I blurt distractedly. “What’s coming up on the monitor now?” I ask, holding the cable in a semi-angled position.
“Yep, I can see them: 5 disks, addresses 1, 2, 3, 4 and 5.”
“Bingo – it’s the socket in the cabinet!” I say triumphantly. “One of the connectors in the plug must have moved slightly, probably because the cable’s been bent around the place a bit, putting a lot of stress on the socket and forcing open a contact!”
“So how would you see ANY disks on the Bus?” the PFY asks.
“Simple,” I say smugly, knowing my experience in this particular field is far superior to the PFY’s own. “The pin concerned is one of the addressing pins.”
“And so how come you had another drive disappear?”
“The physical stress which caused one pin on the connector to fail has most likely caused another to become intermittent.”
So simple when you know how. I put all the disks into a new box and throw out both the box and the old cable, just to be on the safe side.
“Up she comes!” the PFY says, powering up the disk box and rebooting the machine.
“And?” I ask, pushing in front of him.
. . .
“No disks found!” he gasps.
. . .
“So how’s that server going?” the Boss asks, wandering in after what he believes to be a safe interval.
“It’s a hardware problem,” the PFY says. “Uh… >flick< >flick< Transient Component Fatigue.”
“Really?” the Boss asks. “I’ve never seen that. Where’s the machine?”
“Well, there’s a bit of it over there in the corner, a bit of it under the desk, and some of it on the table.”
“Bloody Hell! It looks like it’s been hit with a sledgehammer!”
“Yes, TCF is particularly nasty,” I add, helping myself to another coffee. “Also known as TOTAL Component Fatigue. The box basically just falls to pieces.”
“Amazing! Although I did wonder if it might have been that batch of cheap cables we bought a couple of months back.”
“Cheap cables?” I gasp, as the blood rushes to my head and things start to go a little red…
. . .
"You people seen your Boss?" the Head of IT asks.
"Went home sick," the PFY says. "A touch of TCF."
"TCF," the Head chuckles. "You know back when I was an apprentice, that used to mean something had been hit with a sl... Oh. So who's going to be on the interview panel?"
Give him his due - for a computer crusty, he's a quick learner... ®