'The server broke and so did my back on the flight to fix it'
Epic tale of idiot sysadmin and the trek to clean up the mess left behind
On-Call Reg readers are awesome and none more so than JT Smith, who sent us an epic tale of the time he was called out to solve a backup problem.
It started with a chap called “Hubswitch”, who was in charge of technology at a remote site that JT oversaw.
Hubswitch earned his name because he kept using the word to describe any piece of kit he thought might be the problem. His peers back at HQ, JT among them, decided the nickname should stick.
JT says Hubswitch had a habit of “fixing” things he didn't completely understand, which of course meant breaking them. But after dispatching one of his trusted team members, JT was satisfied all should be well on Hubswitch's site.
About five months later, JT tells us, a panicked call from the site manager brought news that the server was down and Hubswitch couldn't get it working.
JT didn't worry too much: Hubswitch may not have been the sharpest pencil in the drawer, but he did have the skills to rebuild a server. However, a couple of hours later, JT recalls: “Hubswitch called me next to ask for some 'advice'.”
“I asked him what had happened. Apparently that morning he had come in and found the server off and hit the power button, and it just wouldn't boot. It was an OS X server and it had the ominous blinking disk drive which indicated it couldn't find a boot disk.
“We talked through it a bit and I decided to confirm all of my assumptions and started with the most obvious: 'So, your backups are disconnected to make sure they're protected, right?'”
“I don't have backups,” was Hubswitch's reply. “After your assistant left I decided I really needed to know how the server was set up so I redid it. I haven't had time to set up the backups again. So we really need to get this drive working. Do you think you can help me with that?”
JT's next call was to the site manager, whom he told to start finding files wherever they could be located because things looked bad.
Firing Hubswitch on the spot was an option, but JT and the site manager decided to do that once the mess was sorted.
But Hubswitch saved them the trouble by walking out, so embarrassed was he by his error.
All of which meant that JT had to get on a plane to sort things out.