Sysadmin shut down server, it went ‘Clunk!’ but the app kept running
So what did our reader turn off? And what was it running? Oh dear …
Who, me? Hello? Anyone there? We understand that plenty of you in the northern hemisphere might not bother this week. For those of you who are still working, welcome to another instalment of “Who, me?”, The Register’s confessional column in which readers reveal their worst mistakes.
This week meet “Rick” who told us that in the mid-2000s he worked for a local government “and had responsibility for their Solaris Estate.”
“In those halcyon days we did not patch servers unless there was a pressing business need. However, we had an issue with our payroll system and a kernel patch was indicated as being a possible fix,” Rick recounted.
“I duly negotiated some down time to do so and started early one morning during one of our agreed ‘at risk’ periods.”
Rick told us that he followed instructions and “shut down to single user mode as Sun recommended and applied the relevant patch - not even the whole patch cluster!”
“After much longer than I expected the patchadd command exited and I did a reboot.”
“Being a bit less experienced than I am now I did a proper shutdown rather than doing the simple halt command recommended by the grizzled contractor I had working with me.” Again, Rick worried because the server “took ages to go down”. As the console “showed all the disks unmounted I decided to risk a hard reset.”
“I duly pushed and held the power switch for five seconds until there was a distinct clonk type noise.”
But even though the server had gone down, Rick’s console session was still running. And so was the payroll app he’d just patched.
The local government authority’s housing app? Not so much.
“It turns out I had chosen the wrong button on the two servers next to each other and shut down the housing system instead of payroll, which was already down.”
How did Rick get out of this mess?
“I hastily switched the KVM to the other server and watched as it did an fsck on all the disk volumes before starting Oracle up fine. Payroll had by then managed to start its POST and booted fine as well.”
Rick felt duty-bound to report his mistake, and happily he reported it to “the application lead, who just laughed. He was a very tolerant type and didn't take it any further.”
Rick’s story has a mixed ending: he kept the job for another eight years before being outsourced into redundancy. So off he went, older, wiser and having lived the lesson he imparted to “Who, me?” as “always look before doing anything irrevocable!”
Have you turned off the wrong thing? Tell us what it was and just how wrong the results were by clicking here to write to Who, me? Good’uns stand a decent chance of popping up here on a future Monday. ®