Lateral thought saves sizzling server

Game, set and crash

D'oh! I learned a long time ago that generating random numbers (really, truly random numbers) is a non-trivial exercise.

However, I completely failed to apply that computer science lesson to the real world of computing and continued to believe that events in the Newtonian world could happen without a cause. Such a belief system is not usually dangerous but, when applied to solving computer problems, it can be a serious disadvantage.

I came to my senses after I had spent a great deal of time trying to track down an intermittent fault on a NetWare server. It crashed. Then it ran fine for days. And then it crashed again. And again. Apparently, at random.

We would get a call from the local supervisor, Rosanne, whenever it crashed. By the time we got there (it was offsite), the server would reboot as if nothing had happened and run like a dream.

Sometimes it would run for weeks; other times it crashed three days running. The only correlation we could spot was that the crashes were always during the day, so in that sense it wasn't random. But since days happen seven times a week, every week without fail, that wasn't really a great help in diagnosing the problem and curing it.

And during the day there was absolutely no correlation with load. We concluded that the server was crashing at random and started the process of swapping parts (at random!) to try to cure it.

Then, on one of our frequent visits, Rosanne said jokingly that we really had to fix the problem because it was ruining her social life. The server crashed every time she played tennis with her new boyfriend. By this time we were desperate to find any correlation between the crashes and real life, so we rather startled her by resurrecting the Spanish Inquisition.

Was she serious? Well, er... not every time but, yeah, her boyfriend had pulled her leg that she was setting off her pager on purpose to avoid losing games, and she had realized that it did seem to happen all too frequently. How often did she play? Well, a couple of times a week, maybe; it depended.

On what? How did she decide to play? Well, both she and her boyfriend worked flexitime so whenever the weather was good, they booked a court and played a game. They made up the missing time by working an hour later in the evening.

Ace in the hole

So the server was crashing when the weather was good. OK, how do we define good weather in Scotland where this was all happening? It is good weather if the sun shines. What happens when the sun shines? The sky is bluer, there are fewer clouds, the humidity is probably lower... it gets hotter. Hmm.

Servers don't like heat. Where is the server? Sitting on a bench. In front of a south-facing window - this was in the days before server rooms, when air conditioning was provided only for mainframes.

So, Rosanne plays tennis when the weather is good, the sun shines and it's cooking the server. The Newtonian world is back in balance, yin has a yang and effect does have a cause.

I am, I like to think, at least slightly wiser now. I learned from that particular lesson that saying "It must be random" is another way of saying that I have yet to find the correlation. Worse than that, it's usually a cop-out.®

Doh! is Mark Whitehorn's look at the events, and lessons learned, that served him well during his computing career.
