Going strictly hands-off: Managing your data centre from afar

Techniques for saving your sanity, and your job


If your core servers – and hence your core applications – live in a data centre, then by definition they're not on your premises.

In many cases they may be hundreds of miles away – in fact, in a previous life, my employer's most distant data centre was six time zones away in the US Midwest.

This means that you don't have the option of wandering into the server room and power-cycling something; instead you need to work hard to make your systems manageable from afar.

Documentation

Absolutely core to the remote management of data centres is accurate, complete and rigorously updated documentation. You can't just nip in and have a peek at stuff, and so you need to be able to rely completely on the documentation for information about the system.

Every connection – power, serial, LAN, the lot – needs to be rigorously documented and a regime of capital punishment initiated to deter people from not updating the docs when something changes.

It only takes one undocumented power or LAN change to make your world fall apart when you confidently but inadvertently disable something crucial because the docs differed from reality.

Similarly, document the front and rear panels of all the devices in the cabinets, along with the possible statuses of all the flashing lights and what each means: we'll come to why in a moment.
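One way to make that connection documentation enforceable (a minimal sketch – the device and port names below are hypothetical, and your own inventory fields will differ) is to hold it as structured records rather than free-form notes, so a script can flag any entry with a missing label or endpoint before you authorise a change:

```python
from dataclasses import dataclass, fields

@dataclass
class Connection:
    """One documented cable run; every value shown in use is hypothetical."""
    kind: str         # "power", "serial" or "lan"
    label: str        # the text printed near each end of the cable
    from_device: str
    from_port: str
    to_device: str
    to_port: str

def undocumented(connections):
    """Return connections with any blank field - candidates for the
    capital-punishment regime mentioned above."""
    return [c for c in connections
            if any(not getattr(c, f.name).strip() for f in fields(Connection))]
```

Running `undocumented()` over the whole inventory as part of a pre-change checklist catches the gap before it catches you.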

On the flipside of documentation is the labelling of everything in the data centre cabinets. Unless you're geographically very close to your data centre, you're likely from time to time to call on the data centre provider's staff to do something for you – install a new LAN connection, or maybe fit a replacement hot-swap power supply when an old one dies.

So give your devices names, and label them on the front and back. Label every cable a few inches from each end (not right at the end – you won't be able to get at the labels to read them).

And this is why I've had you document the front and rear panels: if your server has two hot-swap power supplies and one dies, you need to be absolutely certain to tell the provider's “intelligent hands” person which one to pull out.

And of course because you documented all the LED status options, you can get him or her to double-check before pulling: “It's the one on the left, but before you pull please confirm that the light's flashing yellow, as that signifies it's the failed unit.”
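That LED legend can itself live in the documentation as data. A sketch (the statuses and their meanings here are invented for one imaginary server model, not any real vendor's) that turns the legend into an unambiguous instruction for the remote-hands technician:

```python
# Hypothetical PSU LED legend for one server model - check your own
# vendor's hardware guide for the real statuses.
PSU_LED = {
    "solid green": "healthy unit",
    "flashing yellow": "failed unit",
    "off": "unit with no input power",
}

def remote_hands_instruction(position: str, expected_status: str) -> str:
    """Build the confirm-before-pull phrase from the documented legend."""
    meaning = PSU_LED[expected_status]
    return (f"Pull the {position} power supply, but before you pull please "
            f"confirm its LED is {expected_status}, as that signifies the "
            f"{meaning}.")
```

Generating the phrase from the same legend the docs use means the words you read down the phone always match what's printed in the runbook.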

Monitoring

Next on the list we have another core aspect of stuff being a long way away: you can't just whack another disk into the box if you run out of space.

There are so many monitoring tools on the market – and so many free ones – that there's no excuse for not monitoring your data centre to death, both to check that everything's healthy and to do capacity planning and usage trending for key resources.

Run up proper monitoring, preferably in a form that doesn't rely on the data centre being fully functional. How can you send an alert that everything went down if the monitoring server's on one of the boxes that went down? Maybe you could even look to one of the many cloud services that offer system monitoring.
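At its simplest, the kind of external reachability check such a setup performs can be sketched in a few lines (the service names and ports below are placeholders; a real deployment would run this from outside the data centre and feed the alerts into whatever paging system you use):

```python
import socket

def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_services(services: dict) -> list:
    """Given {name: (host, port)}, return alert strings for anything down."""
    return [f"ALERT: {name} ({host}:{port}) is unreachable"
            for name, (host, port) in services.items()
            if not is_reachable(host, port)]
```

The crucial property isn't the check itself but where it runs: on a box whose survival doesn't depend on the data centre it's watching.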




Biting the hand that feeds IT © 1998–2018