What's wrong with network monitoring tools? Where do I start...

That red screen? It's just embarrassment

Opinion For as long as I can remember I've worked in an environment where there's a screen on the wall showing the status of the company's systems. Or actually, in one case, showing the status of the company's systems unless there was a test match on.

From time to time that information's been useful. Unfortunately, most of the time we've known that there's a problem because half a dozen users have called to raise tickets. The screens haven't necessarily updated in time, and when they have, I've had to work out in my head what it actually means for the business that port 12 on switch 3 has gone down.

I've seen dozens of monitoring packages, and they've all been hideously inadequate. Some have been hideously expensive alongside their hideous inadequacy. So why is this? Why does nobody write monitoring packages that actually monitor stuff and tell you what you need to know when you need to know it?

Dodgy protocols

To be fair to monitoring software vendors, they're off to a bad start because the tools available to them are simply appalling.

SNMP (the Simple Network Management Protocol – though frankly there's nothing simple about it) is unwieldy and clunky to use, but we're stuck with it because its longevity has made it ubiquitous. Let's face it, nobody with any sense is about to try to produce an alternative because the barriers to entry into the market are insurmountable.

WMI (Windows Management Instrumentation) is actually very good, but of course it's a Microsoft-only concept so you're stuck with using it only on your Windows estate. Finally you have Syslog... well, you can give a simple priority to each type of alert but the content is largely unstructured and so the usefulness is limited.
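To make the Syslog point concrete, here is a minimal Python sketch of how little structure a message gives you. The leading `<PRI>` number packs a facility and a severity together (facility × 8 + severity), and everything after it is free text. The sample line and the `split_priority` helper are made up for illustration; they aren't taken from any particular product.

```python
import re

# Severity names in numeric order, 0 (most severe) to 7 (least severe).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def split_priority(line):
    """Return (facility, severity_name, rest_of_message) for one Syslog line."""
    match = PRI_RE.match(line)
    if match is None:
        raise ValueError("no <PRI> header found")
    pri = int(match.group(1))
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity], match.group(2)


# Hypothetical message: the text after <187> is whatever the device felt like sending.
sample = "<187>Oct 11 22:14:15 switch3 %LINK-3-UPDOWN: Interface Gi0/12, changed state to down"
facility, severity, rest = split_priority(sample)
print(facility, severity)
print(rest)
```

The priority comes out cleanly; the part you actually care about (which interface, which state) is still a string you have to pick apart per vendor.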

Protocol-driven software

The next problem is that many monitoring engines are written by people who understand the protocols but have never really had to monitor anything in real life. So it's all oriented around comparing CPU usage with thresholds, alerting when a switch interface has gone down, and so on.

I've yet to use a monitoring tool that looks like the first step in its development was to send a bunch of analysts to interview network managers and say: “OK, what do you want to be able to do?”

Or if they have, they've gone back to the developers who've said: “Sorry guys, SNMP can't do that, we'll just have to make the dashboard prettier and hope people won't notice it's the same as before.”

So what would the analysts find? Let's imagine, then, that I'm an infrastructure manager and one of the aforementioned analysts descends on me for a couple of hours. What would I be saying I want? Well, here are my top 10.

1. Wildlife camera feature

The camera crews that follow Sir David Attenborough around are these days blessed with cameras that are constantly recording – the last few seconds/minutes of footage are retained and overwritten in a loop. When something interesting happens they hit the “Record” button and the last few seconds/minutes are committed to storage. This means they don't have to have the trigger finger of John Wayne on speed. I want that for my core network ports: when I have a problem, the traffic I care about is what has flowed for the past five, 10, 15 minutes so I want to retain it for a sensible amount of time.
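The loop-recording idea is easy to sketch. The Python below is a rough illustration only: `LoopRecorder`, the 15-minute window and the fake "packets" are all assumptions, and a real capture would be dealing with actual frames at line rate rather than strings.

```python
from collections import deque


class LoopRecorder:
    """Keep only the last `window_seconds` of traffic until someone hits Record."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._buffer = deque()

    def add(self, timestamp, packet):
        """Store a packet and discard anything older than the window."""
        self._buffer.append((timestamp, packet))
        cutoff = timestamp - self.window_seconds
        while self._buffer and self._buffer[0][0] < cutoff:
            self._buffer.popleft()

    def commit(self):
        """The 'Record' button: return a copy of what is currently retained."""
        return list(self._buffer)


# Simulate 20 minutes of traffic, one fake packet a minute, with a 15-minute loop.
recorder = LoopRecorder(window_seconds=900)
for second in range(0, 1200, 60):
    recorder.add(second, f"packet-at-{second}s")

saved = recorder.commit()
print(len(saved), saved[0], saved[-1])
```

When the problem hits, `commit()` hands back the traffic from the preceding window; everything older has already been overwritten.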

2. Filter by device

If a switch lights up red on the monitoring screen, I want to click on it and pop up the alerts and Syslog entries that relate to it. If a port lights up I want to see that data filtered for that port.
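As a sketch of what that click should do behind the scenes, assume alerts and Syslog entries have already been normalised into one list of events. The `Event` fields and the sample data here are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    device: str
    port: Optional[int]  # None when the event is about the whole device
    source: str          # "alert" or "syslog"
    text: str


def events_for(events, device, port=None):
    """Everything for a device, or only one port on it when `port` is given."""
    return [
        e for e in events
        if e.device == device and (port is None or e.port == port)
    ]


# Hypothetical events, as they might look after normalisation.
events = [
    Event("switch3", 12, "alert", "interface down"),
    Event("switch3", 12, "syslog", "link state changed to down"),
    Event("switch3", None, "syslog", "configuration saved"),
    Event("switch7", 2, "alert", "high error rate"),
]

for event in events_for(events, "switch3", port=12):
    print(event.source, event.text)
```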

3. Muppet detector

I want the network monitoring package to tell me that the end-to-end connection between a virtual server and the backup server is inefficient because one of the eight or 10 LAN ports the traffic is traversing hasn't got Jumbo Frames turned on.
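The check itself is trivial once the tool knows which ports the traffic crosses, which is the hard part. A minimal Python sketch, assuming the path has already been discovered and that 9000 bytes is the MTU you expect for Jumbo Frames (the port names and values are hypothetical):

```python
def find_mtu_mismatches(path_ports, expected_mtu=9000):
    """Return the (port, mtu) hops whose MTU is below the one the path needs."""
    return [(name, mtu) for name, mtu in path_ports if mtu < expected_mtu]


# Hypothetical path between a virtual server and the backup server.
path = [
    ("esx-host1:vmnic2", 9000),
    ("switch1:Gi0/1", 9000),
    ("switch1:Gi0/24", 9000),
    ("switch3:Gi0/24", 1500),  # the muppet
    ("switch3:Gi0/12", 9000),
    ("backup1:eth0", 9000),
]

for name, mtu in find_mtu_mismatches(path):
    print(f"{name} is set to MTU {mtu}, expected 9000")
```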

4. Which way?

I want to see (visually and legibly) the path used by traffic between two endpoints. That means understanding what the load balancer is doing, figuring out which of the physical nodes in a Virtual Router Redundancy Protocol group is carrying the traffic, and so on. And when you've done it, show me the step-by-step operation of the application traffic so I can see where the delays are (and do it at application level, please, so that I can see that, say, the network is fast but the app is being killed by DNS timeouts).
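For the step-by-step part, the output I'm after is no more complicated than this Python sketch, which assumes the tool has already timed each stage of a transaction. The stage names and numbers are invented to match the DNS example.

```python
def slowest_step(steps):
    """Return the (name, seconds) pair that took longest."""
    return max(steps, key=lambda step: step[1])


def breakdown(steps):
    """Return (name, seconds, share_of_total) for each step, in order."""
    total = sum(seconds for _, seconds in steps)
    return [(name, seconds, seconds / total) for name, seconds in steps]


# Hypothetical timings for one application request, in seconds.
steps = [
    ("DNS lookup", 3.0),
    ("TCP connect", 0.004),
    ("Request sent", 0.001),
    ("Server response", 0.045),
]

for name, seconds, share in breakdown(steps):
    print(f"{name:<16} {seconds:>7.3f}s {share:>6.1%}")

print("Slowest:", slowest_step(steps)[0])
```

The network stages barely register, and the DNS lookup accounts for nearly all of the delay, which is exactly the thing a port-level view would never show.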
