Who the hell cares about five nines anymore?
The service-centric view of system management
Does anyone really care what “five nines” means anymore? For the record, it is 99.999 per cent availability, which means your business managers can founder in digital limbo for a little under a second each day.
That doesn’t sound too bad, does it? What about six seconds a week, or roughly five minutes a year?
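For anyone who wants to check the sums, here is a minimal sketch in Python. The function name `downtime_seconds` is ours, and a 365-day year is assumed.

```python
def downtime_seconds(availability_pct: float, period_seconds: float) -> float:
    """Permitted downtime, in seconds, over a period at a given availability."""
    return (1 - availability_pct / 100) * period_seconds

DAY = 24 * 60 * 60
WEEK = 7 * DAY
YEAR = 365 * DAY  # assumption: a non-leap year

# Print the downtime allowance that five nines leaves over each period
for label, period in (("day", DAY), ("week", WEEK), ("year", YEAR)):
    print(f"{label}: {downtime_seconds(99.999, period):.2f} s")
```

At five nines that works out to roughly 0.86 seconds a day, six seconds a week and 315 seconds, or a little over five minutes, a year.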
Talk is cheap
Back in the mid-80s, when businesses were expected to talk the IT department’s language, all this may have meant something. These days it is like discussing angels dancing on the head of a pin. Business managers just want the damn stuff to work when they need it so they can make money. End of story.
In the modern world, what does systems management look like? How can IT managers make sure that systems are serving the needs of business users, rather than just obsess about numbers?
“Five nines is an anachronism now,” says Bill Roth, executive vice-president of business intelligence firm LogLogic.
“People need uptime all the time, but the language of how users request service-level agreements has changed.”
These days, we want computing resources at hand whenever we want them, which Roth calls “effectively all nines”. In truth, of course, there will be many times when we don’t need our computing resources, for example overnight when staff are not around.
That means IT departments must look at the IT system as a whole, rather than focusing on individual components. Analysing events from multiple components in the IT infrastructure can give administrators a better idea of what is really happening and its effect on users. LogLogic builds parsers and semantic recognition systems to interpret not just the meaning of a particular error code, but its implications.
The problem for most administrators is that their systems have been developed over many years, says Hamish MacArthur, co-founder of analyst MacArthur Stroud. The result is a spaghetti nightmare.
“You end up with a situation where people have a whole bunch of different tools taking things in at different times,” he says. “There is an underlying complexity that virtualisation is exacerbating.”
The other part of the problem is that those tools often were not purchased by their users. “Those managing the infrastructure gave out promises to managers, based on the assumption that they know how the system is working so they don’t need to monitor what is happening,” MacArthur says.
IT departments have not always included the cost of additional systems management tools in the budget. But as businesses require more certainty about system performance, IT departments will find themselves needing to charge for both performance and systems management tools.
Bringing all of these tools together is harder than it sounds. Different systems management environments operate well for particular groups of applications, but administrators need to manage systems across entire collections of business units. They are required to maintain a picture of licensing costs as well as running costs.
The unified data centre movement, as typified by, say, Cisco’s UCS, goes only some way towards solving this problem.
The idea is that a single, integrated system can be more easily managed from a single point. Everything is pre-configured and should hum along nicely, presenting a holistic, bird’s eye view of events to the management console.
But these unified servers are limited in scope. Companies may deploy them incrementally, when expanding into a new data room or implementing a new set of applications, but they rarely use them to replace their entire computing infrastructure.
Stack 'em low
So organisations end up with a tightly honed cell of computing infrastructure as part of a chaotic, legacy environment. The same is true of integrated stacks such as VMware, EMC and Cisco’s VBlocks, NetApp’s FlexPod, or Oracle’s Exadata.
Perhaps that is why integrated stacks are selling in relatively low numbers, according to analysts.
Elevating our view of the system involves taking a more service-centric view of systems management, in which the applications provided to the user, rather than a particular hardware component, are the prime consideration. These services generally stretch across many components.
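One reason the service view matters: when a service depends on several components in a chain, its availability is lower than that of any single link. The rough sketch below assumes that failures are independent and that every component must be up for the service to work. The component names and figures are invented purely for illustration.

```python
from math import prod

# Hypothetical availability figures (as fractions) for the components
# that a single user-facing service depends on
components = {
    "web server": 0.9999,
    "application server": 0.9999,
    "database": 0.9995,
    "storage array": 0.9999,
}

def service_availability(availabilities) -> float:
    """Availability of a chain of independent components that must all be up."""
    return prod(availabilities)

service = service_availability(components.values())
weakest = min(components.values())

print(f"weakest component:  {weakest:.4%}")
print(f"end-to-end service: {service:.4%}")
```

Under those assumptions the end-to-end figure always lands below the weakest component, so counting nines on individual boxes overstates what the user actually gets.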
IT departments can divide their data centres into synchronous, asynchronous and standalone applications and services. The synchronous ones are the most critical in terms of uptime. The asynchronous applications may be able to stay down for hours at a time, and the standalone ones may be used for only a couple of days each month.
Compartmentalising systems in this way and looking at real-world user requirements will make systems management cheaper than simply trying to hit a set number of nines across the board.
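To make that concrete, here is a minimal sketch that gives each class of application its own target rather than one figure across the board. The targets and service hours are hypothetical, chosen only to illustrate the idea; none of them comes from Roth or MacArthur.

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    target_pct: float                # hypothetical availability target
    service_hours_per_month: float   # hours the service is actually needed

    def downtime_budget_minutes(self) -> float:
        """Minutes of downtime allowed per month within service hours."""
        return (1 - self.target_pct / 100) * self.service_hours_per_month * 60

tiers = [
    Tier("synchronous", 99.99, 24 * 30),  # needed around the clock
    Tier("asynchronous", 99.0, 24 * 30),  # can stay down for hours at a time
    Tier("standalone", 95.0, 2 * 8),      # used a couple of working days a month
]

for tier in tiers:
    print(f"{tier.name}: {tier.downtime_budget_minutes():.1f} min/month")
```

With these made-up numbers the synchronous tier gets a budget of a few minutes a month, while the asynchronous tier can be down for several hours without breaking its target.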
There are some key success factors in stitching together a “Franken-management” approach to service-oriented monitoring. Look out for another article on how to make it work. ®