Visions of a hands-free data centre
Time for improvements
In our recent Regcast A DC That Takes Care of Itself, one question came up several times: how well does the type of hardware monitoring advocated by the panel integrate with the monitoring capabilities of other parts of the data centre?
There are two aspects to this question: first, how applicable to your hardware are the monitoring tools currently being delivered? Second, can you use them to build a dashboard to monitor the health of your service, or just the health of your servers?
The ability of a private or hybrid cloud to provide the degree of flexibility and efficiency promised by vendors depends on what our friends at Freeform Dynamics call "holistic operational visibility": a single environment that can see how well you are using all your resources so you can monitor them across every device you manage, inside and outside the data centre.
At a basic level, this helps to define services based on measures of quality that the business can recognise. It means that the need for troubleshooting becomes easier to spot and also easier to predict. And if the IT department wants to edge towards chargeback accounting, it makes that possible too.
Well, that’s the vision. Your reality shows that we are not there yet.
Freeform analyst Tony Lock, known as Dr Stats to regular Regcast viewers, has long experience at the sharp end of data centre management.
Sticky tape and glue
“Readers of The Register have told us time and again that they use a wide variety of management tools every day,” he says.
“At the moment these systems have little integration. IT staff themselves are the integration solution that glues management tools together and allows business services to be delivered effectively.
“Our surveys over several years highlight that many IT professionals would like to have tools that allow them to administer servers, storage and networking from a single console, preferably with security bundled in as well.”
Freeform’s latest report on the topic, A Vision for the Data Centre, was completed in December 2012 and confirms previous research.
About 40 per cent of Reg readers believe that a consistent way to manage data across hosted and in-house systems will take more than 10 years to arrive, if ever. That is a larger group than those who have already achieved this or who believe it will be possible within three years.
So to the first question: can your monitoring tools provide an adequate picture of your entire server estate? We heard from HP on the Regcast about the effort it is putting into building tools to manage the entire lifecycle of the data centre. That means monitoring both your existing estate and your software.
Do the automation
HP's new iLO Management Engine takes advantage of capabilities embedded in the ProLiant Gen8 server to support ongoing diagnostics, using the HP Insight Online dashboard. It won't be limited to Gen8 servers: legacy servers back to generation 4 can be monitored by the same tool.
HP says this is important not only because it will stop angry Reg readers from calling in to demand support, but because automation of basic maintenance will drastically cut downtime. The system can also complete some monitoring of HP storage and networking solutions.
The IDC white paper Through Hardware Innovation Comes Support Automation points out: “The ability to easily integrate new and more capable systems into an organisation's support infrastructure is key in understanding the larger picture of the business impact when issues do arise… HP is focusing on the automation of converged infrastructure as the single most effective way to help resolve these customer concerns.”
But there is another dimension to monitoring that is fundamental to the cloud environment: integration. What holds this back?
After the last Regcast I wrote about the challenge of creating multi-disciplinary teams that could work effectively – and any holistic monitoring initiative must by definition be multi-disciplinary.
There is also the problem of what to monitor: do we integrate to discover the most important information or do we simply measure what is available?
Two years ago, Freeform Dynamics produced a survey of Reg readers that considered this question.
It showed the two biggest management problems were inefficiencies arising from having to work across multiple interfaces and conventions, and the impossibility of getting a complete view of the data centre. Both categories were rated four or five out of five (five being “major problem”) by about half of respondents.
“Since most enterprises are managing heterogeneous environments with multiple layers of technology providers, coordinating support can pose significant issues,” the IDC report concludes.
“Isolating and diagnosing potential issues can be a back-and-forth nightmare with support providers. This often leads to finger-pointing between providers instead of integrated efforts to solve the problem.”
There are three ways to approach this. The first is integration at the tool level: HP, for example, offers information from Insight Control through vCenter. If there is a problem with a physical server, VMware can move any virtual server that may be affected to a different location.
The second is to allow a third party to monitor remotely: the service provider sees what you see at the same time as you.
The third is to capture and share more information: the more data we collect in this way, the better we become at recognising potential problems automatically.
HP’s ProLiant Gen8 servers allow you to monitor 1,600 parameters. That is a big dashboard – or potentially the beginnings of an expert system that can learn to catch problems early without human intervention.
In all three cases, automation can provide the payback. You should be able to deliver more without the wage bill going up – and without even having to be aware of the actions being taken to preserve the service.
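To make the first approach concrete, here is a minimal Python sketch of what tool-level integration might look like in principle: a hardware health flag drives a plan to move virtual machines off a failing host. It is an illustration under stated assumptions, not how Insight Control or vCenter actually work. The `Host` class, the `plan_evacuation` function, the host and VM names and the "fewest VMs wins" placement rule are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical model of a physical server as a monitoring tool might report it.
    name: str
    healthy: bool = True
    vms: list[str] = field(default_factory=list)


def plan_evacuation(hosts: list[Host]) -> dict[str, str]:
    """Map each VM on an unhealthy host to the healthy host with the fewest VMs."""
    targets = [h for h in hosts if h.healthy]
    if not targets:
        return {}  # nowhere safe to move anything, so leave it to a human
    load = {h.name: len(h.vms) for h in targets}
    plan: dict[str, str] = {}
    for host in hosts:
        if host.healthy:
            continue
        for vm in host.vms:
            dest = min(load, key=load.get)  # least-loaded healthy host
            plan[vm] = dest
            load[dest] += 1  # count the planned move so later VMs spread out
    return plan


# Assumed example estate: one degraded host, two healthy ones.
hosts = [
    Host("rack1-a", healthy=False, vms=["web01", "db01"]),
    Host("rack1-b", vms=["web02"]),
    Host("rack2-a"),
]
print(plan_evacuation(hosts))
```

The sketch only produces a plan. In a real estate the move itself would be handed to the virtualisation layer, and the health flag would come from the hardware monitoring tool rather than being set by hand.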
The demands on the IT function in a cloud environment make it imperative that this problem is solved, even if, as some Reg readers believe, it will take 10 years or more.
If internal cloud services are to succeed, IT support staff cannot continue to be the glue that holds the data centre together. ®