Managing Director, Freeform Dynamics
It’s certainly true from our research that when it comes to servers, power consumption doesn’t top the list for the majority of organisations. It either matters little or, for a smaller number of companies, it matters a great deal. No doubt this latter group are the ones that have hit the threshold of how much power they can get into their data centres, or indeed racks. As one CIO told us, "those vendors can keep banging on about their latest blade technologies, but it’s all no good if we can’t even get enough power into the rack to drive them."
Ignorance may be bliss, but for many companies it can be difficult to get a handle on power consumption. IT managers have little visibility on such things – even if power is being measured at a useful level of granularity (e.g. to the server room, or even better to the rack or piece of equipment), such information may be difficult to extricate from facilities management.
Often it is only when something has gone wrong that anyone takes notice. I can remember taking delivery of a number of new servers, only to find that the circuits kept tripping every time they tried to boot up – which was quite often given that (indeed) the circuits kept tripping. Eventually, FM took it upon themselves to turn up with a monitoring device such that the faults could be diagnosed, and the right amount of electricity served up. I’m sure this wasn’t an isolated case.
Newer generations of servers do have far better power monitoring capabilities. Meanwhile, software solutions such as NightWatchman from 1E can enable both monitoring and management of server power usage - and of course external devices (some operating out-of-band) can monitor wattage.
In terms of where to start, this is absolutely a case of ‘can’t measure, can’t manage’ – so visibility is everything. In the first instance, find out what information you can gather about how much power is being used, and how close that usage is to any known limits, such as the capacity of a circuit, rack or room. This will help you understand what risks (if any) you face.
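As a rough illustration of that first step, the sketch below compares measured draw against a known limit for each rack and flags anything getting close. It is only a sketch under stated assumptions: the `RackReading` type, the `headroom_report` function and the 80% warning level are all hypothetical, and the wattage figures would come from whatever source you actually have (a UPS, a metered PDU or the servers themselves).

```python
from dataclasses import dataclass


@dataclass
class RackReading:
    name: str
    measured_watts: float  # hypothetical reading, e.g. from a UPS or metered PDU
    limit_watts: float     # hypothetical supply limit for this rack


def headroom_report(readings, warn_fraction=0.8):
    """Return (name, utilisation, status) per rack, highest utilisation first.

    warn_fraction is an assumed warning level, not an industry standard.
    """
    report = []
    for r in readings:
        utilisation = r.measured_watts / r.limit_watts
        if utilisation >= 1.0:
            status = "over limit"
        elif utilisation >= warn_fraction:
            status = "at risk"
        else:
            status = "ok"
        report.append((r.name, utilisation, status))
    # Sort so the racks closest to (or past) their limit appear first.
    return sorted(report, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Illustrative figures only, not measurements from any real site.
    racks = [
        RackReading("rack-a", 3400, 4000),
        RackReading("rack-b", 1000, 4000),
        RackReading("rack-c", 4400, 4000),
    ]
    for name, utilisation, status in headroom_report(racks):
        print(f"{name}: {utilisation:.0%} of limit ({status})")
```

Even a simple report like this makes it obvious which racks have room for new equipment and which do not.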
A second step is to consider opportunities for cost savings from reduced power consumption. This provides a double benefit: the savings can offset the cost of any additional hardware and software required to run a more efficient shop, and reduced consumption lowers the risk of hitting supply limits.
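The arithmetic behind that offset is simple enough to sketch. The function names and the figures below are assumptions for illustration only, and the calculation deliberately ignores knock-on effects such as reduced cooling load, so treat the result as a conservative first estimate.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap years; fine for a rough estimate


def annual_saving(watts_saved, price_per_kwh):
    """Yearly cost saving for a continuous reduction of watts_saved."""
    kwh_per_year = watts_saved / 1000 * HOURS_PER_YEAR
    return kwh_per_year * price_per_kwh


def payback_years(upfront_cost, watts_saved, price_per_kwh):
    """Years for the power saving alone to repay an upfront investment."""
    saving = annual_saving(watts_saved, price_per_kwh)
    if saving <= 0:
        raise ValueError("no saving, so the investment is never repaid")
    return upfront_cost / saving


if __name__ == "__main__":
    # Hypothetical inputs: 1 kW saved continuously at 0.10 per kWh,
    # against 2,000 spent on monitoring hardware and software.
    print(f"Annual saving: {annual_saving(1000, 0.10):.2f}")
    print(f"Payback: {payback_years(2000, 1000, 0.10):.1f} years")
```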
Finally, over time, power can become just one more thing to monitor and respond to. This may require process and organisational change, so the organisation needs to decide exactly where responsibility for power management should lie, and act accordingly. ®
Monitoring and managing power consumption
All good, common sense
Having gone through much of the process described, I can say that the comments are all good. OK, so we're definitely in the 'small' category with three racks and about 10kW of load, but the process is the same.
A year ago, we had an unmanaged and unmanageable setup - multiple small UPSs (all at their limit), no proper power system, no cable management, and what could best be described as a mess. We now have an expanded server room, with 3 new 47U racks, a single large modular UPS, and proper power distribution and network cable management. We get load data from the UPS, so we have graphs of that, plus graphs of air temperatures (we use just airflow for cooling).
One thing not mentioned: servers do indeed increase their power consumption according to environmental conditions. Overall, our power consumption varies by around 5% with inlet air temperature once it goes above about 20˚C. Since not all the servers actually have any power management (e.g. variable fan speeds), I suspect that some of the servers vary their loads by more than 5%.
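For anyone with similar UPS load and temperature logs, a minimal sketch of how that relationship could be estimated is a least-squares fit of load against inlet temperature for readings above the threshold. The function name, the sample format and the figures are hypothetical; the 20˚C default simply mirrors the observation above.

```python
def load_vs_temperature_slope(samples, threshold_c=20.0):
    """Least-squares slope in watts per degree C.

    samples is an iterable of (inlet_temp_c, load_watts) pairs; only
    readings above threshold_c are used in the fit.
    """
    points = [(t, w) for t, w in samples if t > threshold_c]
    if len(points) < 2:
        raise ValueError("need at least two samples above the threshold")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_w = sum(w for _, w in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("temperatures are all identical")
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in points)
    return sxy / sxx


if __name__ == "__main__":
    # Illustrative readings only, not real measurements.
    readings = [(18, 9000), (21, 10000), (23, 10200), (25, 10400)]
    slope = load_vs_temperature_slope(readings)
    print(f"Load rises by roughly {slope:.0f} W per degree C above 20 C")
```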
The big challenge now is persuading the boss that we do need to upgrade the ventilation/cooling system - we only just managed last June, and at times we had servers raising over-temp alarms. I suspect that now it's cooled down, the boss will decide we don't need to do anything, and it will be too late when it turns hot again.