Monitoring and managing power consumption
Reg Readers tell it like it really is
You the Expert: We set you a challenge to join our expert panel and answer questions from our readers on how to deal with your server challenges.
This week we've got the first in a series of instalments on this topic, and we welcome the first contributions from our resident reader experts, Adam Salisbury and Trevor Pott.
You can read their advice, along with advice from Intel and Freeform Dynamics, below.
The question they tackle this week is:
Not all organisations perceive power consumption as a major challenge – until limits are reached. What can be done to pre-empt this situation, and what level of monitoring is both achievable and useful, given today’s server environments?
Until very recently, power management and energy efficiency received little time, money or attention from the business, and required little. Only now, as we consider our environmental impact more carefully and try to squeeze value from every last penny, has our focus turned to how much power our systems are using.
In some enterprise-class organisations, entire teams are devoted to managing power consumption, efficiency and cooling. But what exists for the small business, struggling to cut costs and improve efficiency when it matters most? A good start is the power calculators and capacity planners now provided by most big server vendors. They are a potent and essential tool for establishing baseline power consumption figures, whatever the size of your infrastructure, and can be used to measure future savings.
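The kind of baseline such a calculator produces is, at heart, simple arithmetic. A minimal sketch is below; the wattage figures and tariff are illustrative assumptions, not measurements, so substitute readings from your own calculator or PDU.

```python
# Baseline annual electricity cost for a set of servers.
# All figures here are illustrative assumptions.

HOURS_PER_YEAR = 24 * 365

def annual_cost(avg_watts: float, pence_per_kwh: float) -> float:
    """Annual electricity cost in pounds for a device drawing avg_watts."""
    kwh_per_year = avg_watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * pence_per_kwh / 100

# Example: a rack of ten servers averaging 350 W each, at 12p per kWh.
baseline = sum(annual_cost(350, 12) for _ in range(10))
print(f"Baseline: £{baseline:,.0f} per year")
```

Recording a figure like this before making any changes is what lets you quantify savings afterwards, rather than guessing at them.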
For a smaller organisation, which has never previously considered addressing power and cooling issues, simple things - like cable management arms to improve exhaust airflow and using blanking panels to increase the flow of air through, rather than around, the servers - will reduce the cost of keeping racks cool. Calculating the heat output of servers and then organising them in the rack accordingly will also improve airflow efficiency.
If you've got an infrastructure big enough that your equipment spans aisles and not just racks, then it's worth using the hot/cold aisle system, now a de facto standard in heavyweight data centres.
Having the aforementioned baseline statistics for power consumption will allow you to justify more accurately the replacement of legacy servers and UPS equipment, taking advantage of the vastly more power-efficient designs and technologies developed in recent years. In some instances, a new server or UPS of equivalent spec can be up to 70 per cent more efficient than its legacy counterpart.
Considering the potentially huge savings to be made from utilising this new green technology, now is the time to consider consolidation and/or virtualisation. A lot of companies are already embracing server virtualisation, but many are still re-deploying old servers as hypervisors rather than buying new equipment, which would reap yet more savings.
Investing in power monitoring systems for your infrastructure will almost certainly result in considerable savings. Solutions exist that can provide a wealth of highly granular information, from individual servers to power distribution units, and from racks to whole aisles of racks. All manner of information about both the quantity and quality of power can be divined from systems like these, and for some organisations, simply seeing this data is enough for obvious inefficiencies to be identified and remediated.
However, if your infrastructure doesn't comprise a sprawling data centre spanning thousands of square feet, monitoring can only help so much. The high-end solutions that exist would be hard to justify for a medium-sized business with an office-sized server room, which would be limited in what it could do with the data.
Whatever route you choose, plan carefully and measure the results; executed correctly, these changes can make a significant difference to your organisation's bottom line.
Manager of the Enterprise Technical Specialist team in EMEA, Intel.
Intel is very serious about power consumption, and we are looking at this issue from several angles. Some of these are higher level and will address the issue over the long term, others address it straight away, and still others come from collaboration with independent software vendors (ISVs).
The foundation of Intel's approach to data centre energy efficiency is our continued pursuit of the tick-tock cadence model: a tick delivers a new process technology, and a tock an entirely new microarchitecture. Continued focus on microprocessor innovation ensures that processor performance will keep improving, providing more processing per watt of power consumed.
Tick-tock also allows us to drive increased energy efficiency in our microarchitecture design, as most recently evidenced by the 50% average lower server idle power in Intel® Xeon® 5500 processor series based systems, compared to the previous generation.
One energy efficiency feature - Intel® Intelligent Power Technology - puts power management in all platform components, including the processor, chipset, and memory. This enables operating systems to put processor power and memory into the lowest available states needed to support current workloads without compromising performance, and allows individual cores to be idled independent of the others.
With these new advantages, customers replacing four-year-old servers are seeing incredible payback periods, as short as nine months, with much of the savings coming from decreased energy bills. Intel's server refresh ROI tool helps determine payback periods for specific environments. Through continued innovation, we are also working with ISVs to increase the scalability of virtualised environments, which will allow higher virtualisation ratios per server and free up additional power budget.
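The payback arithmetic behind a refresh ROI tool can be sketched in a few lines. This is not Intel's actual tool; the server cost, wattages, tariff and consolidation ratio below are all assumptions for illustration.

```python
# Illustrative payback-period calculation for a server refresh.
# All figures (capex, wattages, tariff, consolidation ratio) are assumptions.

HOURS_PER_YEAR = 24 * 365

def monthly_energy_cost(watts: float, pence_per_kwh: float) -> float:
    """Monthly electricity cost in pounds for a device drawing `watts`."""
    return watts * HOURS_PER_YEAR / 12 / 1000 * pence_per_kwh / 100

# Assume nine old 450 W servers consolidated onto one new 350 W server
# costing £3,000, with electricity at 12p per kWh.
old_cost = 9 * monthly_energy_cost(450, 12)
new_cost = monthly_energy_cost(350, 12)
payback_months = 3000 / (old_cost - new_cost)
print(f"Payback in {payback_months:.1f} months")
```

Under these assumed figures the energy savings alone repay the hardware in under a year, which is the shape of result the refresh argument relies on; real payback depends heavily on your consolidation ratio and tariff.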
Intel is also working on server power capping, through technologies like Intel® Intelligent Power Node Manager and Intel® Data Center Manager. IT managers can set very sophisticated policies to control power use within a single system, at the rack level, or even across the entire datacentre. These policies allow higher rack density by regulating the power usage of Intel servers.
Further information and a whitepaper describing a proof of concept study at BMW can be found here.
Infrastructure Support Engineer
Power consumption and cooling are two issues that cannot be disentangled. Eventually you will put too many servers on a single breaker, or notice that the servers in the back closet don't work anymore if you close the door. Once a company has been faced with this realisation, it has begun the long road towards datacentre design.
Every watt consumed by a server is a watt you have to figure out how to cool. For the small business with a single server under a desk, neither cooling nor power consumption are likely to be an immediate issue. When you are small, power and cooling can be as simple as a second breaker and leaving the door open. When you are Google, both factors are such important considerations that they determine the location and specific design of billion dollar data centres. Somewhere in between is the real world of everyday datacentre operations.
Simple tools like a Kill A Watt meter or a power distribution unit (PDU) with a built-in ammeter can tell you what the power draw of your servers is. Test the servers under heavy stress: modern servers are fairly good at backing down their power consumption when idle, so you need to know what they pull when loaded. Once you know how much power the servers consume, you can start doing a little basic maths to see how much power you will need for future expansion. Factor in that you have to cool all of that heat, and that the chillers will cost you power as well.
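That "basic maths" can be sketched as follows. The measured wattages, the 1.5 cooling multiplier (an extra 0.5 W of chiller power per server watt) and the breaker rating are all assumptions for illustration; use your own meter readings and chiller efficiency figures.

```python
# Back-of-the-envelope capacity maths: measured loaded draw plus cooling.
# The cooling_factor of 1.5 is an assumed chiller overhead, not a measurement.

def total_power_needed(server_watts: list[float], cooling_factor: float = 1.5) -> float:
    """Total supply (watts) needed for the servers plus the cooling they require."""
    return sum(server_watts) * cooling_factor

# Five servers measured under load with a PDU ammeter or plug-in meter.
loaded_draw = [420, 380, 510, 290, 450]
needed = total_power_needed(loaded_draw)
print(f"Provision at least {needed:.0f} W")

# Sanity check against a single 16 A breaker at 230 V.
breaker_capacity = 16 * 230
print("Fits on one breaker" if needed <= breaker_capacity else "Needs another circuit")
```

The point of the multiplier is the sentence above it: every watt a server draws is a watt the chillers must remove, at a further cost in power.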
In the long run, you will need to measure a server’s power draw over time. Any decent uninterruptible power supply (UPS) should be able to give you information on how much load is being drawn from it. UPSs are often equipped to send statistics on power usage to a central monitoring server from which you can collect information for later analysis. Like UPSs, the nicer PDUs are networked and are granular enough to allow per-socket monitoring.
Monitoring the chillers' duty cycles can give you an idea of how much headroom you have to add servers to the datacentre. If the chillers are fully engaged for the entirety of your datacentre's peak period, it's probably not a good idea to add servers without adding chillers. Another good idea is to invest in thermometers that can record statistics. You can get them as network-attached devices, and they are fantastic at helping you find hot spots in your datacentre.
Proper power planning goes beyond simply ensuring that you have enough breakers pulled into your datacentre. It also means ensuring that you aren’t overtaxing your UPSs. Monitor your usage, plan for peak consumption, and above all leave yourself headroom for growth.
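The monitor-peak-headroom advice above reduces to a simple check. This is a sketch under assumed figures: the 25% growth margin, the logged peak readings and the 5 kW UPS rating are illustrative, not recommendations.

```python
# Headroom check: does peak load leave a growth margin free on the UPS?
# The 25% margin and the figures below are assumptions for illustration.

def headroom_ok(peak_load_w: float, ups_capacity_w: float,
                growth_margin: float = 0.25) -> bool:
    """True if peak load leaves at least `growth_margin` of capacity unused."""
    return peak_load_w <= ups_capacity_w * (1 - growth_margin)

# Peak readings logged from the UPS over a month, against a 5 kW unit.
peaks = [3100, 3350, 2900, 3600]
print("OK to add servers" if headroom_ok(max(peaks), 5000) else "UPS is overtaxed")
```

Planning against the logged peak rather than the average is what keeps a new server from being the one that trips the breaker.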
Managing Director, Freeform Dynamics
It’s certainly true from our research that when it comes to servers, power consumption doesn’t top the list for the majority of organisations. It either matters little or, for a smaller number of companies, it matters a great deal. No doubt this latter group are the ones that have hit the threshold of how much power they can get into their data centres, or indeed racks. As one CIO told us, "those vendors can keep banging on about their latest blade technologies, but it’s all no good if we can’t even get enough power into the rack to drive them."
Ignorance may be bliss, but for many companies it can be difficult to get a handle on power consumption. IT managers have little visibility on such things – even if power is being measured at a useful level of granularity (e.g. to the server room, or even better to the rack or piece of equipment), such information may be difficult to extricate from facilities management.
Often it is only when something has gone wrong that anyone takes notice. I can remember taking delivery of a number of new servers, only to find that the circuits kept tripping every time they tried to boot up – which was quite often given that (indeed) the circuits kept tripping. Eventually, FM took it upon themselves to turn up with a monitoring device such that the faults could be diagnosed, and the right amount of electricity served up. I’m sure this wasn’t an isolated case.
Newer generations of servers do have far better power monitoring capabilities. Meanwhile, software solutions such as NightWatchman from 1E can enable both monitoring and management of server power usage - and of course external devices (some operating out-of-band) can monitor wattage.
In terms of where to start, this is absolutely a case of 'can't measure, can't manage' - so visibility is everything. In the first instance, look at what information you can gather about how much power is being used, and how close that usage is to any known thresholds. This will help you understand what risks, if any, you face.
A second step is to consider the opportunities for cost savings from reduced power consumption. This provides a double whammy: the savings can offset the costs of any additional hardware and software required to run a more efficient shop, and reduced consumption lowers the risk of hitting arbitrary limits.
Finally and over time, power can just become one more thing to monitor and respond to. This may require process and organisational change, but the organisation needs to decide exactly where responsibility for power management should lie, and act accordingly. ®
Are you an Expert?
If you think you've got what it takes to advise your peers on matters of tech decisions, write us a response to the question above and email it to firstname.lastname@example.org.