Google crafts neural network to watch over its data centers
Unwittingly creates most irritating and humorless backseat sysadmin ever
Google has put its neural network technology to work on the dull but worthy problem of minimizing the power consumption of its gargantuan data centers.
By doing this, the company says it has created a bit of software that lets it predict with 99.6 per cent accuracy how efficiently its data centers consume electricity, allowing it to make subtle tweaks to reduce consumption.
The company gave details on its neural network on Wednesday. The project began as one of Google's vaunted "20 per cent projects" by engineer Jim Gao, who decided to apply machine learning to the problem of predicting how the power usage effectiveness of Google's data centers would change in response to tweaking one of 19 different inputs.
Power usage effectiveness (PUE) is the ratio of the total power a facility draws to the power that makes it into the servers, storage, and networking boxes in the racks, so it reflects how much goes to the stuff supporting the computers. A PUE of 1.1 is very good and means that for every watt slurped down by IT gear, 0.1 watts are chugged by supporting infrastructure such as cooling systems.
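For the arithmetically inclined, the 1.1 figure works out as below. This is a minimal sketch of the standard ratio (total facility power over IT power); the function name and the kilowatt figures are ours, picked purely for illustration.

```python
def pue(it_power_kw: float, overhead_power_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    if it_power_kw <= 0:
        raise ValueError("IT power must be positive")
    return (it_power_kw + overhead_power_kw) / it_power_kw


# Illustrative numbers only: 1,000 kW of IT load plus 100 kW of cooling
# and other overhead gives the "very good" ratio mentioned above.
print(round(pue(1000.0, 100.0), 2))  # 1.1
```

A perfect facility with no overhead at all would score 1.0, so the closer to 1.0 the better.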
For Google, lowering PUE is a crucial way to decrease its voluminous electricity bills. Gao's machine learning approach helped it do this and was effective enough for Google to use in production.
"For example, a couple months ago we had to take some servers offline for a few days – which would normally make that data center less energy efficient," the company explained in a blog post. "But we were able to use Jim’s models to change our cooling setup temporarily – reducing the impact of the change on our PUE for that time period. Small tweaks like this, on an ongoing basis, add up to significant savings in both energy and money."
The neural network behind these data center predictions takes 19 inputs, each coming with around 180,000 data points that had been gathered over the course of two years. The input data included things like the total server IT load in kilowatts, the total number of condenser water pumps running, the mean heat exchange approach temperature, the outdoor wind speed in miles per hour, and so on.*
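To give a flavour of what training such a predictor involves, here is a toy sketch in plain Python. It is emphatically not Gao's model: it uses three made-up inputs rather than 19, a single small hidden layer, synthetic data from an invented formula (`toy_pue`), and hyperparameters we chose ourselves. All names in it are hypothetical.

```python
import math
import random

random.seed(0)

N_INPUTS, N_HIDDEN = 3, 5  # the real model took 19 inputs; 3 keeps this small


def toy_pue(x):
    # Invented "ground truth", used only to generate synthetic training data.
    return 1.1 + 0.05 * x[0] - 0.03 * x[1] + 0.02 * x[0] * x[2]


# Synthetic samples with every input already scaled to the range [-1, 1].
data = []
for _ in range(200):
    x = [random.uniform(-1, 1) for _ in range(N_INPUTS)]
    data.append((x, toy_pue(x)))

# One hidden layer of tanh units feeding a single linear output.
w1 = [[random.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b2 = 0.0


def forward(x):
    """Return the hidden activations and the predicted PUE for one sample."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    y = sum(w * hi for w, hi in zip(w2, h)) + b2
    return h, y


def mse():
    """Mean squared error of the current weights over the synthetic data."""
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)


def train(epochs=200, lr=0.05):
    """Plain stochastic gradient descent with backpropagation."""
    global b2
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            err = y - t
            for j in range(N_HIDDEN):
                # Gradient at hidden unit j, computed before w2[j] is updated.
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(N_INPUTS):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err


before = mse()
train()
after = mse()
print(f"mean squared error: {before:.4f} -> {after:.6f}")
```

The point is the shape of the workflow (collect samples, scale them, fit weights, check the error) rather than any particular number it prints.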
By hooking these inputs up to the neural network, Google can not only predict how power usage effectiveness will change over time but also simulate changing one of the inputs to see what effect it will have on PUE. This gives the company capabilities that are otherwise difficult to achieve, as figuring out how the 19 inputs interact is akin to playing three-dimensional chess in one's head while rollerskating backwards, we reckon.
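The "change one input and see what happens" trick can be sketched in a few lines, assuming you already have some trained predictor to call. Here `toy_model` is a stand-in with invented coefficients, and the input names and baseline values are hypothetical; none of it comes from Google's system.

```python
from typing import Callable, Dict, List, Tuple

Inputs = Dict[str, float]


def sweep_input(model: Callable[[Inputs], float], baseline: Inputs,
                name: str, values: List[float]) -> List[Tuple[float, float]]:
    """Vary one input across `values` while holding the others at baseline."""
    results = []
    for v in values:
        scenario = dict(baseline)  # copy, so the baseline is left untouched
        scenario[name] = v
        results.append((v, model(scenario)))
    return results


def toy_model(inputs: Inputs) -> float:
    # Stand-in for a trained predictor; the coefficients are made up.
    return 1.05 + 0.02 * inputs["pumps_running"] + 0.001 * inputs["wind_mph"]


baseline = {"pumps_running": 2.0, "wind_mph": 10.0}
for pumps, predicted in sweep_input(toy_model, baseline, "pumps_running",
                                    [1.0, 2.0, 3.0]):
    print(f"{pumps:.0f} pumps -> predicted PUE {predicted:.3f}")
```

Swap the stand-in for a real trained model and the same loop answers "what if we ran one more pump?" without anyone touching the actual plant.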
Google's PUE-prediction system is accurate enough for use in production, the company says
"Actual testing on Google DCs [data centers] indicate that machine learning is an effective method of using existing sensor data to model DC energy efficiency, and can yield significant cost savings. Model applications include DC simulation to evaluate new plant configurations, assessing energy efficiency performance, and identifying optimization opportunities," wrote Gao in a paper describing the approach.
Though Google is likely hoping for a rapture-of-the-nerds reception to this news, it's worth pointing out that this is not a new approach, nor is Google the first to apply it to data center monitoring. Neural networks have been around in one form or another since the 1960s, and many of Google's systems rely on a 1980s refinement termed a Boltzmann machine. Moreover, AI pioneer Jeff Hawkins has developed a more sophisticated system that powers a commercial product named Grok, which already does server load monitoring and prediction using a more theoretically rigorous brain-like AI.
However, the amazing thing about this news is the implication that neural networks are now so well understood that a data center admin can gin up a model based on some (massaged) inputs, train it, and create a system that can predict with startling accuracy how a jumble of inputs combine and lead to a single specific output. And that's wonderful. ®
* "Data pre-processing such as file I/O, data filtration, and calculating meta-variables was conducted using Python 2.7 in conjunction with the Scipy 0.12.0 and Numpy 1.7.0 modules. Matlab R2010a was used for model training and post-processing. Open source alternatives offering similar functionality to Matlab R2010a include Octave as well as the Scipy/Numpy modules in Python," Gao explained in a more detailed paper describing the approach.