Google Research: Three things that MUST BE DONE to save the data center of the future
Think data-center design is tough now? Just you wait
ISSCC One prominent member of Google Research is more concerned with the challenges of speedily answering queries from vast stores of data than he is about finding business intelligence hidden inside the complexities of that omnipresent buzzword, "big data".
"I think we're all trying to figure out what that is," Luiz André Barroso said of big data. "And we have this vague idea that it has to do with analytics for very, very large volumes of data that are made possible by the existence of these very, very big data centers – which I'm calling today 'landhelds' to make fun of things like 'handhelds'."
Barroso said that from Google's point of view, big data isn't something to which you pose complex analytical queries and get your response "in the order of minutes or hours." Instead, it's something that feeds what he refers to as "online data-intensive workloads."
Boiled down to their essence, such workloads should be thought of as "big data, little time," he said. In other words, the ability to squeeze a response out of a shedload of data in 100 milliseconds or less.
That's not an easy task, though Google is managing to keep ahead of the "ridiculous amount of data" coursing through its data centers. However, the problem is about to get much worse.
"If you think that the amount of computing and data necessary for Universal Search at Google is an awful lot – and it is – and that responding to user queries in a few milliseconds is very hard, imagine what happens when we move to a world when the majority of our users are using things like Google Glass," Barroso said.
He quickly pointed out that he wasn't talking only about Google's head-mounted device, but of voice queries in general, as well as those based upon what cameras see or sensors detect. All will require much more complex parsing than is now needed by mere typed commands and queries, and as more and more users join the online world, the problems of scaling up such services will grow by leaps and bounds.
"I find this a very, very challenging problem," he said, "and the combination of scale and tight deadline is particularly compelling." To make such services work, response times need to remain at the millisecond level that Google's Universal Search can now achieve – and accomplishing those low latencies with rich-data inputs and vastly increased scale ain't gonna be no walk in the park.
Oh, and then there is, of course, the matter of doing all that within a power range that doesn't require every mega–data center to have its own individual nuclear power plant.
Barroso identified three major challenges that need to be overcome before this "big data, little time" problem can be solved: what he referred to as energy proportionality, tail-tolerance, and microsecond computing.
Energy in the right proportions
By energy proportionality, Barroso means the effort to better match the energy use of entire systems to the workloads running on them. In a paper published in 2007, Barroso and his coauthor, Google SVP for technical infrastructure Urs Hölzle, said that effort could potentially double data-center servers' efficiency in real-world use, particularly through improvements in the efficiency of memory and storage.
Unfortunately, although a lot of progress has been made in the area of energy proportionality, it has often come at the expense of latency. "Many of the techniques that give us some degree of energy proportionality today," Barroso said, "cause delays that actually can be quite serious hiccups in the performance of our large, large data centers."
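To put rough numbers on the idea of energy proportionality, here is a minimal Python sketch. It is not taken from Barroso's talk or from the 2007 paper: the linear power model, the 500-watt peak, the idle draw of half of peak, and the function names are all illustrative assumptions.

```python
# Illustrative model only: every number and name here is an assumption,
# not a figure from the talk or the paper.

PEAK_WATTS = 500.0  # assumed power draw of one server at full load


def power_draw(utilization, idle_fraction, peak_watts=PEAK_WATTS):
    """Linear power model: a fixed idle draw plus a part that scales with load.

    utilization   -- fraction of peak work being done, from 0.0 to 1.0
    idle_fraction -- idle power as a fraction of peak power
                     (0.0 would be a perfectly energy-proportional machine)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    idle_watts = idle_fraction * peak_watts
    return idle_watts + (peak_watts - idle_watts) * utilization


def relative_efficiency(utilization, idle_fraction):
    """Work done per watt, normalized so 1.0 equals the efficiency at full load."""
    if utilization == 0.0:
        return 0.0  # no work is done, so efficiency is zero whatever the draw
    watts = power_draw(utilization, idle_fraction)
    return (utilization * PEAK_WATTS) / watts


if __name__ == "__main__":
    for load in (0.1, 0.3, 0.5, 1.0):
        typical = relative_efficiency(load, idle_fraction=0.5)  # assumed server
        ideal = relative_efficiency(load, idle_fraction=0.0)    # proportional
        print(f"load {load:.0%}: assumed server {typical:.2f}, proportional {ideal:.2f}")
```

Under these assumed numbers, a server idling at half its peak power does less than half as much work per watt at 30 per cent load as it does at full load, while a perfectly proportional machine loses nothing. The model says nothing about latency, which is the cost Barroso warns about.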
So that's challenge number one. The number two item that needs to be handled when attempting to assemble highly responsive mega–data centers is the somewhat more arcane concept of tail-tolerance.