Google Spanner — instamatic redundancy for 10 million servers?
Mountain View wants your exabyte
Google’s favorite sentence
Asked what he would do if he could “wave a magic wand” to create a back-end net technology that “we don’t have today,” Gill waxed cryptic about Google’s famously distributed online infrastructure, which treats data centers as “warehouse-scale” machines, and touched on the idea of moving loads out of any data center that’s in danger of overheating.
“What we are building here...is warehouse-sized compute platforms,” Gill said. “You have to have integration with everything right from the chillers down all the way to the CPU.
“Sometimes, there’s a temperature excursion, and you might want to do a quick load-shedding — a quick load-shedding to prevent a temperature excursion because, hey, you have a data center with no chillers. You want to move some load off. You want to cut some CPUs and some of the processes in RAM.”
And he indicated the company could do this automatically and near-instantly — meaning without human intervention. “How do you manage the system and optimize it on a global level? That is the interesting part,” Gill continued.
“What we’ve got here [with Google] is massive — like hundreds of thousands of variable linear programming problems that need to run in quasi-real-time. When the temperature starts to excurse in a data center, you don’t have the luxury to sitting around for a half an hour… You have on the order of seconds.”
Asked if this was a technology Google is using today, Gill responded with one of Google’s favorite sentences. “I could not possibly comment on that,” he said. When we later asked uber Googler Matt Cutts about this (with a Google PR man listening on the line), Cutts gave another Googly response: “I don’t believe we have published any papers regarding that,” he said.
But it would seem that Gill was referring to Spanner. And judging from Dean’s presentation, the technology has already been deployed. As reported by Data Center Knowledge, Google has also said that its new data center in Saint-Ghislain, Belgium, operates without chillers. Apparently, Spanner is used to automatically move loads out of the Belgium facility when the outside air gets too hot during the summer.
Additional information is sketchy. Dean refers to Spanner as a “single global namespace,” with names completely independent of the location of the data. The design is similar to BigTable, Google’s distributed database, but rather than organizing data in rows, it uses hierarchical directories.
Dean also mentions “zones of semi-autonomous control,” indicating that Google splits its distributed infrastructure into various subsections that provide redundancy by operating independently of each other.
The goal, Dean says, is to provide access to data in less than 50 milliseconds 99 per cent of the time. And Google aims to store data on at least two disks in the European Union, two in the US, and one in Asia.
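To make those numbers concrete, here is a minimal Python sketch of what checking such targets might look like. Nothing in it comes from Google: the names `MIN_REPLICAS`, `placement_ok`, and `latency_ok` are hypothetical, and the only real inputs are the figures Dean quotes.

```python
from collections import Counter

# Assumption: a policy table built only from the figures Dean quotes.
MIN_REPLICAS = {"EU": 2, "US": 2, "Asia": 1}
LATENCY_TARGET_MS = 50.0   # "less than 50 milliseconds"
TARGET_PERCENT = 99        # "99 per cent of the time"


def placement_ok(replica_regions):
    """True if every region holds at least its minimum number of copies."""
    counts = Counter(replica_regions)
    return all(counts[region] >= need for region, need in MIN_REPLICAS.items())


def latency_ok(samples_ms):
    """True if at least 99 per cent of the samples are under the target."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for s in samples_ms if s < LATENCY_TARGET_MS)
    # Integer arithmetic avoids floating-point surprises at the boundary.
    return fast * 100 >= TARGET_PERCENT * len(samples_ms)
```

For instance, `placement_ok(["EU", "EU", "US", "US", "Asia"])` passes, while dropping one of the EU copies fails. How Google actually expresses or enforces these goals has not been made public.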
But one has to wonder how far this technology has actually progressed. Over the past year, two much-discussed Gmail outages occurred when Google was moving workloads between data centers.
Clearly, Google has a talent for distributed computing. But it also has a talent for leaking just enough information to make you think it must be doing something that no one else could possibly do. ®