Google's Urs Hölzle: If you're not breaking your own gear, you aren't ambitious enough
Infrastructure king on next-gen memory, FPGAs, and more
Interview In the past fifteen years Google has gone from being a consumer of tech to an inventor of technologies, and in doing so has had profound effects on the modern web.
One of the key people behind that shift has been Urs Hölzle, who joined the company as its eighth employee and now serves as a senior vice president of technical infrastructure and one of its "Google Fellows".
One of Hölzle's main jobs is to plan out the technologies Google needs to use, how it needs to use them, and what paths it absolutely shouldn't go down.
At the GigaOm Structure conference in San Francisco, he sat down with The Register for an interview about what Google thinks of next-gen memory technologies, whether lashing FPGAs to CPUs is a good idea, how distributed systems need to be run and managed, and which aspects of Google's own for-sale cloud services can benefit from the company's internal infrastructure.
What follows is a transcript of that conversation that has been edited for clarity and brevity.
How far out does Google need to look when it comes to the types of hardware components you contemplate buying in a few years? Specifically, what do you think about next-generation memory technologies?
The things you focus most of your time on are nine months out: this is the next generation that you're developing right now. Then there's what we call n-plus-one, where you work concurrently on the thing after that, so you already have concrete prototypes or whatever, but it's not ready, the silicon isn't really available, so you try things out.
PCM [phase-change memory] or memristors or whatever is what you have testbeds or simulations for, but you have no comprehension of timeline because you don't know when they're going to be available.
There's a number of these in the air - silicon photonics, different kinds of storage - and I think the way we look at it is you have to be prepared, you have to play with these things to understand what they look like.
You can't really anticipate. Memristors, three years ago, were being announced as nine months in the future, and now they're due 2017 or maybe end of decade, so, you know, TBD.
The other thing is that, often, to really take advantage of a new technology you need to have it at least partially available, because you need to go and say 'how would I rewrite search' in order to use this. If you just have a simulation, it's a billion times slower than the real thing, so there's only so much you can do in figuring out 'what would you do if'. The truth is normally something like 18 months or 24 months is enough to get it done by the time the thing is actually production ready.
There's a tension between centralizing the systems providing features, and distributing them across your infrastructure so you can be flexible at the cost of speed. How does Google decide where it needs to be on that spectrum?
The key thing is that you can't be religious about it. Things change, and I think in the next five years there's at least the chance that technology will change much more meaningfully than it has in the past five years.
Exactly how that works out, that really depends on the specific factors. If something improves [in performance] by a factor of two or factor of eight, that really changes how you react to it.
The important thing is you don't get too set on one approach. Disaggregation is a great thing but it's not the only thing to pay attention to: there may very well be times when disaggregation is much less important than something else. For example, maybe you want to package things more closely at some point.
You and some others at Google came up with the idea that 'the data center is the computer'. What are some of the implications of treating DCs in that way, and have you run into any problems that you didn't anticipate?
You always run into problems. We've really, for the last ten years, easily been at the forefront of trying to solve these problems and you get things wrong all the time. That comes with the territory. If it doesn't, you probably didn't try something ambitious enough. One of the big advantages of software is that it's much more malleable. You can go full hog in the wrong direction, and once you realize it, changing direction isn't that expensive and it doesn't take that much time.
With hardware, you have sunk cost, you have this thing, and you've built it and you've spent the money on it and you can't really refurbish it or change it very much. So the more flexibility you put in, and the more control is outside the box, the easier it is to react to new demands.
We've been big supporters of [networking protocol] OpenFlow, for example. On a traditional networking box you have millions of lines of software in it. On OpenFlow you have thousands of lines of software in it - really just enough to control the box and the fans and program the chips, but really all the intelligence is elsewhere, and that allows you to change them.
Like, you have a new routing [scheme] and it works for multiple boxes because the box never knew what routing was so therefore you don't have to update it. [The boxes are] really focused on the hardware design; someone else tells them how to program their hardware tables, et cetera. The boxes don't really know that they're implementing VPN or some routing. That's number one.
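The split Hölzle describes, with boxes that only hold forwarding tables and routing logic that lives elsewhere, can be illustrated with a toy model. This is a minimal sketch and not the OpenFlow protocol itself: the `Switch` class, `shortest_path_tables` and `program` are hypothetical names invented for the illustration, and the controller logic is a plain breadth-first search.

```python
from collections import deque


class Switch:
    """A 'dumb' box: it only holds a table that someone else fills in."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # destination name -> next-hop name

    def forward(self, dst):
        # The box looks up the table; it has no idea how the table was computed.
        return self.table.get(dst)


def shortest_path_tables(links):
    """Controller-side logic (hypothetical): BFS from each destination."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    tables = {node: {} for node in neighbors}
    for dst in neighbors:
        seen = {dst}
        queue = deque([dst])
        while queue:
            node = queue.popleft()
            for nxt in sorted(neighbors[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    tables[nxt][dst] = node  # from nxt, reach dst via node
                    queue.append(nxt)
    return tables


def program(switches, tables):
    """Push the computed tables into the boxes."""
    for sw in switches.values():
        sw.table = dict(tables.get(sw.name, {}))


switches = {name: Switch(name) for name in "abcd"}
program(switches, shortest_path_tables([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]))
```

Changing the routing policy means recomputing the tables with different controller logic or a different link set and calling `program` again; the `Switch` code is untouched, which is the point being made about not having to update the box.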
Number two is the larger your pool is, or the more you think about things as pooled resources, the easier it is to be flexible about how that's being used.
When we think about memory, what's the right ratio of memory to CPUs, it's much easier if you can think about the pool. Like, here's a cluster, do I have enough memory in the cluster as a pool and if not, I don't need to upgrade every single machine – I need to add enough memory to the pool and then the cluster management system can figure it out and put the high memory jobs on the high memory machines.
It's much easier to evolve things that way than to say 'wow, actually I thought I need 16 gigs [of RAM] and now I realize I need 19 gigs and I have to go into every machine and put in 3 gigs ... oh wait I can't put in 3 gigs, the minimum increment is 8, and then I'm going to throw out the existing DIMM slot I have because there's only so many slots.'
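The arithmetic behind that example can be sketched as follows. The 16 GB, 19 GB and 8 GB figures come from the quote; the fleet size of 1,000 machines and both function names are assumptions made for illustration, and the pooled figure further assumes that 19 GB is an average requirement and that jobs can be packed onto whichever machines have room.

```python
import math


def per_machine_upgrade(machines, have_gb, need_gb, increment_gb):
    """Total GB bought if every machine must individually reach need_gb."""
    shortfall = max(0, need_gb - have_gb)
    added_per_machine = math.ceil(shortfall / increment_gb) * increment_gb
    return machines * added_per_machine


def pooled_upgrade(machines, have_gb, need_gb, increment_gb):
    """Total GB bought if only the cluster-wide total must be covered."""
    shortfall = machines * max(0, need_gb - have_gb)
    return math.ceil(shortfall / increment_gb) * increment_gb


# Hypothetical fleet of 1,000 machines, each with 16 GB, needing 19 GB on average.
print(per_machine_upgrade(1000, 16, 19, 8))  # 8000
print(pooled_upgrade(1000, 16, 19, 8))       # 3000
```

Under these assumptions the per-box approach buys 8 GB to cover a 3 GB gap on every machine, while the pooled approach only has to cover the aggregate gap. The sketch ignores DIMM slot limits, which the quote notes make the per-box case even worse.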
If you think just about the box, that gets very awkward over a three-year timeframe in a field like ours where the requirements change all the time, and applications change all the time. By managing things in software you get more flexibility, and by pooling things you get more flexibility as well.
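The idea of treating memory as a cluster-wide pool and letting the cluster manager put high-memory jobs on high-memory machines can be sketched with a simple best-fit placement. This is a toy illustration, not Google's cluster management system: the `Machine` class, the `place` function, and the job names and sizes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    mem_gb: int
    jobs: list = field(default_factory=list)  # (job name, GB needed) pairs

    @property
    def free_gb(self):
        return self.mem_gb - sum(need for _, need in self.jobs)


def place(jobs, machines):
    """Best-fit decreasing: largest jobs first, each onto the machine
    with the least free memory that still fits. Returns unplaced job names."""
    unplaced = []
    for name, need in sorted(jobs.items(), key=lambda kv: kv[1], reverse=True):
        candidates = [m for m in machines if m.free_gb >= need]
        if not candidates:
            unplaced.append(name)
            continue
        target = min(candidates, key=lambda m: m.free_gb)
        target.jobs.append((name, need))
    return unplaced


jobs = {"index": 24, "cache": 12, "batch": 10, "web": 8}  # hypothetical GB needs

# Four 16 GB machines: the 24 GB job fits nowhere.
small_only = [Machine(f"m{i}", 16) for i in range(4)]
print(place(jobs, small_only))

# Add one 32 GB machine to the pool instead of upgrading all four.
with_big = [Machine(f"m{i}", 16) for i in range(4)] + [Machine("big", 32)]
print(place(jobs, with_big))
```

In the first run the 24 GB job is returned as unplaced; in the second, adding a single larger machine to the pool is enough for everything to fit, without touching the existing boxes.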