Feeds

Google Go boldly goes where no code has gone before

How to build all the Google stuff Google won't talk about

Build a business case: developing custom apps

Every Google data center has a Chubby, and Heroku wanted one too.

Like the rest of Google's much admired back-end infrastructure, Chubby is decidedly closed source, so Heroku men Keith Rarick and Blake Mizerany built their own. Although they didn't have source code, they did have one of those secret-spilling Google research papers. And they had Go, Google's open source programming language for building systems like Google's.

Running its myriad online services atop a famously distributed back-end, splitting tasks into tiny pieces and spreading them across a vast network of machines, Google needs a way of controlling access to those machines – and that's Chubby. According to a 2006 Google research paper (PDF), there's a least one Chubby instance in each of the company's data centers, coordinating server access for its GFS distributed file system, its BigTable database, and its epic number-crunching platform, MapReduce.

Heroku runs a similarly distributed infrastructure behind its eponymous web service – a means of building, hosting, and readily scaling Ruby on Rails applications – and this too requires the sort of server-juggling you get from Chubby. "When you have a lot of distributed processes – lots of things going on, a network spanning a lot of machines – coordinating things can be tricky," Heroku system architect Keith Rarick tells The Register.

"If a machine goes down or you're having network problems where some machines can't talk to each other or the network is slow or you're getting packet loss, there are so many ways things that can fail. It can be really, really hard to get all those things right if you're doing ad hoc coordination."

Google Chubby

Only known image of Google Chubby

Using an algorithm called Paxos, Rarick says, Chubby almost magically simplifies this process. But first, you have to build the thing. And for that, Rarick turned to Go. Conceived by a trio of Google heavyweights – Unix founding father Ken Thompson, fellow Bell Labs Unix developer Rob Pike, and Robert Griesemer, who worked on the Java HotSpot compiler – Go is specifically designed for distributed systems.

"We realized that the kind of software we build at Google is not always served well by the languages we had available," Rob Pike tells us. "Robert Griesemer, Ken Thompson, and myself decided to make a language that would be very good for writing the kinds of programs we write at Google."

Released as an experimental language in late 2009 and now in production at Google, Go includes built-in mechanisms for running concurrent tasks. Using lightweight processes known as "goroutines", you can readily juggle multiple tasks within the same operating system thread or spread them across disparate threads, and all this is automatically scheduled by the Go runtime. The inherently concurrent setup was ideal for Heroku's Chubby mimic, an open source project known as Doozer.

"Go is very good at letting you have multiple points of control in a single program, and coordinating the synchronization and communication between those different points of control," Rarick explains. "That's something that's really useful in a project like [Doozer]. We have a bunch of different processes on different machines trying to talk to each other, and you have to make sure they all have a consistent view of the world."

Go not only provides the concurrency Doozer requires, it keeps memory usage to a minimum. It offers garbage collection, but it also frees you from thread overload. "We don't have to worry about how many threads we're running. We aren't constantly saying 'Are we running out of resources?'" Rarick explains. "We can have lots of goroutines instead, and that's built into the language."

Other languages provide somewhat similar mechanisms for concurrency – such as Erlang and Scala – but Go is designed to provide maximum efficiency and control, as well. "It comes from a line of systems programming language like C and C++, so it gives you the ability to really control the performance characteristics," Rarick says. "When it comes time to measure things and make sure they run fast, you have the flexibility to really get in there and do what you need. And when you figure out why your program is being slow, you really have the control you need to fix it."

Keith Rarick

Keith Rarick

It's a unique combination. "C gives you control, but it's not good for concurrency. It doesn't even give you garbage collection," he says. "Go gives you concurrency and garbage collection, but it still gives you control over memory layout and resource use."

Google won't say how Go is used inside its mystery data centers. But Doozer – officially released earlier this month and set to go live at Heroku – is a suitable poster child for the fledgling language. It's no coincidence that Robert Griesemer, one of the original Go architects, also worked on Chubby.

Gartner critical capabilities for enterprise endpoint backup

More from The Register

next story
Why has the web gone to hell? Market chaos and HUMAN NATURE
Tim Berners-Lee isn't happy, but we should be
Microsoft boots 1,500 dodgy apps from the Windows Store
DEVELOPERS! DEVELOPERS! DEVELOPERS! Naughty, misleading developers!
'Stop dissing Google or quit': OK, I quit, says Code Club co-founder
And now a message from our sponsors: 'STFU or else'
Apple promises to lift Curse of the Drained iPhone 5 Battery
Have you tried turning it off and...? Never mind, here's a replacement
Linux turns 23 and Linus Torvalds celebrates as only he can
No, not with swearing, but by controlling the release cycle
Scratched PC-dispatch patch patched, hatched in batch rematch
Windows security update fixed after triggering blue screens (and screams) of death
This is how I set about making a fortune with my own startup
Would you leave your well-paid job to chase your dream?
prev story

Whitepapers

Top 10 endpoint backup mistakes
Avoid the ten endpoint backup mistakes to ensure that your critical corporate data is protected and end user productivity is improved.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Backing up distributed data
Eliminating the redundant use of bandwidth and storage capacity and application consolidation in the modern data center.
The essential guide to IT transformation
ServiceNow discusses three IT transformations that can help CIOs automate IT services to transform IT and the enterprise
Next gen security for virtualised datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.