Ex-Google engineer dubs Goofrastructure 'truly obsolete'

MapReduce and BigTable as 'ancient, creaking dinosaurs'

Google downs shot of espresso

Last year, in an interview with the Association for Computing Machinery (ACM), a Google engineer acknowledged that GFS was unsuited to low-latency, real-time applications like YouTube and Gmail, and he said that Google was working to build a new version of the file system.

Googler Matt Cutts later told The Register that this "GFS 2" was part of the company's new search infrastructure codenamed Caffeine.

Several months after that, at the launch of Google's Instant search interface, Eisar Lipkovitz, a senior director of engineering at the company, told us that within the company, GFS 2 is known as "Colossus" and that it moves the company's search indexing system off of MapReduce and onto BigTable.

A few weeks later, Google published a paper on Colossus and a new distributed data processing system known as Percolator. But according to Lipkovitz, these platforms were built specifically for search and may or may not be applied to other Google services.

For years, database guru Mike Stonebraker has criticized MapReduce and GFS, and Lipkovitz told us that Google has made "similar observations". MapReduce, he told us, is not suited to calculations that need to occur in near real time.

Google has also said that the single-master design of GFS is a major limitation. "A single point of failure may not have been a disaster for batch-oriented applications, but it was certainly unacceptable for latency-sensitive applications, such as video serving," said Google's Sean Quinlan in his interview with the ACM. Colossus does not have this limitation.

At the moment, the open source version of Hadoop is burdened with single points of failure. But Facebook is running a version that eliminates these limitations.

In a recent conversation with The Register, Dwight Merriman, the CEO of 10gen, the company that created the open source MongoDB distributed database, argued that MongoDB is superior to BigTable because it uses a document-oriented data model rather than a tabular model.

"Today, 95 per cent of the code we're writing is in an object-oriented language," he said. "We're to the point where object-oriented programming is ubiquitous enough, having a database that works well with that sort of thing is important."

He said that Megastore is an improvement on BigTable, but that it doesn't change the database's fundamental tabular setup, and he added that most of the improvements provided by Megastore are already a part of MongoDB.
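
Merriman's point about object-oriented code can be illustrated with a minimal sketch. It uses only the Python standard library and calls no MongoDB or BigTable API. The `Post` and `Comment` classes, the table names, and the flat row layout are hypothetical, and the rows stand in for a generic tabular layout rather than BigTable's actual schema.

```python
from dataclasses import dataclass, asdict, field


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class Post:
    title: str
    tags: list[str] = field(default_factory=list)
    comments: list[Comment] = field(default_factory=list)


def to_document(post: Post) -> dict:
    """Document-style layout: the nested object becomes one nested record."""
    return asdict(post)


def to_rows(post_id: int, post: Post) -> dict[str, list[tuple]]:
    """Tabular-style layout (hypothetical): the same object is split across
    flat tables that are tied together by post_id."""
    return {
        "posts": [(post_id, post.title)],
        "post_tags": [(post_id, tag) for tag in post.tags],
        "comments": [(post_id, c.author, c.text) for c in post.comments],
    }


post = Post("Hello", ["intro", "news"], [Comment("ann", "First!")])
document = to_document(post)
rows = to_rows(1, post)
```

In the document form, the object's nesting survives as-is. In the flat form, the application has to take the object apart on write and reassemble it on read, which is the friction Merriman describes.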

Google's coding culture

With his blog post, Prasanna was equally critical of Google's coding culture. But, he says, this was a function of the company's size. "The nature of a large company like Google is such that they reward consistent, focused performance in one area. This sounds good on the surface, but if you're a hacker at heart like me, it's really the death knell for your career.

"It means that staking out a territory and defending it is far more important than doing what it takes to get a project to its goal," he said. "Engineers who simply staked out one component in the codebase, and rejected patches so they could maintain complete control over design and implementation details had much greater rewards."

Prasanna says that he voices these opinions without bitterness. And his post does have a rather even-handed tone. In the past month or two, he says, eight of his colleagues who worked on Google Wave have left the company. Which is hardly surprising. A year after unveiling Google Wave, Google killed development on the project.

Lars Rasmussen – who designed the original Google Maps with his brother Jens before running the Google Wave project – has now defected to Facebook. ®
