Google blesses Hadoop with MapReduce patent license

Safety for stuffed elephants

Three months after securing a patent for MapReduce - the distributed number-crunching platform that underpins its world-spanning infrastructure - Google has granted a license to Apache Hadoop, easing infringement concerns hovering over the MapReduce-mimicking open source project.

Apache legal counsel Lawrence Rosen announced the news with an email to the Apache board late last week, and his note was soon posted to a public mailing list.

Rosen did not immediately respond to a request for comment. Nor did Google.

In mid-January, Google won a US patent for "a system and method for efficient large-scale data processing." The patent describes a means of splitting data-crunching tasks into tiny sub-tasks and mapping them across distributed machines, before reducing the results into one master calculation.
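For readers who have not met the pattern, the map-then-reduce idea can be sketched in a few lines of Python. This is only an illustration of the general technique: it is not Google's implementation or Hadoop's API, it runs in a single process rather than across distributed machines, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are made up for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, the job a framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # In a real cluster each split would be mapped on a different machine;
    # here the splits are simply processed one after another (an assumption
    # made to keep the sketch self-contained).
    intermediate = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two small "splits" of a larger data set.
splits = ["the elephant is yellow", "the elephant is stuffed"]
counts = word_count(splits)
print(counts)
```

The point of the split is that each `map_phase` call depends only on its own input, and each `reduce_phase` call only on one key's values, so both steps can in principle be farmed out to many machines at once.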

Google uses this MapReduce setup to crunch data across the massive distributed infrastructure buttressing its online services, and though the platform is proprietary, the company published a research paper describing its basic setup in December 2004.

This paper - along with a sister paper describing the company's GFS distributed file system - became the basis for Hadoop. The platform was originally developed by Doug Cutting to back his Nutch open source web crawler, and it was eventually open sourced at Apache. Famously, it's named for a yellow stuffed elephant that belongs to Cutting's son.

When Google won its patent, the general assumption among the Hadoop community was that it posed no threat to the open source project. "Google has lots of patents, and it basically has no track record of using those patents offensively, either involving licensing or pursuing people for infringement," said Mike Olson, chief executive of Cloudera, a company that has commercialized Hadoop in Red Hat-like fashion.

Then he pointed out that Google is a member of the Open Invention Network patent pool, which grants licenses for patented technology in an effort to promote Linux. "All of this convinces us that this is a strategic move from Google and not something that is aimed at the head of any Hadoop adopter or satellite company - Cloudera included."

What's more, Google has long used Hadoop as a way of exposing potential hires to its "Big Data" ways. And even if the company did take legal action, you have to wonder how well its patent would hold up. The map and reduce functions described by Google have been a part of parallel programming for decades.

But now, Mountain View has officially eased fears of legal action. Rosen writes in the email: "Several weeks ago I sought clarification from Google about its recent patent 7,650,331 ["System and method for efficient large-scale data processing"] that may be infringed by implementation of the Apache Hadoop and Apache MapReduce projects. I just received word from Google's general counsel that 'we have granted a license for Hadoop, terms of which are specified in the CLA [contributor licensing agreement].'"

It's unclear what the terms are. But they seem to have passed muster. "I am very pleased to reassure the Apache community about Google's continued generosity and commitment to ASF and open source", Rosen continued.

This is good news not only for the likes of Cloudera, but for many of the industry's biggest names as well. The open source Hadoop now underpins everything from Facebook and Yahoo! to, believe it or not, portions of Microsoft Bing. ®
