Google blesses Hadoop with MapReduce patent license

Safety for stuffed elephants

Three months after securing a patent for MapReduce - the distributed number-crunching platform that underpins its world-spanning infrastructure - Google has granted a license to Apache Hadoop, easing infringement concerns hovering over the MapReduce-mimicking open source project.

Apache legal counsel Lawrence Rosen announced the news with an email to the Apache board late last week, and his note was soon posted to a public mailing list.

Rosen did not immediately respond to a request for comment. Nor did Google.

In mid-January, Google won a US patent for "a system and method for efficient large-scale data processing." The patent describes a means of splitting data-crunching tasks into tiny sub-tasks and mapping them across distributed machines, before reducing the results into one master calculation.
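For readers who want to see the shape of that idea, here is a minimal single-machine sketch in Python. It is not Google's implementation or Hadoop's API: the word-count task and the function names (`split_into_chunks`, `map_chunk`, `shuffle`, `reduce_groups`, `word_count`) are assumptions chosen purely for illustration, and a plain loop stands in for the distributed machines.

```python
from collections import defaultdict


def split_into_chunks(records, n_chunks):
    """Split the input into roughly equal sub-tasks (hypothetical helper)."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in one sub-task."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Group the intermediate values by key across all sub-tasks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_groups(groups):
    """Reduce step: combine each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines, n_chunks=3):
    chunks = split_into_chunks(lines, n_chunks)
    # In a real cluster each chunk would be handed to a different machine;
    # here the map calls simply run one after another.
    mapped = [map_chunk(chunk) for chunk in chunks]
    return reduce_groups(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["the elephant", "The yellow elephant", "yellow"]))
```

The point of the split is that each `map_chunk` call depends only on its own chunk, so the calls could run anywhere in any order, and only the grouped results need to come back together for the reduce step.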

Google uses this MapReduce setup to crunch data across the massive distributed infrastructure buttressing its online services, and though the platform is proprietary, the company published a research paper describing its basic setup in December 2004.

This paper - along with a sister paper describing the company's GFS distributed file system - became the basis for Hadoop. The platform was originally developed by Doug Cutting to back his Nutch open source web crawler, and it was eventually open sourced at Apache. Famously, it's named for a yellow stuffed elephant that belongs to Cutting's son.

When Google won its patent, the general assumption among the Hadoop community was that it posed no threat to the open source project. "Google has lots of patents, and it basically has no track record of using those patents offensively, either involving licensing or pursuing people for infringement," said Mike Olson, chief executive of Cloudera, a company that has commercialized Hadoop in Red Hat-like fashion.

Then he pointed out that Google is a member of the Open Invention Network patent pool, which grants licenses for patented technology in an effort to promote Linux. "All of this convinces us that this is a strategic move from Google and not something that is aimed at the head of any Hadoop adopter or satellite company - Cloudera included."

What's more, Google has long used Hadoop as a way of exposing potential hires to its "Big Data" ways. And even if the company did take legal action, you have to wonder how well its patent would hold up. The map and reduce functions described by Google have been a part of parallel programming for decades.

But now, Mountain View has officially eased fears of legal action. Rosen writes in the email: "Several weeks ago I sought clarification from Google about its recent patent 7,650,331 ["System and method for efficient large-scale data processing"] that may be infringed by implementation of the Apache Hadoop and Apache MapReduce projects. I just received word from Google's general counsel that 'we have granted a license for Hadoop, terms of which are specified in the CLA [contributor licensing agreement].'"

It's unclear what the terms are. But they seem to have passed muster. "I am very pleased to reassure the Apache community about Google's continued generosity and commitment to ASF and open source", Rosen continued.

This is good news not only for the likes of Cloudera, but for many of the industry's biggest names as well. The open source Hadoop now underpins everything from Facebook and Yahoo! to, believe it or not, portions of Microsoft Bing. ®
