Doug Cutting: Hadoop dodged a Microsoft-Oracle stomping

Elephant daddy on breaking into mainstream IT

Interview We’ve all heard plenty about open source changing the dynamics of the tech industry and upsetting the old order. Open source, we’re told, is manifest destiny. Companies that ignore it will be consigned to history and CIOs who assert there’s no freebie code behind their firewalls are out of touch with devs happily humming to Tomcat, Apache, Linux and PHP. At least that's how the story goes.

One open-source success of recent years has been Hadoop, the Apache-licensed implementation of Google’s MapReduce, which quickly and efficiently processes petabytes of data using clusters of ordinary x86 servers.

MapReduce works by splitting up massive data-processing jobs into chunks to be processed locally and parallelising the computation (the map phase), and then re-combining the results (the reduce phase) at the end. It means you don’t need big, centralised servers like mainframes or SPARC servers; it's a gift to x86 computing.
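The split, map and reduce steps described above can be sketched in a few lines of Python. This is a toy word count and not Hadoop's API: the function names (`map_chunk`, `shuffle`, `reduce_counts`, `word_count`) are made up for illustration, and a local thread pool stands in for the cluster of x86 servers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map phase: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group the emitted values by key so each word can be reduced once."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: combine the values collected for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run the map step over all chunks in parallel, then reduce.

    Assumption: threads play the role of worker nodes; a real cluster
    would ship each chunk to the machine that already holds the data.
    """
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    chunks = ["big data big clusters", "data on cheap servers"]
    print(word_count(chunks))
```

Each chunk is mapped independently of the others, which is why the work can be spread across many cheap machines rather than one large one.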

In the eight years since Hadoop was first written by Doug Cutting and Mike Cafarella, it has found a home at Amazon.com, Facebook and Yahoo! – some of the biggest sites on the web – among others. In the last eight months alone, Hadoop has won the backing of Microsoft, IBM and Oracle – three of the biggest names in relational databases. The software giants are now supporting a piece of software that the NoSQL zealots believed would actually kill RDBMS.

Microsoft is working with Hadoop start-up Hortonworks on connectors linking Hadoop to its databases, Windows and the Azure cloud, while Oracle is marrying Hadoop with its open-source MySQL database and merging the result with some Sun Microsystems server hardware to produce yet another Oracle appliance.

Yet, with history apparently on his side, you'd be surprised to learn Cutting feared Oracle and Microsoft might try to stand up to Hadoop, with disastrous consequences for the ecosystem. He said he is "gratified" that the pair decided to come on board.

“I’m pleased they decided not to fight it [Hadoop] with some proprietary solution, but to join forces with the open source one,” Cutting told The Reg. “It means those would otherwise be two potential sources of serious competition and to grow the community with two companies as big and powerful as Oracle and Microsoft is tremendous.

“I’m really gratified they have elected not to [develop proprietary solutions]. It’s a good thing for Hadoop for sure... I no longer see a formidable competitor, which is a little frightening," Cutting said.

“[It’s] frightening and exciting at the same time because it’s something you have to worry about, to win them over and convince them that this is a better approach. It’s gratifying when you haven’t got to do that.”

Microsoft and Oracle could have forked Hadoop’s code, building versions of Hadoop tailored to their systems, thereby splitting the community into those who support Hadoop for the large communities of Oracle and SQL Server users, and everybody else.

Not possible, you say? Oracle has played politics before to get its way – with disastrous results for open-source projects. Oracle pulled the open-source Solaris project, OpenSolaris, back in-house in 2010 – allowing the fledgling open-source effort that had been blessed and spun up by Sun to die. Oracle’s control of OpenOffice produced the LibreOffice fork in 2011, while Oracle's reluctance to let go of the Hudson build management system saw almost the entire community leave to create the rival Jenkins.

The legacy of such actions: forked codebases and rival claims over which is the one "true" project. Oracle has the brands, but the community has the code.

Then there’s Microsoft. Redmond is a strategic friend to open source, supporting projects where they help sell more copies of Windows or at least prevent lost sales. So far it has worked on Linux, MySQL and PHP, and cuddled up to Eclipse on Silverlight.

On big data, Microsoft had been building a Hadoop-esque architecture since 2006. Called Dryad, it would “efficiently” process huge data loads running on Windows HPC Server 2008 R2 and HPC Pack 2008 R2-based clusters with Service Pack 2. In November last year, however, Microsoft quietly announced that it no longer planned to pursue Dryad as a commercial product just as it announced Hadoop connectors to SQL Server and Windows Azure.

Microsoft and Oracle have muscle in RDBMS. Oracle sells half the planet’s relational databases in a market worth $29bn with Microsoft in third place. The more open-source friendly IBM – which announced Hadoop connectors to its DB2 database around the same time as Oracle – is second.

It's not just Cutting, the father of Hadoop, who felt concern. Hortonworks was also worried by what Microsoft and Oracle might do. The start-up spun out of Yahoo! last year with venture backing from Red Hat and JBoss investor Rob Bearden, and it competes with Cutting’s Cloudera.

A thousand tiny elephants

Eric Baldeschwieler, Hortonworks’ chief technology officer, breathed what could be called a sigh of relief when Oracle last year announced its plans for a big data appliance using Hadoop. “It’s hugely validating of Hadoop, having all the major vendors coming in,” Baldeschwieler told The Reg at the time. “What we don’t want to see is thousands of flavours of Hadoop.”

Why did the giants suddenly turn friendly towards a technology that Cutting reckons will tread on the toes of their beloved RDBMS in about five years – when, as Cutting believes, it becomes an incumbent of mainstream enterprise IT?

“We are moving into a world where there’s lots of data,” Cutting says. “It [Hadoop] is not going to take over all software – there will be other technologies – but it’s going to become one of the mainstream staples in the next five to 10 years - maybe even sooner than that. It seems to be progressing pretty quickly.”
