Doug Cutting: Hadoop dodged a Microsoft-Oracle stomping

Elephant daddy on breaking into mainstream IT

Interview We’ve all heard plenty about open source changing the dynamics of the tech industry and upsetting the old order. Open source, we’re told, is manifest destiny. Companies that ignore it will be consigned to history and CIOs who assert there’s no freebie code behind their firewalls are out of touch with devs happily humming to Tomcat, Apache, Linux and PHP. At least that's how the story goes.

One open-source success of recent years has been Hadoop, the Apache-licensed implementation of Google’s MapReduce, which quickly and efficiently processes petabytes of data using clusters of ordinary x86 servers.

MapReduce works by splitting up massive data-processing jobs into chunks to be processed locally and parallelising the computation (the map phase), and then re-combining the results (the reduce phase) at the end. It means you don’t need big, centralised servers like mainframes or SPARC servers; it's a gift to x86 computing.
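The map and reduce phases described above can be sketched in a few lines of Python. This is a toy word count, not Hadoop's own API: the function names `map_phase` and `reduce_phase` and the sample `chunks` are hypothetical, and a local thread pool stands in for the cluster nodes that would each process a chunk.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_phase(chunk):
    """Count the words in one chunk, independently of every other chunk."""
    return Counter(chunk.split())


def reduce_phase(partials):
    """Re-combine the per-chunk counts into a single total."""
    return reduce(lambda left, right: left + right, partials, Counter())


# Hypothetical input, already split into chunks (one per worker).
chunks = ["big data big clusters", "big servers small data"]

# Map: each chunk is processed separately; threads stand in for cluster nodes.
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(map_phase, chunks))

# Reduce: merge the partial results at the end.
totals = reduce_phase(partials)
print(totals)
```

Because each call to `map_phase` touches only its own chunk, the work can be spread across as many machines as there are chunks, which is why the approach suits clusters of ordinary servers.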

In the eight years since Hadoop was first written by Doug Cutting and Mike Cafarella, it has found a home running Amazon.com, Facebook and Yahoo! – some of the biggest sites on the web – among others. In the last eight months alone, Hadoop has won the backing of Microsoft, IBM and Oracle – three of the biggest names in relational databases. The software giants are now supporting a piece of software that the NoSQL zealots believed would actually kill RDBMS.

Microsoft is working with Hadoop start-up Hortonworks on connectors that link Hadoop to its databases, Windows and the Azure cloud, while Oracle is marrying Hadoop with its open-source MySQL database and merging the result with some Sun Microsystems server hardware to produce yet another Oracle appliance.

Yet, with history apparently on his side, you'd be surprised to learn Cutting feared Oracle and Microsoft might try to stand up to Hadoop, with disastrous consequences for the ecosystem. He said he is "gratified" that the pair decided to come on board.

“I’m pleased they decided not to fight it [Hadoop] with some proprietary solution, but to join forces with the open source one,” Cutting told The Reg. “It means those would otherwise be two potential sources of serious competition and to grow the community with two companies as big and powerful as Oracle and Microsoft is tremendous.

“I’m really gratified they have elected not to [develop proprietary solutions]. It’s a good thing for Hadoop for sure... I no longer see a formidable competitor, which is a little frightening," Cutting said.

“[It’s] frightening and exciting at the same time because it’s something you have to worry about, to win them over and convince them that this is a better approach. It’s gratifying when you haven’t got to do that.”

Microsoft and Oracle could have forked Hadoop’s code and built versions tailored to their own systems, splitting the community into those who support Hadoop for the large bases of Oracle and SQL Server users, and everybody else.

Not possible, you say? Oracle has played politics before to get its way – with disastrous results for open-source projects. Oracle pulled the open-source Solaris project, OpenSolaris, back in-house in 2010 – allowing the fledgling open-source effort that had been blessed and spun up by Sun to die. Oracle’s control of OpenOffice produced the LibreOffice fork in 2011, while Oracle's reluctance to let go of the Hudson build management system saw almost the entire community leave to create the rival Jenkins.

The legacy of such actions: forked codebases and rival claims over which is the one "true" project. Oracle has the brands, but the community has the code.

Then there’s Microsoft. Redmond is a strategic friend to open source, supporting projects where they help sell more copies of Windows or at least prevent lost sales. So far it has worked on Linux, MySQL, PHP, and cuddled up to Eclipse on Silverlight.

On big data, Microsoft had been building a Hadoop-esque architecture since 2006. Called Dryad, it would “efficiently” process huge data loads running on Windows HPC Server 2008 R2 and HPC Pack 2008 R2-based clusters with Service Pack 2. In November last year, however, Microsoft quietly announced that it no longer planned to pursue Dryad as a commercial product just as it announced Hadoop connectors to SQL Server and Windows Azure.

Microsoft and Oracle have muscle in RDBMS. Oracle sells half the planet’s relational databases in a market worth $29bn with Microsoft in third place. The more open-source friendly IBM – which announced Hadoop connectors to its DB2 database around the same time as Oracle – is second.

It's not just Cutting, the father of Hadoop, who felt concern. Hortonworks, the start-up that spun out of Yahoo! last year with venture backing from Red Hat and JBoss investor Rob Bearden and competes with Cutting’s Cloudera, was also worried by what Microsoft and Oracle might do.

A thousand tiny elephants

Eric Baldeschwieler, Hortonworks’ chief technology officer, breathed what could be called a sigh of relief when Oracle last year announced its plans for a big data appliance using Hadoop. “It’s hugely validating of Hadoop, having all the major vendors coming in,” Baldeschwieler told The Reg at the time. “What we don’t want to see is thousands of flavours of Hadoop.”

Why did the giants suddenly turn friendly towards a technology that Cutting reckons will tread on the toes of their beloved RDBMS in about five years – when, as Cutting believes, it becomes an incumbent of mainstream enterprise IT?

“We are moving into a world where there’s lots of data,” Cutting says. “It [Hadoop] is not going to take over all software – there will be other technologies – but it’s going to become one of the mainstream staples in the next five to 10 years - maybe even sooner than that. It seems to be progressing pretty quickly.”
