Becoming Red Hat: Cloudera and Hortonworks' Big-Data death match

'In a world of tissue when you're Kleenex, you've won'

Open ... and Shut In the Big Data market, Hadoop is clearly the team to beat. What is less clear is which of the Hadoop vendors will claim the spoils of that victory.

Because open source tends to be winner-take-all, we are almost certainly going to see a "Red Hat" of Hadoop, with the second place vendor left to clean up the crumbs.

As ever with open source, this means the Hadoop market ultimately comes down to a race for community support because, as Redmonk analyst Stephen O'Grady argues, the biggest community wins.

In community and other areas, Linux is a great analogue for Hadoop. I've suggested recently that Hadoop market observers could learn a lot from the indomitable rise of Linux, including from how it overcame technical shortcomings over time through communal development. But perhaps a more fundamental observation is that, as with Linux, there's no room for two major Hadoop vendors.

Yes, there will be truckloads of cash earned by EMC, IBM and others who use Hadoop as a complement to drive the sale of proprietary hardware and software, just as we have in the Linux market with IBM, Oracle, Hewlett-Packard and others.

But for those companies aspiring to be the Red Hat of Hadoop - that primary committer of code and provider of associated support services - there's only room for one such company, and it's Cloudera or Hortonworks. I don't feel MapR has the ability to move Hadoop development, given that it doesn't employ key Hadoop developers as Cloudera and Hortonworks do, so it has no chance of being a dominant Hadoop vendor.

Cash kings

Cloudera and Hortonworks recognise this, which is why both have raised mountains of cash. The size of the Big Data pie is huge, but it's not going to be split evenly. Only one company gets to be the center of the Hadoop ecosystem. Not two.

In enterprise Linux, that "one company" is Red Hat. SUSE (then Novell then just SUSE again) initially took Red Hat on and had a real chance to be the leader, but Red Hat persevered and became the billion-dollar open-source company while SUSE-Novell-SUSE did not.

Why did Red Hat win? Community.

No, not the kind of community we sometimes associate with open source, ie, individual hackers staying up late for the love of coding, though that demographic matters. Red Hat contributes more to the Linux kernel than any single individual or company.

This, in turn, led Red Hat to attract the second type of community: the "professional developer," or third-party application developer. Red Hat managed to amass an unassailable third-party application ecosystem lead. Ultimately, in the Hadoop battle the community to be won is this community of developers building around the Hadoop ecosystem, because it's this ecosystem that leads to customer adoption, which fuels revenues which fuel the hiring of more code committers.

Call it the virtuous cycle of commercial open-source community development.

From 2002 until 2005, I worked at Novell and after the SUSE acquisition saw first-hand how Red Hat used its third-party application ecosystem to crush SUSE. SUSE was always second choice with customers because the applications they wanted ran on Red Hat first, which in turn made SUSE second-best with partners, too. By the time Novell/SUSE finally caught up in terms of sheer number of applications (and now exceeds Red Hat), Red Hat had already cemented its brand and Novell's Linux business languished.

As Linux Foundation executive director Jim Zemlin is fond of saying: "In a world of tissue when you're Kleenex, you've won." When Red Hat became "Kleenex," the game was over.

In the Hadoop world, the race to be "Kleenex" is on, and it involves attracting the biggest ISV community. Between the two dominant Hadoop distributions, it's still a somewhat even race, even if Cloudera took the early lead with customer traction. Hortonworks has been playing up its open source purity, arguing that it's "true" open source while Cloudera offers a freemium/open core model. It's very similar to the argument that Red Hat used to use against Novell/SUSE.

But in this case, I don't think it applies.

Both Cloudera and Hortonworks contribute to and distribute 100 per cent open-source Hadoop platforms. The difference comes from the management and other tools each offers alongside Hadoop. Hortonworks believes even this area should be open source, which is why its rival to Cloudera Manager is open-source Ambari.

Winning advantage

The problem, however, is that Ambari isn't as mature as Cloudera Manager. In these early days of Hadoop adoption, customers and partners will skew toward the solution that works best, and that's currently Cloudera. For years, Red Hat's Network product and associated technology weren't open source, and no one cared. What they wanted was to be as productive as possible, as fast as possible.

Advantage: Cloudera.

Still, both companies are in a land grab for quality partners. Unlike in Linux land, there's not One Partner to Rule Them All, as Oracle was for Red Hat. Hortonworks has grabbed Microsoft and Informatica as partners (among others), while Cloudera has IBM and Oracle (among others). In terms of volume of partners, Cloudera has the lead with more than 300 partners (compared to Hortonworks' 62). Of course, Cloudera only lists 51 partners on its website, which suggests that maybe Hortonworks has more partners, too, but hasn't listed them.

Advantage: Cloudera (probably).

But let's get back to fundamentals. Who employs the most core committers to Hadoop, Cloudera or Hortonworks? After all, this tends to be the metric that helps fuel traction with third-party application developers. Unfortunately, there's not a clean answer. By one measure, Cloudera has a slight lead over Hortonworks.

But by Cloudera's own admission, there are multiple ways to measure the two companies' code contributions to Hadoop. Both companies employ several of Hadoop's heavy hitters.

Advantage: Unclear.

In short, it's too soon to call a winner. Cloudera has a two-year head start and significantly more revenue and general interest, at least as measured by Google searches. But the ultimate prize is reserved for the company that can amass the most meaningful application partners given that Hadoop, like Linux before it, is a platform play.

The platform with the biggest community wins. Every time. Who that winner will be in the case of Hadoop is still not clear. ®

Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.

Mike Olson is on the board of directors of Nodeable and is CEO of Cloudera.
