Hadoop distie MapR trousers another $30m to take on big data rivals

Working towards that eventual IPO, if it isn't eaten first

MapR Technologies, one of the commercializers of the Hadoop big data muncher, has pocketed another $30m to help it ramp up its business and keep it on track for what the company hopes will be an initial public offering.

While Cloudera was out of the gate early commercializing the Hadoop big data muncher, MapR was close behind (by a matter of weeks) and no Hadoop distie has yet emerged as the inevitable Red Hat for fat and fast data.

There are plenty of other contenders, all of them doing interesting things to and with Hadoop, including (in no particular order) MapR, the Hortonworks direct spinout from Yahoo!, the spinning-out Pivotal unit of EMC, IBM (which has sold its own BigInsights variant of Hadoop for a few years) and now Intel, which has just announced its own Hadoop distro.

What is amazing is that Yahoo! spun Hortonworks out in the first place instead of leveraging it as a strategic asset, and that software-hungry Hewlett-Packard and Dell have not snapped up Cloudera or MapR to build out their software portfolios.

With every day that passes, Cloudera and MapR get more and more expensive, to the point where HP and Dell must both be tempted either to give up on owning a distribution or to grab the various Apache components and start up one of their own.

With the big data market (which means subscription support for open source components plus licensing for proprietary software extensions and the hardware to run it) expected to reach $5bn in revenues by 2016 or so, there would seem to be plenty of room for multiple contenders. Markets have tended in the past to create a few dominant players, and MapR wants to be one of them in the big data world.

But with the advent of cloud platform services like Amazon Web Services' Elastic MapReduce, Google's BigQuery, or the eponymous service from Splunk, many companies may simply never install their own big data software. And still others with the technical resources may decide that Hadoop is strategic enough of an infrastructure/application layer that they build their own competence.

And so it is not a foregone conclusion at this point in the big data game that Hadoop will precisely track the history of the Linux operating system or that a dominant player like Red Hat will emerge. The market could remain highly fragmented.

None of the Hadoop disties want to think about that possibility, and they certainly want to be able to leverage what must be some pretty high multiples to either go public or sell out to the tier one IT system suppliers who are desperate to build up their software and services businesses.

"We've got a management team that is not looking for a quick exit," Jack Norris, vice president of TKTK, tells El Reg. "This is a paradigm shift, this is a new architecture. We are focused on an IPO, and John has the Splunk IPO on his desk and he looks at it often. We think we have an even bigger opportunity." Norris was referring to John Schroeder, co-founder and CEO of MapR.

MapR's equity backers think it has a bigger opportunity than Splunk, too. In the first two rounds of funding from Lightspeed Venture Partners, Redpoint Ventures, and NEA, MapR was able to raise $29m and get several generations of Hadoop distributions into the field. The company, being privately held, does not provide revenue figures or customer counts, but has grown to 150 employees. The company's second round helped MapR open offices in London and Munich as part of its expansion in Europe.

This time around with the $30m in Series C funding, Mayfield Fund is leading the investment (with all three other equity players kicking in more dough), and Norris says the plan is to use it to expand into Asia while at the same time boosting its research and development to extend the MapR Hadoop stack.

The current M7 Hadoop distro marries MapR's innovative file system, which makes the Hadoop Distributed File System (HDFS) look like NFS to applications, with the HBase data warehousing layer for HDFS to significantly speed up SQL-like queries on Hadoop clusters.

That HBase speedup debuted back in October 2012. It basically pushes HDFS down into MapR's distributed NFS file system, shards both data chunks and portions of HBase tables, and spreads them around the cluster for performance, but presents them as unified data and tables to applications.

MapR is very keen on its Apache Drill add-on for Hadoop, which is trying to bring to the Hadoop stack the realtime, interactive querying that we have had for relational databases for decades. Just as HBase sort of clones Google's BigTable overlay for its Google File System, Drill mimics Google's Dremel query tool, which uses an SQL-alike language called DrQL. Both Drill and the Google BigQuery service support DrQL.

All of the Hadoop disties are, of course, chasing the same dream. Cloudera has its Project Impala layer for HDFS to replace the Hive SQL-alike query language for HBase, and EMC's Pivotal group spinoff announced last week has taken the SQL guts out of the Greenplum parallel database and woven it into HDFS to create Project Hawq, which speaks actual SQL to sort through data stored in HDFS.

MapR is still the only Hadoop distie that can make HDFS speak NFS, but all of the big players are working on something that tries to make HDFS speak SQL, the default query language for relational databases, to one degree or another.

The investment by Mayfield Fund is not a particularly good indicator of whether MapR will end up being sold or will actually make a debut on Wall Street. The venture capital firm, established in 1969, has invested in over 500 companies. Of these, more than 100 have been sold off in mergers or acquisitions and more than 100 have gone public. ®
