Linux lessons for Hadoop doubters

Before IBM there was Linus

Open ... and Shut

While Hadoop is all the rage in the technology media today, it has barely scratched the surface of enterprise adoption. In fact, if anything, we are still only on the first few steps of the Big Data marathon, a race that Hadoop seems set to win despite its many shortcomings.

The big question will be whether the market will keep the Hadoop faith as these shortcomings are resolved. All indications suggest that it will.

As The Wall Street Journal recently highlighted, the pace of technology adoption has accelerated in the past few years. Even so, as The Atlantic's Alexis Madrigal points out, it actually takes a long time for new technologies to catch on, even in today's fast-paced environment. And, importantly: "In many cases, more time was spent going from zero to one percent [market] penetration than from one to 50."

In Hadoop Land, we're still in the transition from zero per cent adoption to one per cent adoption.

Part of the reason is Hadoop's own shortcomings. Despite being a big proponent of Hadoop, IBM points to a few specific deficiencies in Hadoop that hold it back, including a lack of performance and scalability, inflexible resource management, and a limitation to a single distributed file system instead of multiple data source support.

IBM, of course, promises to resolve these issues with its proprietary complements to Hadoop, and it is not alone among the relational database vendors in trying to shame Hadoop for being a poor RDBMS. Still, it's not wrong that Hadoop has significant problems.

One of the biggest is that Hadoop is batch oriented in a world increasingly run in real-time. Loggly and Webtrends have both been quick to call out this void, but I'm not an unbiased observer, either. After all, my own company, Nodeable, was established to add real-time capabilities to Hadoop.

So lots of vendors want to fix Hadoop's problems. Meanwhile, customers are buying big into Hadoop.

Mike Olson, chief executive of the biggest standalone Hadoop vendor, Cloudera, in an email to me called the attempt to sully Hadoop's reputation "desperation FUD". He cited Cloudera's traction with customers and partners. He's right, but given how early we are in the Hadoop adoption curve, it's still possible that other alternatives, like Percolator, will claim the Hadoop crown.

Possible, but not very likely.

This isn't, after all, consumer technology, which changes with the wind. Instagram went from zero to 50 million users in a little over a year, but enterprise technology adoption simply doesn't work that way.

Back in 2000 IBM announced that it was going to invest $1bn in advancing the Linux operating system. This was big news for those of us who supported Linux distributions back then, but it came roughly 10 years after Linus Torvalds released the first Linux source code, and it took another 10 years before Linux really came to dominate the industry.

Today we take it for granted that startups, clouds and other new ventures will default to Linux as their operating system, but for years after IBM's investment IT departments still chafed at putting Linux in their data centres.

Once the momentum got rolling behind Linux, though, there really was no going back. Microsoft tried to FUD Linux into submission, but there was simply too much industry adoption of open-source Linux to halt it.

The same seems true of Hadoop today. Yes, it has problems, just as Linux did back in 1991, or even 2001. But Hadoop also has a community around it that took years for Linux to gather. IBM, Oracle, Microsoft, Cloudera, Hortonworks, Yahoo!, Intel, NetApp, Facebook, Cisco, and more are all behind Hadoop in a big way.

And so are customers. Once Hadoop goes into their data centres, IT departments are simply not going to rip and replace Hadoop with the next shiny Big Data object. Not until the industry as a whole shoves them there, because the enterprise hunts in packs, and the "pack" is currently firmly behind Hadoop.

All of which is why I think we're going to see Hadoop produce the next Oracle-sized database company. We're also likely to see such a company emerge from the NoSQL ranks, but Hadoop is a near certain bet right now. Cloudera currently has the lead, but again, we're just starting the marathon, one that will produce cost savings for customers, fat bank balances for vendors, and several big exits for venture capitalists.

Hadoop, in short, is a gift that will keep giving for many years to come. It's not guaranteed, but it's about as close to a guarantee as the tech industry has. ®

Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.
