Linux lessons for Hadoop doubters

Before IBM there was Linus

Open ... and Shut

While Hadoop is all the rage in the technology media today, it has barely scratched the surface of enterprise adoption. In fact, if anything, we are still only on the first few steps of the Big Data marathon, a race that Hadoop seems set to win despite its many shortcomings.

The big question will be whether the market will keep the Hadoop faith as these shortcomings are resolved. All indications suggest that it will.

As The Wall Street Journal recently highlighted, the pace at which new technologies are adopted has accelerated in the past few years. Even so, as The Atlantic's Alexis Madrigal points out, it actually takes a long time for new technologies to catch on, even in today's fast-paced environment. And, importantly: "In many cases, more time was spent going from zero to one percent [market] penetration than from one to 50."

In Hadoop Land, we're still in the transition from zero per cent adoption to one per cent adoption.

Part of the reason is Hadoop's own shortcomings. Despite being a big proponent of Hadoop, IBM points to a few specific deficiencies in Hadoop that hold it back, including a lack of performance and scalability, inflexible resource management, and a limitation to a single distributed file system instead of multiple data source support.

IBM, of course, promises to resolve these issues with its proprietary complements to Hadoop, and it is not alone among the relational database vendors in trying to shame Hadoop for being a poor RDBMS. Still, it's not wrong that Hadoop has significant problems.

One of the biggest is that Hadoop is batch oriented in a world increasingly run in real-time. Loggly and Webtrends have both been quick to call out this void, but I'm not an unbiased observer, either. After all, my own company, Nodeable, was established to add real-time capabilities to Hadoop.

So lots of vendors want to fix Hadoop's problems. Meanwhile, customers are buying big into Hadoop.

Mike Olson, chief executive of the biggest standalone Hadoop vendor, Cloudera, in an email to me called the attempt to sully Hadoop's reputation "desperation FUD". He cited Cloudera's traction with customers and partners. He's right, but given how early we are in the Hadoop adoption curve, it's still possible that other alternatives, like Percolator, will claim the Hadoop crown.

Possible, but not very likely.

This isn't, after all, consumer technology, which changes with the wind. Instagram went from zero to 50 million users in a little over a year, but enterprise technology adoption simply doesn't work that way.

Back in 2000, IBM announced that it was going to invest $1bn in advancing the Linux operating system. This was big news for those of us who supported Linux distributions back then, but it came roughly 10 years after Linus Torvalds released the first Linux source code, and it took another 10 years before Linux really came to dominate the industry.

Today we take it for granted that startups, clouds and other new ventures will default to Linux as their operating system, but for years after IBM's investment IT departments still chafed at putting Linux in their data centres.

Once the momentum got rolling behind Linux, though, there really was no going back. Microsoft tried to FUD Linux into submission, but there was simply too much industry adoption of open-source Linux to halt it.

The same seems true of Hadoop today. Yes, it has problems, just as Linux did back in 1991, or even 2001. But Hadoop also has a community around it that took years for Linux to gather. IBM, Oracle, Microsoft, Cloudera, Hortonworks, Yahoo!, Intel, NetApp, Facebook, Cisco, and more are all behind Hadoop in a big way.

And so are customers. Once Hadoop goes into their data centres, IT departments are simply not going to rip and replace Hadoop with the next shiny Big Data object. Not until the industry as a whole shoves them there, because the enterprise hunts in packs, and the "pack" is currently firmly behind Hadoop.

All of which is why I think we're going to see Hadoop produce the next Oracle-sized database company. We're also likely to see such a company emerge from the NoSQL ranks, but Hadoop is a near certain bet right now. Cloudera currently has the lead, but again, we're just starting the marathon, one that will produce cost savings for customers, fat bank balances for vendors, and several big exits for venture capitalists.

Hadoop, in short, is a gift that will keep giving for many years to come. It's not guaranteed, but it's about as close to a guarantee as the tech industry has. ®

Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.
