
Linux lessons for Hadoop doubters

Before IBM there was Linus


Open ... and Shut

While Hadoop is all the rage in the technology media today, it has barely scratched the surface of enterprise adoption. In fact, if anything, we are still only on the first few steps of the Big Data marathon, a race that Hadoop seems set to win despite its many shortcomings.

The big question will be whether the market will keep the Hadoop faith as these shortcomings are resolved. All indications suggest that it will.

As The Wall Street Journal recently highlighted, the pace of adoption of given technologies has accelerated in the past few years. Even so, as The Atlantic's Alexis Madrigal points out, it actually takes a long time for new technologies to catch on, even in today's fast-paced environment. And, importantly: "In many cases, more time was spent going from zero to one percent [market] penetration than from one to 50."

In Hadoop Land, we're still in the transition from zero per cent adoption to one per cent adoption.

Part of the reason is Hadoop's own shortcomings. Despite being a big proponent of Hadoop, IBM points to a few specific deficiencies in Hadoop that hold it back, including a lack of performance and scalability, inflexible resource management, and a limitation to a single distributed file system instead of multiple data source support.

IBM, of course, promises to resolve these issues with its proprietary complements to Hadoop, and it is not alone among the relational database vendors in trying to shame Hadoop for being a poor RDBMS. Still, IBM is not wrong that Hadoop has significant problems.

One of the biggest is that Hadoop is batch oriented in a world increasingly run in real-time. Loggly and Webtrends have both been quick to call out this void, but I'm not an unbiased observer, either. After all, my own company, Nodeable, was established to add real-time capabilities to Hadoop.

So lots of vendors want to fix Hadoop's problems. Meanwhile, customers are buying big into Hadoop.

Mike Olson, chief executive of the biggest standalone Hadoop vendor, Cloudera, in an email to me called the attempt to sully Hadoop's reputation "desperation FUD". He cited Cloudera's traction with customers and partners. He's right, but given how early we are in the Hadoop adoption curve, it's still possible that other alternatives, like Percolator, will claim the Hadoop crown.

Possible, but not very likely.

This isn't, after all, consumer technology, which changes with the wind. Instagram went from zero to 50 million users in a little over a year, but enterprise technology adoption simply doesn't work that way.

Back in 2000 IBM announced that it was going to invest $1bn in advancing the Linux operating system. This was big news for those of us who supported Linux distributions back then, but it came roughly 10 years after Linus Torvalds released the first Linux source code, and it took another 10 years before Linux really came to dominate the industry.

Today we take it for granted that startups, clouds and other new ventures will default to Linux as their operating system, but for years after IBM's investment IT departments still chafed at putting Linux in their data centres.

Once the momentum got rolling behind Linux, though, there really was no going back. Microsoft tried to FUD Linux into submission, but there was simply too much industry adoption of open-source Linux to halt it.

The same seems true of Hadoop today. Yes, it has problems, just as Linux did back in 1991, or even 2001. But Hadoop also has a community around it that took years for Linux to gather. IBM, Oracle, Microsoft, Cloudera, Hortonworks, Yahoo!, Intel, NetApp, Facebook, Cisco, and more are all behind Hadoop in a big way.

And so are customers. Once Hadoop goes into their data centres, IT departments are simply not going to rip and replace Hadoop with the next shiny Big Data object. Not until the industry as a whole shoves them there, because the enterprise hunts in packs, and the "pack" is currently firmly behind Hadoop.

All of which is why I think we're going to see Hadoop produce the next Oracle-sized database company. We're also likely to see such a company emerge from the NoSQL ranks, but Hadoop is a near-certain bet right now. Cloudera currently has the lead, but again, we're just starting the marathon, one that will produce cost savings for customers, fat bank balances for vendors, and several big exits for venture capitalists.

Hadoop, in short, is a gift that will keep giving for many years to come. It's not guaranteed, but it's about as close to a guarantee as the tech industry has. ®

Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.
