Doug Cutting: Hadoop dodged a Microsoft-Oracle stomping

Elephant daddy on breaking into mainstream IT

Name change, game changer

The last big change went to the very heart of Hadoop’s identity: the MapReduce engine was rewritten so that there is a global resource manager and a per-application master that manages each application. The rewrite, called MapReduce 2.0 (MRv2) or YARN, has been folded into Apache Hadoop 2.0 and Cloudera’s CDH 4.

MRv2 is designed to bring the power of Hadoop to applications outside of large datasets, such as graph-processing algorithms. "Graph" is the name given to software maps constructed by social networks such as Facebook and LinkedIn to connect people, to find out who knows whom and their relationships to each other.

“Graph will be moved to shared node,” Cutting said. “So you can have a node for a time doing graph process and then a minute later running a task from a MapReduce job – there’d be time-sharing on the cluster at a finer level. It’s going to give people better resource utilisation.”

Cutting wrote Hadoop while working on Nutch, a web search project. Nutch had used a lot of manual steps and lacked a framework that would automate large-scale data crunching. MapReduce provided the automation framework and two years later Cutting was hired by Yahoo!.

Despite improvements, Hadoop remains relatively difficult for newcomers. More work is needed on vertical-specific apps, nice UIs and tools to integrate Hadoop with existing platforms, Cutting says.

Cutting, who today spends just a third of his time working on Hadoop, is focused on the need for broader language support. He spends the rest of his time working on Apache's Avro project, designed to make Hadoop “less of a Java shop”.

“[Java] gets you 90 per cent of the performance,” Cutting told us. “Over time, the last 10 per cent of performance might become more critical and as things stabilise, we might see more things move into C/C++. In Hadoop, several critical elements have been moved into C/C++. It would be nice if it wasn’t just a Java world and we worked better with other languages, but it [Java] has been a good platform for the technology.”

Cutting is bullish on Hadoop's potential to become established in the mainstream in the next five to 10 years. MRv2 should help broaden Hadoop’s adoption among different sizes of user with different types of challenges, he says. "Making Hadoop easier" should accompany such changes, he added.

The old order liveth

“Right now we are getting the low-hanging fruit of the companies that are sophisticated users of technology and that have the most glaring big data problems. Over time we can take new classes of problems and support people more easily and do it more efficiently,” Cutting says.

Cutting said he is relieved Microsoft and Oracle are on board. In fact, he says, they will actually help Hadoop grow by adding more contributors. “We want to build a large community to develop the common software... the larger the community the more refined the tools will get. Users turn into contributors over time; the more users, the more contributors over time.”

As long as Cloudera and Hortonworks don’t become Cold War proxies to the giants, maybe this IS manifest destiny. Maybe more makers of proprietary software in the field of data and analytics should be worried.

By allying with Hadoop, though, Microsoft and Oracle have done more than reassure Cutting. They have ensured the survival of their RDBMS. ®
