'Grid computing Red Hat' out-Amazons Amazon

Cloudera in the, yes, cloud

Hadoop Summit In its mission to bring to world+dog the joys of Hadoop - that open-source grid-computing platform based on Google arrogance - Cloudera has out-Amazoned Amazon.

Today, the star-studded Hadoop startup told the world that its commercial stuffed-elephant distro can now be run on Amazon's Elastic Compute Cloud (EC2) in tandem with so-called Elastic Block Store (EBS) storage volumes. EBS volumes are mounted directly onto EC2 server instances.

This means you can run ongoing Hadoop jobs - starting them and stopping them whenever you like - without moving data back and forth between the local EC2 disks and Amazon's Simple Storage Service (S3). "Instead of using local disks, you can use EBS volumes," Cloudera man Christophe Bisciglia said today at the annual Hadoop Summit in Santa Clara, California.

"What's key about this is that your data is persistent. Currently, if you bring up a Hadoop cluster on Amazon and then bring it down, your [Hadoop File System] instance goes away. S3 can mitigate this, but then you have to round-trip between S3 and Hadoop every time you run a job.

"This is a way to turn your clusters on and off and keep them persistent and bring the full power of Hadoop."

Cloudera also says that its EBS integration improves Hadoop performance on the Amazon cloud by allowing more disks per server. EC2 provides a limited number of local disks for each instance.

Named for a yellow stuffed elephant, Hadoop mimics Google's MapReduce framework, mapping epic data-crunching tasks across a sea of machines - i.e. splitting them into tiny sub-tasks - before reducing the results into one master calculation. You can run it in your own data centers - as Yahoo!, Facebook, and many others do - or you could run it on Amazon's cloud. Or, for that matter, another infrastructure cloud.
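For readers who want to see the map-then-reduce idea in miniature, here is a rough single-machine sketch in plain Python. It is not Hadoop's API or Cloudera's code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-counting task are illustrative assumptions, and a real cluster would farm each chunk out to a different machine rather than loop over them in one process.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of input into (word, 1) pairs.

    In a real cluster, each chunk would be handled by a separate machine.
    """
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the intermediate pairs by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run the map, shuffle, and reduce steps over all chunks in sequence."""
    pairs = []
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical input: two small "chunks" standing in for blocks of a big file.
    chunks = ["elephant cloud elephant", "cloud disk"]
    print(word_count(chunks))
```

The point of the split is that the map step for one chunk needs nothing from any other chunk, which is what lets the work spread across many machines before the results are pulled back together.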

Amazon's cloud offers its own Hadoop implementation as a service. It's called Amazon Elastic MapReduce. But it doesn't dovetail with EBS.

Bisciglia called Cloudera's EBS integration "a beta."

Cloudera also announced that its commercial distro - think of Cloudera as Hadoop's Red Hat - now includes the latest versions of Hive and Pig, two languages for coding atop Hadoop: Hive 0.3 and Pig 0.2. The distro is available at cloudera.com/hadoop.

And the company has released beta packages of Hadoop version 0.20. "Twenty is going to be a really important release - it's going to include both sets of APIs, both the new and the old ones," Bisciglia said. ®
