Hadoop goes 'open core' with Cloudera Enterprise

All-star startup fattens stuffed elephant

Hadoop Summit Cloudera – the commercial Hadoop outfit – has unveiled its first for-pay product: Cloudera Enterprise, an augmented version of the open source distributed data crunching platform designed specifically for production environments.

Cloudera Enterprise – announced today at the Hadoop Summit in Santa Clara, California – beefs up Hadoop with several proprietary management, monitoring, and administration tools, and it's sold on a subscription basis, priced according to the size of your Hadoop cluster. The all-star Silicon Valley startup has adopted an "open core" model, enhancing an open source Hadoop core with additional software that carries a price tag.

"We've been in the market with customers for coming up on two years now, supporting Hadoop in real enterprise production environments," Cloudera CEO Mike Olson tells The Register. "We've learned a lot about how customers use [Hadoop], what it does well, and what makes it difficult to deploy and operate. As a result of all of that activity."

The proprietary tools include integration with LDAP directory servers for user authentication and access control; dashboards for controlling and managing the flow of data into Hadoop clusters; and a user interface for cluster management and administration. Buyers also receive maintenance updates and support.

The heart of this new enterprise product is the company's open source Hadoop distro – Cloudera’s Distribution for Hadoop (CDH) – which just graduated to version 3. Also announced today, CDH consists of Apache Hadoop and eight additional open source projects.

These include Hive (a SQL-like query language developed at Facebook), Pig (a lower-level language developed by Yahoo!), HBase (a distributed database developed by the now Microsoft-owned Powerset), Sqoop (a MySQL connector built by Cloudera), Oozie (the Hadoop workflow system), and Zookeeper (a means of juggling distributed services from a central location), as well as two new projects just opened up by Cloudera under an Apache license: Flume and Hue.

Flume is Cloudera's data loading infrastructure, while HUE – short for Hadoop User Experience – is the web-based Hadoop GUI formerly known as the Cloudera Desktop. HUE provides a graphical user interface for creating and submitting jobs on a Hadoop cluster, monitoring the cluster's health, and browsing stored data. Typically, clusters are managed via the command line.

Based on Google’s proprietary software infrastructure, Hadoop is a means of crunching epic amounts of data across a network of distributed machines. Named for the yellow stuffed elephant belonging to the son of project founder Doug Cutting, the platform underpins online services operated by everyone from Yahoo! and Facebook to Microsoft.

Hadoop mimics GFS, Google's distributed file system, and MapReduce, Mountain View's distributed number-crunching platform. In 2003 and 2004, Google published a pair of research papers on these infrastructure technologies, and Doug Cutting used the papers to build a platform that would back Nutch, his open source web crawler. Hadoop was open sourced at Apache, and it was bootstrapped by Yahoo!, which hired Cutting in 2006, before he left for Cloudera.

The platform consists of the Hadoop Distributed File System (HDFS) and Hadoop MapReduce.
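
For readers new to the MapReduce model, here is a minimal single-process sketch of the idea in Python, using the classic word-count example. It is an illustration only and does not use Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would spread the map and reduce steps across many machines reading from a distributed file system.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, the step a framework runs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the elephant", "The yellow elephant"]))
```

Because each map call sees only one record and each reduce call sees only one key, both steps can in principle be run in parallel on separate machines, which is what makes the model suit large clusters.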

Previously, Cloudera's only proprietary product was the free Cloudera Desktop, which has now been open sourced as HUE. The company offered its own Hadoop distro and various other open source tools in tandem with support, training, and certification services. But Olson and company say they've long been planning to add a subscription revenue stream.

"For the first time we're able to go to market with the stance that if you're using just HDFS and MapReduce, you're not getting the value you should be out of Hadoop," says Cloudera co-founder Jeff Hammerbacher, who worked on Hadoop at Facebook. "At Facebook, HDFS and MapReduce provided an excellent starting point for building infrastructure to manage and extract value from datasets, but we had a large variety of tools surrounding those two." ®
