
Hadoop goes 'open core' with Cloudera Enterprise

All-star startup fattens stuffed elephant

Hadoop Summit Cloudera – the commercial Hadoop outfit – has unveiled its first for-pay product: Cloudera Enterprise, an augmented version of the open source distributed data crunching platform designed specifically for production environments.

Cloudera Enterprise – announced today at the Hadoop Summit in Santa Clara, California – beefs up Hadoop with several proprietary management, monitoring, and administration tools, and it's sold on a subscription basis, priced according to the size of your Hadoop cluster. The all-star Silicon Valley startup has adopted an "open core" model, enhancing an open source Hadoop core with additional software that carries a price tag.

"We've been in the market with customers for coming up on two years now, supporting Hadoop in real enterprise production environments," Cloudera CEO Mike Olson, tells The Register. "We've learned a lot about how customers use [Hadoop], what it does well, and what makes it difficult to deploy and operate. As a result of all of that activity."

Additional proprietary tools include integration with LDAP directory servers for user authentication and access control; dashboards for controlling and managing the flow of data into Hadoop clusters; and a user interface for cluster management and administration. Buyers also receive maintenance updates and support.

The heart of this new enterprise product is the company's open source Hadoop distro – Cloudera’s Distribution for Hadoop (CDH) – which just graduated to version 3. Also announced today, CDH consists of Apache Hadoop and eight additional open source projects.

These include Hive (a SQL-like query language developed at Facebook), Pig (a lower-level language developed by Yahoo!), HBase (a distributed database developed by the now Microsoft-owned Powerset), Sqoop (a MySQL connector built by Cloudera), Oozie (the Hadoop workflow system), and Zookeeper (a means of juggling distributed services from a central location), as well as two new projects just opened up by Cloudera under an Apache license: Flume and Hue.

Flume is Cloudera's data loading infrastructure, while HUE – short for Hadoop User Interface – is the web-based Hadoop GUI formerly known as the Cloudera Desktop. HUE provides a graphical user interface for creating and submitting jobs on a Hadoop cluster, monitoring the cluster's health, and browsing stored data. Typically, clusters are managed via the command line.

Based on Google’s proprietary software infrastructure, Hadoop is a means of crunching epic amounts of data across a network of distributed machines. Named for the yellow stuffed elephant belonging to the son of project founder Doug Cutting, the platform underpins online services operated by everyone from Yahoo! and Facebook to Microsoft.

Hadoop mimics GFS, Google's distributed file system, and MapReduce, Mountain View's distributed number-crunching platform. In 2003 and 2004, Google published a pair of research papers on these infrastructure technologies, and Doug Cutting used the papers to build a platform that would back Nutch, his open source web crawler. Hadoop was open sourced at Apache, and it was bootstrapped by Yahoo!, which hired Cutting in 2006, before he left for Cloudera.

The platform consists of the Hadoop Distributed File System (HDFS) and Hadoop MapReduce.
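
To make the MapReduce half concrete, here is a minimal single-machine sketch in Python of the map, shuffle, and reduce steps, using word counting as the example. It is an illustration only, not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real Hadoop job would spread these steps across many machines and read its input from HDFS.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a distributed framework this grouping happens across the network;
    here it is just an in-memory dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big elephant"]))
```

Because each map call sees only one record and each reduce call sees only one key, the work can in principle be split across as many machines as there are records and keys, which is the property the distributed version relies on.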

Previously, Cloudera's only proprietary product was the free Cloudera Desktop, which has now been open sourced as HUE. The company offered its own Hadoop distro and various other open source tools in tandem with support, training, and certification services. But Olson and company say they've long been planning to add a subscription revenue stream.

"For the first time we're able to go to market with the stance that if you're using just HDFS and MapReduce, you're not getting the volume you should be out of Hadoop," says company co-founder Cloudera Jeff Hammerbacher, who worked on Hadoop at Facebook. "At Facebook, HDFS and MapReduce provided an excellent starting point for building infrastructure to management and extract value from datasets, but we had a large variety of tools surrounding those two." ®
