Cloudera promises 'Google-like' Big Data dream in minutes

Hadoop shop automates so you don't have to

Updated Cloudera has delivered a "substantial" update to its open source Hadoop distribution.

On Wednesday, Cloudera rolled out Cloudera Enterprise 3.5, two months after shipping a major upgrade to its Hadoop distribution called Cloudera Distribution of Apache Hadoop (CDH) 3.0.

Whereas CDH 3.0 expanded Cloudera's Hadoop stack from three components to 10, the idea behind Cloudera Enterprise 3.5 is to make that distro easier to manage and deploy for IT shops outside the ranks of Hadoop's super-users like Facebook, Yahoo!, and LinkedIn. Cloudera provides services and support for such mainstream users.

The changes will let you install and configure a full Google-like infrastructure "in a couple of minutes", product vice president Charles Zedlewski told The Register. "We have done a substantial update." Hadoop is based on Google's GFS and MapReduce platforms.

Zedlewski said Cloudera Enterprise 3.5 automates configuration changes, service restarts, and the addition and removal of hardware. There's also an Activity Monitor that consolidates user activity across components to provide both a real-time and historical view of user activities and jobs.

"[We have] expanded the capability of the management suite from monitoring and discovery of issues, to diagnostic problems, to automating changes, and setting long-term changes," Zedlewski said.

Cloudera has also enhanced the Hadoop Resource and Authorization Manager, facilitating rollbacks and improving security with LDAP systems.

Hadoop is an architecture for crunching huge amounts of data using a network of distributed servers. Nutch web crawler creator Doug Cutting based the platform on research papers describing Google's GFS and MapReduce, and it is now an Apache Software Foundation (ASF) project.
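
For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch in Python of a word count, the customary introductory example. It is an illustration only: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and are not part of Hadoop's or Cloudera's API, and a real cluster would spread the map and reduce steps across many servers rather than run them in one process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse the values collected for one key into a single total."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence within a single process."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

Because each document is mapped independently and each key is reduced independently, both steps can in principle be parallelised across machines, which is the property a distributed framework exploits.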

Today, Cloudera also released a free "Express" edition of the Service and Configuration Manager module used in Cloudera Enterprise 3.5, which will automate the installation and configuration of Hadoop on a cluster of up to 50 nodes. The company has also donated code for its packaging and testing suite to the ASF, under a project called Bigtop. The idea is to help improve packaging and interoperability testing for Hadoop and related modules.

Among Bigtop's initial committers is Canonical, chief commercial steward of Ubuntu. Cloudera has supported packaging for Ubuntu Linux for the last two-and-a-half years.

Zedlewski said Cloudera will add a further three or four modules to the current stack, among them a compression algorithm that leverages Google's Snappy to speed up data import and export.

This will be added in an update to CDH in the next month or so, Zedlewski said. Other components are due in "the CDH 4 timeframe", he said, while Cloudera is also looking at enhancements around high-availability features in the core Hadoop module. Zedlewski would not provide a date for CDH 4, but he said "work is well underway". ®

This article has been updated to clarify Cloudera has delivered Cloudera Enterprise 3.5.
