Software

Hadoop spinner Cloudera lights Spark on MapReduce retirement

Big Data pioneer succumbs to mounting pressure from the crowd

Wed 9 Sep 2015 // 14:19 UTC

Cloudera, one of the Big Data pioneers founded on Hadoop – the open-source implementation of Google’s MapReduce – is replacing... MapReduce.

On Wednesday, the firm announced the One Platform Initiative, which will see it replace MapReduce with the Apache Software Foundation’s Spark, a cluster-computing framework that has attracted big-name support.

Cloudera said its initiative would let Spark become the successor to Hadoop’s MapReduce framework for general Hadoop data processing. The firm claimed “wide adoption” of Spark among its customers in the last 18 months, with Spark also becoming the most popular open-source project in the Hadoop ecosystem.

Cloudera founder and chief strategy officer Mike Olson said in a statement:

Spark is well on its way to succeeding MapReduce in enabling jobs with hundreds of executors each, running simultaneously on large multi-tenant clusters with tens of thousands of nodes – but there is still some heavy lifting to do.

It's an ambitious goal, but with the community of committers and supporters, and our leadership, we think that's highly achievable.

Cloudera is no recent convert to Spark. The firm – along with IBM, Intel, Databricks, and MapR – last year announced a collaboration to port the Apache Hive data warehouse to Apache Spark. The One Platform Initiative will tackle security, scale, management and streaming.

Hadoop was actually developed by Cloudera's Doug Cutting, along with Mike Cafarella, as a project at their employer Yahoo! in 2005, with version 1.0 released in 2011. They drew on a paper on MapReduce published by that framework's creator, Google.

Spark was developed by the AMPLab at the University of California, Berkeley, and open-sourced under a BSD license in 2010, before being donated to the ASF in 2013. It includes Spark SQL, Spark Streaming, a machine-learning library called MLlib and the GraphX distributed graph processing framework.

It has seen growing support from many in the Hadoop and NoSQL ecosystem: NoSQL provider MapR announced Spark-based offerings for security, analytics and genome sequencing software, and Cloudera’s fellow Hadoop spinner Hortonworks released Spark as part of its Hortonworks Data Platform in April.

But it was IBM that in June gave Spark what will be seen as its breakthrough: the giant announced “a major commitment” to Apache Spark, promising to embed the framework in its analytics and commerce software. Big Blue also pledged to donate its SystemML machine-learning technology to the Spark ecosystem and to offer Spark as a cloud service on IBM Bluemix.

MapReduce, and thus Hadoop, have long been under pressure for their complexity and lack of flexibility and performance, with many wondering what would come next.

Spark is seen as faster – able to process jobs between 10 and 100 times faster than MapReduce – and better suited to iterative and interactive processing. It can also run not just on Hadoop but alongside other Hadoopy tools such as Hive and Pig. ®
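For readers who have never met the model being retired, here is a minimal sketch of MapReduce's three fixed phases, written in plain Python rather than against Hadoop's or Spark's real APIs. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and exist only for illustration. In Hadoop, each such job writes its results to disk before the next can start, which is why iterative workloads that chain many jobs suffer; Spark's pitch is to keep intermediate data in memory.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["spark replaces mapreduce", "mapreduce begat hadoop"]))
```

Everything here runs in a single process; a real cluster spreads the map and reduce work across many machines, and the shuffle becomes network traffic.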
