
Elders tell cluster tool Apache Spark it's time to quit chillin' in the crib

Hadoop Swiss Army knife software graduates from Incubator to full-blown project


The Apache Foundation has promoted a fast data-processing tool out of the Apache Incubator in a further sign of the maturity of the Hadoop family.

Apache Spark is a fast processing layer for working on data stored within the open-source Hadoop file system or other shared file systems such as NFS. It supports Scala, Java, and Python. In some tests it has run up to 100 times faster than Hadoop MapReduce on in-memory data sets, and 10 times faster on data held on hard disk.

On Sunday, Spark was unanimously voted to graduate from the Incubator. Those voting included Hadoop luminaries such as the technology's creator, Doug Cutting.

Now that Spark has been promoted, a project management committee will be established for the software, and Databricks co-founder and former AMPLab PhD student Matei Zaharia will be appointed to the role of "Vice President, Apache Spark".

Like Hadoop, Spark has become the foundation for other data-processing engines, such as Shark for SQL-on-Hadoop queries, MLlib for machine learning, Spark Streaming for dealing with streaming data, and GraphX for graph processing.

The technology's users include Baidu, Databricks, IBM's Almaden research group, Trend Micro, Yahoo! and Alibaba.

The graduation of Apache Spark caps off a vertiginous rise for the data-processing system, which was created at the University of California at Berkeley's AMPLab in 2009 and was published as open source in 2010.

Since then, the system has gained a vigorous developer community, and more than 120 developers from 25 companies contribute source code. There seems to be enough activity around the software for businesses to smell money – as last week Hadoop hothouse Cloudera announced commercial support for the tool. ®


