Microsoft tries to Spark relationship with cluster lusters: Promises 5-min big data bang on Azure

Aims to have Apache Spark running in time it takes to make cuppa

Thu 5 Oct 2017 // 15:19 UTC

First apps on Windows, then Linuxes in Hyper-V and on Azure, now big data via Spark. In another effort to win over the open source crowd, Microsoft has made the speedy big data engine Apache Spark easier to set up and use on Azure, giving devs a dedicated tool to help provision clusters.

The open-source Azure Distributed Data Engineering Toolkit, which integrates with Docker containers, lets devs provision on-demand Spark clusters and submit jobs to them from the command line.

Apache Spark boasts that it can "run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk". The tricky part can be configuration – companies such as MemSQL have approached this by offering a way to use it without writing code.

"Spinning up a Spark cluster, on-demand, can often be complicated and slow," Microsoft program manager JS Tan wrote in an Azure blog post. "Spark developers often share [static] pre-existing clusters managed by their company's IT team," which means "you're either out of capacity, or you're burning dollars on idle nodes."

The new toolkit, based on Azure Batch, can provision a cluster within three to five minutes, according to Microsoft. As usual, you still pay for the cores you use.
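To put Tan's "idle nodes" complaint in numbers, here is a rough back-of-the-envelope sketch. Every figure in it (node count, hourly price, job length, billing for provisioning time) is a made-up assumption for illustration, not Azure pricing and not anything Microsoft has published; only the five-minute spin-up comes from the claim above.

```python
# Illustrative arithmetic only: all prices and workloads below are hypothetical.

def static_cluster_cost(nodes, hours_in_period, price_per_node_hour):
    """Cost of a fixed cluster that stays up for the whole period, busy or idle."""
    return nodes * hours_in_period * price_per_node_hour


def on_demand_cost(nodes, jobs, hours_per_job, provision_minutes, price_per_node_hour):
    """Cost of spinning up a cluster per job and tearing it down afterwards.

    Assumes (hypothetically) that provisioning time is billed like run time.
    """
    hours_per_run = hours_per_job + provision_minutes / 60
    return nodes * jobs * hours_per_run * price_per_node_hour


if __name__ == "__main__":
    # Hypothetical workload: 10 nodes at $0.50 per node-hour, 20 two-hour jobs a month.
    always_on = static_cluster_cost(nodes=10, hours_in_period=720, price_per_node_hour=0.50)
    per_job = on_demand_cost(nodes=10, jobs=20, hours_per_job=2,
                             provision_minutes=5, price_per_node_hour=0.50)
    print(f"static cluster, 30 days: ${always_on:,.2f}")
    print(f"on-demand, 20 jobs:      ${per_job:,.2f}")
```

The gap narrows as utilisation rises: a shared cluster that is busy around the clock gets little benefit from per-job provisioning.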

For now, this toolkit is Spark-specific. But Tan added: "We plan to support other distributed data engineering frameworks in a similar vein." ®
