Pivotal: So who fancies skinny-dipping in our 'Business Data Lake'? PS: It'll cost you

Tries to tempt more into its Hadoop pond with Pivotal HD 2

EMC and VMware's spin-off Pivotal is trying to make money from a bevy of open-source big-data-crunching technologies by making their most complicated aspects disappear.

With the launch of Hadoop distribution Pivotal HD 2.0 and analytics engine Pivotal GemFire XD on Monday, the company has created a clutch of technologies that let organizations make sense of large amounts of data without having to deal with the hard-to-master traditional Hadoop interfaces.

The two bits of software together give companies an in-memory SQL data store that sits on top of data stored in the Hadoop Distributed File System (HDFS), and an engine called HAWQ that can query that data via SQL.

In addition, HAWQ has gained integration with GraphLab OpenMPI and MADlib – software that gives it an integrated set of algorithms for analyzing relational data and for graph analytics. It has also added compatibility for queries in R, Python, and Java.

Pivotal said in a press release that the launch of these products would "constitute the foundation for the Business Data Lake architecture" – this language caused our brain to melt and leak out of our noses in the form of a sad beige paste.

Marketing aside, the technology is Pivotal's response to the proliferation of structured and unstructured data types within an organization. Pivotal hopes that companies will spend big for the pleasure of being able to ingest, store, analyze and query data within a single stack of integrated software modules. (Pivotal doesn't disclose prices and encourages people to get in touch, so it's using a "how much can you afford?" enterprise pricing strategy.)

Pivotal's gooey blend of open source and proprietary tech

It has done this by bolting numerous additions on top of the stock Hadoop distribution, and it reckons that its modular architecture will help it update the software over time without transforming the application's codebase into a gigantic ball of spaghetti.

Future areas of development for the software include multi-tenancy, enhanced Hadoop, and adding in other open-source projects such as Apache Spark.

"Multi-tenancy is a major theme we want to focus on next [by] allowing different groups and lines of users, and different types of workload requirements, to work with our data fabric," explained Anant Chintamaneni, Pivotal's director of product management for the Pivotal HD stack, in a chat with The Register.

Though the main changes to Pivotal HD are about putting proprietary, easy-to-use data querying and analysis interfaces on top of Hadoop, Chintamaneni confirmed that initial Hadoop installations are still troublesome for some organizations.

"When a customer deploys it, our guidance always is to plan," he said. "You do have to plan exactly how you're going to lay it out. If you want to run HBase and HAWQ on same cluster you need to apportion the cluster accordingly."

The company may even step up its open-source contributions, he said, adding that it is thinking of contributing some technology to Hadoop's advanced YARN job scheduler.

Hadoop may be a darling of sophisticated engineering-heavy companies, but Pivotal is betting that by making it easier to use, it may finally be able to make some cash out of the open-source project. ®
