EMC grabs tutu, slips on ballet shoes, performs Tchaikovsky's SAN Lake

When you need to soak up petabytes of data, maybe you should splash out on this kit

EMC is getting into big data hydraulic engineering with its Federation Business Data Lake, calling it an engineered system integrating components from EMC II, Pivotal, VCE and VMware.

This is not an appliance – the 1PB version takes at least seven days* to implement – so you can assume it’s not cheap either. It packs big data storage, analytics processing, and an analytics toolbox into a complex eight-layered product suite, built from gear supplied by four EMC federation members.

EMC Federation Business Data Lake diagram. The servers are in the Vblocks. Isilon provides the storage foundation.

We’re told that Federation Business Data Lake (FBDL) components include “EMC storage, VMware vSphere running on Vblocks, the Pivotal Big Data Suite and Pivotal Cloud Foundry. … [It] supports a robust ecosystem of third-party analytics products including Tableau and SAS.” There are “predefined analytics use cases with automated provisioning and configuration.”

The analytics layer is the Pivotal Big Data Suite, including PivotalHD, featuring the HAWQ SQL-on-Hadoop engine. It also provides enterprise-class SQL, enabling integration and interoperability with analytics platforms such as SAS and Tableau, over data stored in Hadoop.

Two forthcoming FBDL configurations will let customers use Cloudera or Hortonworks software if they wish. Any future Open Data Platform-based Hadoop distribution will also be supported.

Cloudera says EMC, with this FBDL announcement, is helping to make the Hadoop ecosystem enterprise ready.

In EMC’s view “a Business Data Lake contains structured and unstructured data from a wide variety of sources and the analytics are focused on building models to predict the future.”

EMC says there is a roadmap for FBDL development, and by the end of the year it should have added:

  • Data ingest.
  • Indexing for data in and outside the Data Lake.
  • Information governance to enforce policies, such as which users can access specific data sets.
  • Self-service capabilities for analysts to build an environment and select relevant data sets.

A set of services accompanies FBDL, covering technology on-boarding, a proof-of-value service to model ROI, a big data Vision workshop, and general big data education.

FBDL, with Pivotal, Hortonworks or Cloudera, will have limited (directed) availability in April. No pricing details were available but El Reg suggests you start your thinking at half a million bucks and gaze upwards. ®

* Those seven days include deployment of converged infrastructure, Hadoop, structured data, and real-time analytics tools so data can be analyzed.
