Business intelligence startup tarts up Hadoop for managers

Platfora puts lipstick on the elephant

For the world's most lauded open source data platform, Hadoop is remarkably difficult to use, so Tuesday brings another company slinging a tool that entices managers and analysts into fiddling with the elephant.

This time it's analytics startup Platfora with the general release of its in-memory business intelligence layer atop Hadoop. Unlike rival BI engines, Platfora lets you interrogate your Hadoop-stored data via a graphical user interface – no need for terminal here, folks*.

Platfora is an "exploratory BI interface in the spirit of Tableau, Spotfire [but] built natively for the [Hadoop] stack. ... the primary interface is definitely a visual way of working with data," Platfora chief and former head of products for EMC Greenplum Ben Werther told The Register.

The company's plan to make Hadoop as easy to query as possible has struck a chord with the venture capital community, which smelled money and pumped $20m into the company in November 2012.

Its GUI-heavy approach stands out from other methods of interrogating HDFS. Alternative tools designed to make the obtuse platform accessible work by layering a SQL engine on top of Hadoop (Concurrent, EMC/Greenplum's HAWQ), by making do with the worthy-but-clumsy Hive (Intel), or by pulling the data into another, friendlier analytics system, such as ParAccel.

Though these systems can be useful – and in the case of Cloudera's query layer Impala or EMC/Greenplum's HAWQ, much faster – they lack the ease-of-use features of Platfora, Werther says.

Platfora can also be accessed via SQL-like and JSON-like APIs, but this is not the priority, he said.

The technology also competes with standard BI tools such as Tableau, Qlikview, and Tibco Spotfire. "These are all fine solutions in a traditional SQL world," Werther says. "They claim they want to be Hadoop and work in a Hadoop world, but they don't have any of the architecture necessary to make this a first-class experience."

Platfora integrates directly with Hadoop, so companies do not need to suck the data into another ETL or data warehouse, he explained.

The technology has three layers – the web-based exploratory BI layer, a scale-out columnar-compressed in-memory engine, and the Hadoop data refinery which runs MapReduce jobs across HDFS data.

Platfora works by grabbing samples of data from HDFS to create a catalog that can be accessed via the web GUI. The system can handle delimited data, Avro JSON, log records, regex-parseable data, and "other formats," Werther said. When users select the particular data they want to analyse, the system plans a series of MapReduce jobs, which run automatically, to spew data into a partitioned, columnar-compressed dimensional data mart that Platfora calls a "lens". When this is done, the resultant blocks of data are pulled into the Platfora nodes and triple-replicated across disks for redundancy. Then, when a user makes a query, the pieces are pulled into memory.
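For readers who want a feel for what a "partitioned, columnar dimensional data mart" means in practice, here is a toy sketch in Python. It is not Platfora's code or API: the function name `build_lens`, the sample rows, and the choice of a simple sum as the aggregate are all assumptions made for illustration, and a real system would run the map and reduce steps as distributed jobs over HDFS rather than in a single loop.

```python
from collections import defaultdict


def build_lens(rows, dimensions, measure, partition_by):
    """Aggregate raw rows into a partitioned, column-oriented summary.

    Hypothetical helper for illustration only; not a Platfora API.
    """
    # "Map" step: derive a key from the chosen dimensions of each row.
    # "Reduce" step: sum the measure for every distinct key.
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]

    # Store the result column by column, one partition per value of
    # the partitioning dimension.
    part_index = dimensions.index(partition_by)
    lens = {}
    for key, total in sorted(totals.items()):
        columns = lens.setdefault(
            key[part_index], {name: [] for name in (*dimensions, measure)}
        )
        for name, value in zip(dimensions, key):
            columns[name].append(value)
        columns[measure].append(total)
    return lens


# Made-up sample records standing in for data sampled from HDFS.
rows = [
    {"region": "EU", "product": "A", "sales": 10.0},
    {"region": "EU", "product": "A", "sales": 5.0},
    {"region": "US", "product": "B", "sales": 7.0},
]
lens = build_lens(rows, ["region", "product"], "sales", "region")
```

A query that only touches one region and one column, say EU sales, then needs to load just `lens["EU"]["sales"]` into memory, which is the general appeal of partitioned columnar layouts.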

Perhaps the technology most similar to Platfora is SAP HANA, with both companies having the same belief about analytics – if you can, do it from memory. However, SAP is focused on bridging SAP transactional data and keeping all of it in memory, Werther said, while Platfora is more about providing a way to interface with a massive pool of HDFS data and selectively load it into memory.

The company has no special plans for an intermediary storage layer, like flash, Werther said. Pricing is done on a per-node basis, but was not disclosed.

There's a feeling brewing among users and developers that big-data tools cost too much and do too little, probably emanating from the eye-watering salaries needed to support Hadoop-whisperers and the fact that although these people may speak HDFS, they might not be the best at designing queries for it. Platfora's strategy of making money by prettying-up Hadoop is representative of the overall big-data industry, which is waking up to the fact that if HDFS truly is becoming the all-purpose storage format for ingested data, then there's money to be made by designing tools to let more people analyse it. ®

*Bootnote

This begs the question as to how easy-to-use a data analysis system needs to be – after all, nothing is more dangerous for an organization than the pointy-haired denizens of the upper floors suddenly being able to query all stored data and develop opinions about what the business should really be doing, right?
