Hadoop for joy? ODPi 2.0's available, but questions persist
Welcome focus on Hive standards... but how many more components to go?
The Hadoop standardisation body, ODPi, has today made its 2.0 release generally available.
The second iteration of the Hadoop interoperability program from the body, formerly the Open Data Platform initiative and now known only as ODPi, includes updated runtime specifications. Where its previous release had established standards for core parts of Hadoop – including YARN, MapReduce, and HDFS – it now also supports Apache Hive and Hadoop Compatible File System components.
Additionally, the ODPi Operations Specification 1.0 is released, providing guidelines for application management tools with Ambari as a reference platform.
The idea is that by “providing common expectations in guidelines, developers are able to create data-driven applications for all management tools used by platform providers” while minimising the “complexity, cost and training needed to build big data applications”.
Intended to standardise the development model for those flogging big data products and related applications, the initiative has continued to be a source of contention for those working in the Hadoop space, with Cloudera and MapR criticising the consortium of companies behind the project. Other critics claim Hadoop vendor Hortonworks appears to benefit most from the project rather than the ecosystem as a whole.
Talking to The Register, Charaka Goonatilake, CTO at Panaseer, which offers Hadoop-based security analytics software, said he admired "the community effort to drive standardisation as it's sorely needed in the heavily fragmented Hadoop industry."
However, Goonatilake added: "In the absence of support from the Hadoop powerhouses, Cloudera and MapR, I believe it'll ultimately be a futile initiative."
Talking to The Register after announcing the involvement of IBM, WanDisco and DataTorrent, the ODPi's John Mertic said that, whatever questions were raised when the initiative first launched, the changes in its governance and direction have made it more likely that Cloudera and MapR will reconsider their refusal to be involved.
There are other issues, however, as Goonatilake told us:
"ODPi currently doesn't go broad enough (the Hadoop ecosystem consists of dozens of projects but ODPi only addresses a handful of these) nor does it go deep enough (a single software patch can be the difference between make or break for an application running across different Hadoop distributions).
"As a Hadoop application vendor, ODPi, in its current form, is not going to help me write and test my product once and be confident it's going to interoperate across the major Hadoop distributions."
ODPi membership now includes more than 30 companies. "ODPi's work to ensure interoperability of applications across a wide range of commercial Hadoop platforms is gaining momentum thanks to ongoing membership growth," said John Mertic, director of program management, ODPi. ®