Scality commits to Big Data, puts a RING on Hadoop elephant

Also adds plug-in for OpenStack's Cinder

Object storage start-up Scality has added its storage to Hadoop so users can avoid loading data through Hadoop's own file system. It has also unveiled a plug-in for Cinder, the block storage layer within the OpenStack project.

The RING is an object storage infrastructure built from a set of x86 server nodes that store objects, rather than files or blocks, and can operate in parallel.

Scality has produced what it calls a "production-grade Hadoop storage implementation" using CDMI (Cloud Data Management Interface), the cloud storage standard developed and promoted by the SNIA. CDMI support by vendors started slowly but is picking up pace.

Scality has replaced the Hadoop Name Node server with its own metadata architecture, thereby eliminating the single point of failure in Hadoop’s architecture. The company says its Hadoop implementation enables in-place processing, with compute running on the storage node itself, and significantly reduces the need for data transfer because it can share data location with the Job Tracker.
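As a rough illustration of why sharing data location matters, the sketch below shows a locality-aware placement decision: a task is sent to an idle node that already holds the data, and only falls back to a remote node when none is free. This is a toy model, not Hadoop's or Scality's actual scheduler; the function, node and block names are all hypothetical.

```python
from typing import Dict, List, Optional


def pick_node(block_id: str,
              block_locations: Dict[str, List[str]],
              idle_nodes: List[str]) -> Optional[str]:
    """Prefer an idle node that already stores the block.

    Falls back to the first idle node (a remote read) when no
    node holding the block is free, or None when nothing is idle.
    """
    for node in block_locations.get(block_id, []):
        if node in idle_nodes:
            return node  # data-local: no transfer needed
    return idle_nodes[0] if idle_nodes else None


# Hypothetical location map, as a storage layer might report it.
locations = {"block-a": ["node1", "node3"], "block-b": ["node2"]}
idle = ["node2", "node3"]

local_choice = pick_node("block-a", locations, idle)   # a node holding block-a
remote_choice = pick_node("block-c", locations, idle)  # unknown block: any idle node
print(local_choice, remote_choice)
```

The more often the first branch is taken, the less data has to cross the network before a MapReduce task can start.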

Scality says that its RING's erasure coding means any Hadoop hardware overhead due to replication is obviated. Also "users can write and read files through a standard file system, and at the same time process the content with Hadoop, without needing to load the files through HDFS, the Hadoop Distributed File System".
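To put the overhead claim in numbers, here is a back-of-the-envelope sketch comparing the raw capacity needed under replication and under erasure coding. Three copies is the HDFS default replication factor; the 10+4 erasure-coding layout is purely an illustrative assumption, not a figure Scality has given for the RING.

```python
def raw_capacity_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored `copies` times."""
    return usable_tb * copies


def raw_capacity_erasure(usable_tb: float,
                         data_fragments: int,
                         parity_fragments: int) -> float:
    """Raw capacity needed when each object is split into data fragments
    plus parity fragments (a k+m erasure code)."""
    return usable_tb * (data_fragments + parity_fragments) / data_fragments


usable = 100.0  # TB of user data (assumed figure)

replicated = raw_capacity_replication(usable, copies=3)
erasure = raw_capacity_erasure(usable, data_fragments=10, parity_fragments=4)

print(f"3x replication:  {replicated:.0f} TB raw")
print(f"10+4 erasure:    {erasure:.0f} TB raw")
```

Under these assumptions the replicated cluster needs 300 TB of raw disk against 140 TB for the erasure-coded one, while the 10+4 layout still survives the loss of any four fragments.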

Jerome Lecat, Scality's CEO, said: "We have contributed our Hadoop solution to the CDMI community, ensuring that it can be used with any CDMI-compatible storage. … Our CDMI framework can read data directly from our scale-out file system, it is not necessary to do an HDFS ingest prior to performing a MapReduce job."

The Scality offering is compatible with, and has been tested with, Hortonworks HDP 1.0 and Cloudera CDH4; it doesn't appear that Scality is looking to replace or compete with existing Hadoop distributions. By adding a RING back end, as it were, Scality says it produces a more cost-effective, easier-to-use, more resilient and higher-performance Hadoop infrastructure, with users benefiting from Scality's SOFS (Scale-Out File System).

Lecat said: "Our angle is that we think that people will want to be able to do Hadoop job on 'normal' data, not just what they specifically prepared for Hadoop. In my mind, this is the very advantage of Hadoop, but it is killed by the fact that people need to do an HDFS ingest before any MapReduce job. Not with us anymore."

An implication is this, Lecat says: "Just imagine what you can do if you now use MapReduce – which is working on the storage nodes themselves – to do data transformation, like new encoding, as a new versions comes out. This saves a lot of processing time. It used to be necessary to move the data from storage to a server, do the transformation and then write it back on storage."

OpenStack Object Storage

OpenStack is a cloud or Infrastructure-as-a-Service (IaaS) platform based on free, open-source software that controls pools of compute, storage and networking resources in a data centre, with users self-provisioning through a portal and admin staff managing the whole caboodle through a dashboard. Rackspace and many, many other suppliers have actively and vocally supported OpenStack. Now Scality has jumped aboard the OpenStack roundabout.

Cinder is the code-name for a block storage layer in OpenStack that enables virtual machines (VMs) to discover and use persistent block volumes, and Scality has provided a RING plug-in for it. Lecat said: "This contribution enables OpenStack adopters to catch up with Amazon EBS persistent volumes for virtual machines. With the Grizzly release, OpenStack Compute will have [a] storage companion, to be deployed in high demand, cloud computing environments. It will boost the market adoption for OpenStack."

Grizzly is the next OpenStack release, scheduled for April.

Scality is not alone. Coraid has also contributed drivers for its ATA-over-Ethernet (AoE) and Coraid EtherCloud to the OpenStack Cinder block storage open source project so OpenStackers can use its storage arrays for block storage. All-flash cloud storage array startup SolidFire has done the same, and it has been involved in Project Cinder for several years now. Coraid claims legacy storage providers like NetApp, EMC, HP, and Dell only have partially completed functions in their OpenStack drivers, and it has joined the OpenStack community as a corporate sponsor.

The RING deal for OpenStack offers a POSIX file interface via a Scale Out File System (SOFS) package. Scality states:

The Cinder integration is built on Scality’s … distributed sparse file technology embedded in SOFS. Each Cinder volume is effectively a file inside Scality scale-out storage. This ensures easy management, seamless scalability and enables advanced virtualisation features such as live migrations of virtual machines and instant failover in case of compute node hardware failure.
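The "volume as a sparse file" idea can be sketched with ordinary POSIX file calls. The example below creates a file with a fixed logical size but no data written, then reads and writes at byte offsets the way a block device would be addressed. It is a minimal local illustration only, not Scality's SOFS or the actual Cinder driver; the file name, size and helper functions are hypothetical.

```python
import os
import tempfile

VOLUME_SIZE = 64 * 1024 ** 2  # 64 MiB logical size (assumed for the demo)


def create_sparse_volume(path: str, size_bytes: int) -> None:
    """Create a file with the given logical size without writing any data.

    On filesystems that support sparse files, unwritten regions
    take up no disk space and read back as zeros.
    """
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def write_block(path: str, offset: int, data: bytes) -> None:
    """Write data at a byte offset, as a block device write would."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def read_block(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at a byte offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


workdir = tempfile.mkdtemp()
volume_path = os.path.join(workdir, "volume-0001.img")  # hypothetical name

create_sparse_volume(volume_path, VOLUME_SIZE)
write_block(volume_path, 4096, b"hello")

logical_size = os.stat(volume_path).st_size
print(logical_size, read_block(volume_path, 4096, 5))
```

Because the volume is just a file, anything that can reach the file system can reach the volume, which is what makes features such as live migration straightforward when the file system is shared across compute nodes.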

Philippe Nicolas, Scality's Director of Product Strategy, said: “This block storage interface completes our Unified Storage strategy. Scality is one of the first players to actually deliver on the promise of true and complete unified storage access, including object, file and now block.”

Scality’s Cinder integration will be available with OpenStack’s Grizzly release. ®
