Storage startup busts object location barrier

Scality's Ring cycle: One Ring to find them all

File system trees are inefficient and slow at locating files in a namespace holding billions of files and folders. Storing the data as objects in a flat storage space is becoming a recommended alternative. But as soon as you go for object storage to defeat the tree-traversal problem, you face a fresh one: how do you locate your objects?

Either you have a central object map or database or you have a distributed one. French startup Scality has gone for the distributed approach with its Ring technology.

The idea is to have virtually unlimited scalability, both of I/O and storage capacity, by using clustered commodity X86 servers organised as peer-to-peer nodes – conceptually occupying a ring – with front-end Accessor software nodes receiving requests from users and applications on servers.

Scality CEO Jerome Lecat says an Accessor node can access any Ring node, note the "any", and find the right node storing a requested object in one network hop with 10 nodes, two hops with 100 nodes, and three hops with 1,000 nodes.

Scality Ring technology

Holy Trinity in Scality's Ring technology: Accessors to the left, the Ring in the middle and secondary storage to the right.

A variety of Accessor node technologies are supported: native REST HTTP, NFS, BRS2 and Zimbra.

With each 10X increase in the Ring node count, the hop count goes up by just one, thanks to Scality's patented routing algorithm. We might call this a quite peculiar Ring cycle.

Lecat said: "There are really two 'tricks' here. [First] an algorithm delivering a maximum of Log(n) complexity – which basically gives [the following for] a 100-node network: each node needs to know seven nodes, and a request may take seven hops. The minimum requirement from a mathematical standpoint is for each node to know a few other nodes. The number of nodes [to be known] increases as Log(number of nodes), which means that when the number of nodes is x10, you need to add 1 to the number of nodes to be known, or number of hops.

"[Secondly] in practice, we allow nodes to know many more nodes, but this acts as a 'non authoritative cache', and it allows for a request to 'usually' converge in two hops, while keeping all the mathematical properties of the model (Log complexity, limited number of hops, good behaviour when a node is lost or added)."
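The arithmetic in these two quotes can be checked with a back-of-the-envelope sketch. This assumes a Chord-style base-2 logarithm for the theoretical bound (which Scality has not confirmed), while the cached, "usually two hops" behaviour tracks a base-10 logarithm – one extra hop per tenfold growth:

```python
import math

def peers_to_know(n: int) -> int:
    # Theoretical minimum per Lecat: each node knows ~log2(n) peers,
    # and a lookup takes at most that many hops (Chord-style assumption)
    return math.ceil(math.log2(n))

def observed_hops(n: int) -> int:
    # With Scality's caching, hops grow as log10(n): +1 per 10x nodes
    return math.ceil(math.log10(n))

for n in (10, 100, 1000):
    print(f"{n} nodes: know ~{peers_to_know(n)} peers, ~{observed_hops(n)} hop(s)")
```

Note that `peers_to_know(100)` comes out at seven, matching Lecat's "seven nodes, seven hops" figure for a 100-node network.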

Each node can handle 10 to 50TB of storage, with 1,000 nodes supporting up to 50PB of capacity, and accessing the right object in that 50PB with three hops on a gigabit LAN takes 20ms or less.

Distributed hash table

How does that work? Scality documentation says that the Ring nodes are organised into segments. Objects are stored using a Distributed Hash Table (DHT) algorithm: each object becomes a value with an associated key. Key-value pairs are stored in the DHT, and nodes retrieve the value associated with a particular key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes. Keys embed class-of-service information, and each node is autonomous, responsible for consistency checking and for automatically rebuilding replicas for its keys.

We can think of Scality's Ring nodes as dividing up a key space between them. This is organised into a hierarchy such that a 10-node ring requires one node-to-node hop to find the target node, a 100-node ring needs two hops and a 1,000-node monster needs three hops.

Lecat says: "The key space is distributed among all the nodes. The key space is very large (20 bytes), and distributed nearly evenly, but never exactly evenly. The underlying algorithm is a distributed hash table. The 'segments' do not have a constant size (as everything has to be dynamic in the system to allow real elasticity).

"Two key properties of the key space are that keys have an order, and they are organised into a circle (which gives trigonometric properties)."
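That ordered, circular key space can be illustrated with a toy sketch. Everything here is invented for illustration – the node names, their placement, and the use of SHA-1, chosen only because it happens to produce exactly the 20-byte keys Lecat mentions:

```python
import bisect
import hashlib

KEY_SPACE = 2 ** 160  # 20-byte keys, as Lecat describes

def key_for(name: str) -> int:
    # SHA-1 yields exactly 20 bytes, the stated key width (illustration only)
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

# Hypothetical node positions; real placement is dynamic and never exactly even
node_ids = sorted(key_for(f"node-{i}") for i in range(10))

def responsible_node(key_hash: int) -> int:
    """First node ID clockwise from the key hash, wrapping round the circle."""
    i = bisect.bisect_left(node_ids, key_hash % KEY_SPACE)
    return node_ids[i % len(node_ids)]
```

Because keys are ordered, finding the responsible node is just a search for the next node ID round the circle; the modulo at the end handles the wrap from the largest ID back to the smallest.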

Let's take a 10-node Ring as an example. An Accessor sends an object retrieval request to node 1, which doesn't hold the object. We're told the object can be retrieved with one hop: a jump from node 1 to the right node. Node 1 has enough information to forward the request to the node that holds the object, and so does every other node in the 10-node ring: that's how a distributed hash table works.

Scality doesn't say in detail how this works. I think it is a variation on this concept: each node has an ID, and the nodes are organised in a ring – a doubly linked list in which each node holds a reference to, and the address of, both its predecessor and its successor on the ring. Node IDs increase as you go round the ring until you return to the starting node.

Okay? Keep that in mind and let's move on to the node receiving the request. It takes the key from the Accessor request and hashes it into a value with exactly the same number of bits as a node ID. The system treats this hash as a position on the ring and moves round it node by node, looking for the node ID that is the closest possible to the key hash while still being larger. That node should store the desired object.
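Since Scality hasn't published the detail, here is a minimal sketch of the walk just described – a successor-style lookup on a linked ring. `RingNode`, `find_owner` and `put` are invented names, and a single `next` pointer is enough to demonstrate the hop-by-hop search:

```python
class RingNode:
    """Hypothetical ring node: an ID plus a pointer to its successor."""
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.next = self   # patched up by build_ring
        self.store = {}    # key hash -> object payload

def build_ring(ids):
    nodes = [RingNode(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.next = b
    return nodes

def find_owner(start: RingNode, key_hash: int):
    """Hop round the ring until we cross the key hash: the owner is the
    first node whose ID is >= the hash, wrapping past the largest ID."""
    node, hops = start, 0
    while True:
        nxt = node.next
        if nxt is node:                       # single-node ring
            return node, hops
        wraps = node.node_id > nxt.node_id    # largest ID -> smallest ID
        if (node.node_id < key_hash <= nxt.node_id) or \
           (wraps and (key_hash > node.node_id or key_hash <= nxt.node_id)):
            return nxt, hops + 1
        node, hops = nxt, hops + 1

def put(start: RingNode, key_hash: int, payload):
    # The "reversal" mentioned below: store an object on the node its key hashes to
    owner, _ = find_owner(start, key_hash)
    owner.store[key_hash] = payload
    return owner
```

For example, on a four-node ring with IDs 10, 20, 30 and 40, a lookup for key hash 25 started at node 10 lands on node 30 after two hops, and a key hash of 45 wraps round to node 10.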

A reversal of this is used to store incoming objects on the Ring and ensure they are locatable.

Lecat said: "If a node is lost, the ring rebalances itself without human intervention. [It's the] same if a ring node is added (human intervention needed to decide to add a node), the new node is automatically placed well in the key space, and rebalances only occur when necessary and automatically."

To understand any more than this requires a computer science skill set and access to the Scality Ring designers.
