Protected? Cosy? Pffft, Reduxio prefer 'daring stupidity'

Israeli startup shows novel approach to data management

Cat in a box, image via Shutterstock
Unlike this little fella

Analysis Reduxio's array is a hybrid one with clever dedupe, and is restricted to an iSCSI interface, so let's move on. Actually let's not. Because under the covers something remarkable is going on.

Co-founder and CTO Nir Peleg explained this to us press hacks at Reduxio's Israeli HQ in Petach Tikvah yesterday. At heart, he said, client apps access persistent storage to write (store) data using tags such as file name or object id or block address, and then read the data using the same tags, or additional ones returned during the store operation.

In theory, one block of data could be accessed by different tags, the tags being just a way of getting to a place in a storage space, and there can be more than one way of getting to a place.

So suppose we had a big bucket containing all the world's data. Each unique chunk of data is stored in the bucket once and has a unique name. We could then add tags outside the bucket, referring to data chunks already in the bucket by their names.

Tags are (implicitly or explicitly) supplied by clients, and a data chunk may have multiple tags in different contexts. For example, an identical data block may appear in different volumes and at different offsets.


Incremental storage technology evolution has failed

Transfer this idea to a storage repository (bucket) of some sort, and imagine it is populated with some set of individually unique data chunks. A new chunk comes along to be stored and the controlling system checks if there is an identical one in the array. If so, use the name of the chunk in the array. Otherwise, write the chunk into the bucket and give it a new, unique name that will henceforth be used for that chunk.
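The store-or-reuse step described above can be sketched in a few lines of Python. This is an illustration only, not Reduxio's code: the `Bucket` class and the `tags` dictionary are hypothetical, and using a SHA-256 digest as the chunk's unique name is our assumption, since Peleg did not say how names are generated.

```python
import hashlib


class Bucket:
    """Hypothetical bucket that stores each unique chunk once, under a unique name."""

    def __init__(self):
        self.chunks = {}  # chunk name -> chunk bytes

    def store(self, chunk: bytes) -> str:
        # Assumption: the SHA-256 digest of the content serves as the unique name.
        name = hashlib.sha256(chunk).hexdigest()
        if name not in self.chunks:
            # No identical chunk is present yet, so write it once.
            self.chunks[name] = chunk
        # Whether new or already present, the caller gets the same name back.
        return name


bucket = Bucket()
tags = {}  # client-supplied tag (volume, offset) -> chunk name

tags[("vol1", 0)] = bucket.store(b"hello world")
tags[("vol2", 4096)] = bucket.store(b"hello world")  # identical data, different tag
tags[("vol2", 8192)] = bucket.store(b"other data")
```

Three writes leave only two chunks in the bucket, and the two tags for the identical block resolve to the same name.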

Deduplication and backdating

There is a database/index/pointer system which, given the name of a data chunk, returns its address in the bucket. So the data chunk can be moved inside the bucket, say from one tier of media to another, and its address in the bucket changes but its name does not. It can change media type and location without affecting the storage objects that refer to it.
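A rough sketch of that indirection, under our own assumptions: `ChunkIndex`, the two tier names and the slot numbering are all hypothetical, chosen only to show that a volume holding chunk names is untouched when a chunk changes tier.

```python
class ChunkIndex:
    """Hypothetical index mapping a chunk name to its current physical address."""

    def __init__(self):
        self.location = {}  # chunk name -> (tier, slot)
        self.media = {"flash": {}, "disk": {}}  # tier -> {slot: chunk bytes}
        self.next_slot = 0

    def put(self, name, chunk, tier="flash"):
        slot = self.next_slot
        self.next_slot += 1
        self.media[tier][slot] = chunk
        self.location[name] = (tier, slot)

    def get(self, name):
        tier, slot = self.location[name]
        return self.media[tier][slot]

    def move(self, name, new_tier):
        # Only the address changes; the name that volumes refer to stays the same.
        tier, slot = self.location[name]
        chunk = self.media[tier].pop(slot)
        self.media[new_tier][slot] = chunk
        self.location[name] = (new_tier, slot)


index = ChunkIndex()
index.put("chunk-a", b"payload", tier="flash")
volume = {0: "chunk-a"}  # offset -> chunk name; never touched by the move

before = index.get(volume[0])
index.move("chunk-a", "disk")
after = index.get(volume[0])
```

The volume's mapping is never rewritten, yet reads before and after the move return the same bytes.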

Such a system inherently deduplicates data; only unique data chunks are stored.


Reduxio co-founder and CTO Nir Peleg

If a timestamp is added to each chunk when it is written to the bucket, then the set of chunks could be restored to a previous time. We don't need snapshots or other point-in-time copies of the data, or a journalled file system. We can access past data at single-transaction resolution, with zero management overhead.
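One way to picture the timestamp idea is an append-only history per tag, queried "as of" a given time. The sketch below is our interpretation, not a description of Reduxio's implementation: `BackdatableVolume` is a hypothetical name, timestamps are plain integers, and writes are assumed to arrive in timestamp order.

```python
import bisect


class BackdatableVolume:
    """Hypothetical volume that keeps every (timestamp, chunk name) per offset."""

    def __init__(self):
        self.history = {}  # offset -> ([timestamps], [chunk names]), append-only

    def write(self, offset, name, ts):
        times, names = self.history.setdefault(offset, ([], []))
        # Assumption: timestamps are non-decreasing, so the lists stay sorted.
        times.append(ts)
        names.append(name)

    def read_at(self, offset, ts):
        """Return the chunk name that was current at time ts, or None."""
        times, names = self.history.get(offset, ([], []))
        i = bisect.bisect_right(times, ts)  # number of writes made at or before ts
        return names[i - 1] if i else None


vol = BackdatableVolume()
vol.write(0, "chunk-v1", ts=10)
vol.write(0, "chunk-v2", ts=20)

as_of_15 = vol.read_at(0, 15)
as_of_20 = vol.read_at(0, 20)
as_of_5 = vol.read_at(0, 5)
```

Because old entries are never overwritten, reading at an earlier time needs no separate snapshot: it is just a lookup with a different timestamp.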

All of the above can be largely implemented using key:value stores, and Reduxio has done this with its system, using the term "keystore." Keystore operations using flash media and modern x86 processors are fast enough to deliver a quick, not to say lightning-fast, data store.

The overall bucket can be implemented as a keystore instance, keeping chunk names and physical locations (think metadata store). Storage objects (files/volumes/objects) can be represented by keystore instances keeping (client supplied, explicit or implicit) tags with data chunk names (think data store). Both could be distributed among multiple nodes due to the inherent structure of the keystore.
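To make the two keystore roles and the multi-node point concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `ShardedKeystore` class, the three-node count, and the choice to place keys by hashing them are ours, and Reduxio did not describe how it distributes keys.

```python
import hashlib


class ShardedKeystore:
    """Hypothetical key:value store whose keys are spread across several nodes."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]  # each dict stands in for a node

    def _node_for(self, key):
        # Assumption: a stable hash of the key decides which node owns it.
        digest = hashlib.sha256(repr(key).encode()).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key)[key]


# One instance for the bucket: chunk name -> physical location.
bucket_store = ShardedKeystore(3)
# One instance for a storage object: client tag -> chunk name.
volume_store = ShardedKeystore(3)

bucket_store.put("chunk-a", ("flash", 17))
volume_store.put(("vol1", 0), "chunk-a")

# A read resolves tag -> name -> location, touching only the owning nodes.
location = bucket_store.get(volume_store.get(("vol1", 0)))
```

Each lookup depends only on its own key, which is what lets either store be split across nodes without a central coordinator in this toy version.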

Migration and recovery

This facilitates data migration, as the upper-level metadata store, being small, could be sent to a destination node quickly. Then the data store could be streamed, with each data object sent over when it is requested by a user on the destination system. The initial access would be across the network but subsequent ones could be local, like loading a cache.
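The cache-like behaviour can be sketched as follows. This is a toy model under stated assumptions: `DestinationNode`, the `fetch_remote` callback and the `remote_fetches` counter are hypothetical, and the "network" is just a dictionary lookup on the source side.

```python
class DestinationNode:
    """Hypothetical migration target: metadata arrives first, chunks on demand."""

    def __init__(self, metadata, fetch_remote):
        self.metadata = dict(metadata)    # tag -> chunk name, copied up front
        self.fetch_remote = fetch_remote  # callable that pulls a chunk from the source
        self.local = {}                   # chunk name -> bytes already migrated
        self.remote_fetches = 0

    def read(self, tag):
        name = self.metadata[tag]
        if name not in self.local:
            # First access crosses the network; the chunk is then kept locally.
            self.local[name] = self.fetch_remote(name)
            self.remote_fetches += 1
        return self.local[name]


source_chunks = {"chunk-a": b"payload"}
source_metadata = {("vol1", 0): "chunk-a"}

dest = DestinationNode(source_metadata, source_chunks.__getitem__)
first = dest.read(("vol1", 0))   # fetched from the source
second = dest.read(("vol1", 0))  # served locally
```

The destination can serve reads as soon as the small metadata copy lands, and each chunk crosses the network at most once.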

Reduxio supplied a system to the Barnstable Police Department in the USA. That department was hit by a ransomware attack which started encrypting all the data, an enjoyable irony. But the encryption was stopped and the data rolled back to a few seconds before the attack started, using the timestamp filter, and all the data was recovered in minutes.

Looking ahead

There's much more to explore here and Reduxio's current support of iSCSI access to its arrays is obviously going to be expanded. We understand the general development roadmap may include:

  • Scale-out storage with a set of buckets on different systems
  • Working on NVMeF
  • Looking at adding VVOLs
  • Looking to add more flash tiers
  • OpenStack Cinder driver in the next OpenStack release, with the ability to backdate Cinder volumes
  • REST API coming
  • Puppet support coming

We wouldn't be surprised if container support was coming.

There are more than 50 paying customers in production, with another 20-plus in evaluation. Sales go through the channel, with enablement by Reduxio's own sales team.

The company employs 70 people: 45 in Israel in R&D and engineering, and 25 in the USA. The global HQ is in San Francisco, and outbound marketing and sales in the USA are handled by seven offices there.

There have been two funding rounds totalling $30 million. Databases such as SQL Server, Oracle, MySQL, and SAP are supported.

We understand a leading Israeli hosting company moved from NetApp (10 years' experience) to Reduxio, going from one rack to 3U and finding server performance improved dramatically. Restore to the second works and it's getting 3:1 data reduction.

Check out Reduxio. You might be amazed by what you find there. ®




Biting the hand that feeds IT © 1998–2019