
The man who found Atlantis: 14PB of storage, flashy models and Moore's Law

Chetan Venkatesh talks business

Give us the lowdown, how does your product work?

WTH: On a technical level, say I have an application running on a host and it sends a write towards the memory. How does the write get replicated and when does it get sent to the local storage, SAN or NAS?

CT: We build an in-memory de-duplication engine, so everything we do for replication is tied to de-duplication. We de-duplicate first to minimise the amount of data that needs to be replicated. At least three copies are placed within a cluster. In addition, the software takes the metadata belonging to the de-duplication cases and puts it in as many places as possible.

Say there is an eight-node cluster. When you have three nodes available you are fully protected and have complete information about everything available. Additionally, the remaining five nodes are used to place metadata copies, so that you get better redundancy and parallel performance.
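The placement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Atlantis code: the function names, the use of SHA-256 as the block fingerprint and the choice of the first three nodes as data holders are all hypothetical. The interview only says that data is de-duplicated first, that at least three copies are kept, and that the remaining nodes hold metadata copies.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # Assumption: SHA-256 identifies a block; the interview names no hash.
    return hashlib.sha256(block).hexdigest()


def place_write(block, nodes, seen, data_copies=3):
    """Return (data_nodes, metadata_nodes) for one incoming block.

    data_nodes is empty for a duplicate block, because only new
    content has to be replicated.
    """
    if len(nodes) < data_copies:
        raise ValueError("cluster is smaller than the number of data copies")
    fp = fingerprint(block)
    if fp in seen:
        data_nodes = []                   # duplicate: nothing new to copy
    else:
        seen.add(fp)
        data_nodes = nodes[:data_copies]  # hypothetical choice of holders
    metadata_nodes = nodes[data_copies:]  # remaining nodes hold metadata copies
    return data_nodes, metadata_nodes


cluster = [f"node{i}" for i in range(1, 9)]  # the eight-node example
seen = set()
first = place_write(b"A" * 4096, cluster, seen)
repeat = place_write(b"A" * 4096, cluster, seen)
print(len(first[0]), "data copies,", len(first[1]), "metadata copies")
print(len(repeat[0]), "data copies for the repeated block")
```

In the eight-node example this places three data copies and five metadata copies for the first write, and no new data copies when the same block is written again.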

A lot of the data placement is defined by the type of storage volume you want to create. Depending on the storage model you have chosen, the software is able to take physical storage assets and create a storage pool out of that. It does the same with RAM.

WTH: How does the write sequence differ between the hybrid and all-flash storage model?

CT: That depends on the storage model the customer has selected. For the hybrid model, the acknowledgement is not based on whether the write has been sent to the physical layer; it relies on a defined minimum of three nodes acknowledging that they received the information. As soon as that minimum number of nodes has acknowledged, the write is done, so USX does not have to wait for the underlying storage.

That way you can abstract the storage underneath from a performance standpoint and make use of the memory performance, while maintaining good data protection.

If we’re using the all-flash model, then essentially USX waits for the flash tier to acknowledge before it acknowledges back to the application. Depending on the storage model you use, you have fine-grained control over how writes are acknowledged.

WTH: So using the all-flash model, USX will wait for the underlying all-flash array to acknowledge, instead of waiting for peers left and right of the node?

CT: Yes. In all-flash you actually have two models. The first is when you have an all-flash array in the back, like a fast Pure Storage or Violin Memory array. In that case it is very simple: we de-duplicate and compress the data, dump it on the all-flash array and acknowledge it as soon as the flash tier has accepted it. This is actually very fast; the latencies are very low.

In the second all-flash model, you have very fast locally installed PCIe-connected flash or Diablo Technologies MCS. In that case it reverts to the cluster model: as long as a minimum number of nodes have acknowledged that they received the I/O, USX acknowledges the I/O back.
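The write-acknowledgement rules described in this exchange can be summarised in a short Python sketch. It is a simplification under assumptions, not the USX implementation: the `AckPolicy` names and the `can_acknowledge` function are hypothetical, and the default of three nodes comes from the hybrid case mentioned earlier (the interview gives no number for the local-flash case).

```python
from enum import Enum


class AckPolicy(Enum):
    # Hybrid model, or all-flash with local PCIe flash: peers confirm receipt.
    NODE_QUORUM = "node_quorum"
    # All-flash model with an array in the back: the flash tier confirms.
    FLASH_TIER = "flash_tier"


def can_acknowledge(policy, node_acks, flash_acked, min_nodes=3):
    """Decide whether a write may be acknowledged to the application.

    node_acks   -- number of cluster nodes that have confirmed the write
    flash_acked -- True once the flash tier has accepted the write
    min_nodes   -- assumed quorum size (three in the hybrid example)
    """
    if policy is AckPolicy.NODE_QUORUM:
        return node_acks >= min_nodes
    if policy is AckPolicy.FLASH_TIER:
        return flash_acked
    raise ValueError(f"unknown policy: {policy!r}")


# Hybrid: two confirmations are not enough, three are.
print(can_acknowledge(AckPolicy.NODE_QUORUM, node_acks=2, flash_acked=False))
print(can_acknowledge(AckPolicy.NODE_QUORUM, node_acks=3, flash_acked=False))
# All-flash array: peer confirmations do not matter, the array does.
print(can_acknowledge(AckPolicy.FLASH_TIER, node_acks=3, flash_acked=False))
print(can_acknowledge(AckPolicy.FLASH_TIER, node_acks=0, flash_acked=True))
```

The point of the sketch is that the caller never waits on both conditions at once: the storage model selects a single condition, and the write is acknowledged as soon as that condition holds.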

It's not very non-techie friendly, is it?

WTH: Ok, so depending on the precise storage configuration there are various write-acknowledgement policies available. That could get complicated.

CT: The main take-away for customers is that they don’t need to worry about the details: the policies adapt themselves to the situation. Your data is always protected.

WTH: I could imagine customers wanting to speak with Atlantis consultants before they decide which physical storage configuration to use and which USX storage model to run.

CT: The rule of thumb for choosing a mode is really this: if you need a mix of performance and capacity, then build a hybrid. If you don’t care about capacity, then use in-memory mode. If you want a fast form of persistent storage, then use the all-flash mode. We have written down a lot about these models that customers can use.
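As a memory aid, that rule of thumb fits in a small lookup table. The priority labels and the `suggest_model` helper below are hypothetical names chosen for this sketch; only the three model names and the reasoning behind them come from the interview.

```python
# Hypothetical priority labels mapped to the models named in the interview.
RULE_OF_THUMB = {
    "performance_and_capacity": "hybrid",
    "performance_without_capacity": "in-memory",
    "fast_persistent_storage": "all-flash",
}


def suggest_model(priority: str) -> str:
    """Return the storage model the rule of thumb points to."""
    try:
        return RULE_OF_THUMB[priority]
    except KeyError:
        raise ValueError(f"no rule of thumb for priority {priority!r}") from None


print(suggest_model("performance_and_capacity"))  # hybrid
```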
