Ex-Sun Micro CTO reveals Greenbytes 'world-beating' dedupe
He IS on advisory board of flash cache latency smasher, but...
El Reg has managed to take a peek at an as-yet-unpublished white paper, written by former Sun Microsystems CTO Randall Chalfant, which claims that storage company Greenbytes' deduplication tech has near-zero latency and is possibly the world's fastest inline deduplication.
The technology works with 4K blocks of data. According to Chalfant, a lecturer who sits on the company's industry advisory board, this is the sequence of events once data arrives at a Greenbytes system:
- The Greenbytes system receives the data and stores it in a write log held on one or more solid-state drives. A write acknowledgement is then returned to the client, so its OS and application can carry on without waiting for the data write to complete.
- Greenbytes' software specifies three stages of data input: the open stage, the quiescing stage [rendering inactive], and the synching stage. In the open stage, the client is free to write as much data as possible into the memory buffer. Then, every few seconds, a snapshot is taken to freeze and then quiesce [disable] the buffer, in preparation for writing data to disk. During the synch stage, a 256-bit hash is computed for each data block.
- The hash is stored in a d-cache (deduplication cache), an assembly of one or more solid-state drives that can be extended easily over time to increase the size of the storage system. The d-cache only holds the dedupe search tables and has a fixed data access latency.
- Greenbytes' technology uses the hashes to determine, in virtually constant time, whether there is a block match in the storage system. It calls its look-up algorithm a probabilistic constant-time search.
- Computed hashes are looked up to see if they already exist in the d-cache. The d-cache returns an answer in constant time and, if there is no match, a new 4K block of data is written to storage. If there is a match, a pointer gets written instead.
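The hash-and-look-up steps above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not Greenbytes' code: the `DedupeStore` class and its fields are hypothetical names, SHA-256 stands in for the unnamed 256-bit hash, and an in-memory dict stands in for the SSD-based d-cache.

```python
import hashlib

BLOCK_SIZE = 4096  # the article says the system works with 4K blocks


class DedupeStore:
    """Toy inline-dedupe store (hypothetical; a dict plays the d-cache)."""

    def __init__(self):
        self.d_cache = {}  # block hash -> index of the stored unique block
        self.blocks = []   # unique blocks, standing in for disk storage
        self.volume = []   # logical volume: one pointer per block written

    def write(self, block: bytes) -> bool:
        """Store one block; return True if it turned out to be a duplicate."""
        # Assumption: SHA-256 as the 256-bit hash; the article names none.
        digest = hashlib.sha256(block).digest()
        index = self.d_cache.get(digest)  # average constant-time look-up
        duplicate = index is not None
        if not duplicate:
            # No match: write the new block and record its hash.
            index = len(self.blocks)
            self.blocks.append(block)
            self.d_cache[digest] = index
        # Match or not, the logical volume only ever holds a pointer.
        self.volume.append(index)
        return duplicate

    def read(self, position: int) -> bytes:
        """Follow the pointer at a logical position back to its block."""
        return self.blocks[self.volume[position]]
```

Writing the same 4K block twice leaves one copy in `blocks` and two pointers in `volume`, which is the saving the white paper is describing.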
To add more detail here, Greenbytes' CTO and founder Bob Petrocelli says:
The width of the hash is actually tunable. We currently allow 128 bit, 192 bit and 256 bit hashes. The default is 256 bits. One of the patent claims deals with the searching approach using hashes. ...
The important point is to realise that the write-coalescing and the actual determination of which blocks to write happens during the transactional phase of the pipeline. There are a lot of complex considerations during this phase.
For example, a block that is overwritten many times, say by an application log etc, will only be written once ([using the] final state of the block, as all block writes are collapsed). A temporary in-memory AVL tree is used for this write coalescing.
The system is zero latency because we are able to [acknowledge] the write immediately, protected by the intent log, and then only later during the transactional phase do we absorb the cost of de-duplication. When we have any duplicate data in the stream, we come out ahead of the game because we end up committing less data to disk.
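Petrocelli's description of acknowledging first and coalescing later can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: `WriteBuffer` and its fields are hypothetical names, a list stands in for the SSD intent log, and a plain dict sorted at sync time stands in for the in-memory AVL tree, since Python's standard library has no AVL tree.

```python
class WriteBuffer:
    """Toy write path: acknowledge at once, coalesce overwrites at sync."""

    def __init__(self):
        self.intent_log = []  # stands in for the SSD write log
        self.pending = {}     # block address -> latest data (AVL stand-in)
        self.committed = {}   # block address -> data, standing in for disk
        self.disk_writes = 0  # counts blocks actually committed

    def write(self, address: int, data: bytes) -> str:
        """Open stage: log the write, remember the newest data, ack now."""
        self.intent_log.append((address, data))
        self.pending[address] = data  # an overwrite replaces the old entry
        return "ack"

    def sync(self) -> int:
        """Freeze the buffer and commit only each block's final state."""
        frozen, self.pending = self.pending, {}
        for address in sorted(frozen):  # ordered walk, as a tree would give
            self.committed[address] = frozen[address]
            self.disk_writes += 1
        self.intent_log.clear()  # logged writes are now safely committed
        return len(frozen)
```

If a client overwrites block 7 five times between syncs, it gets five immediate acknowledgements but `sync()` commits that block once, with its final contents. A real system would run deduplication on the frozen set at this point; the sketch leaves that step out.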
CEO and chairman Steve O'Donnell added: "The searching process uses small parts (64 bits) of the hash to rapidly determine the likelihood of the need to write a block, this dramatically reduces the amount of RAM needed to store the hash and enables the tiny footprint that vIO [Virtual Desktop software] uses inside the Hypervisor."
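O'Donnell does not spell out how the 64-bit search works, so the following Python sketch is only one plausible reading, with every name hypothetical. It keeps just the first 64 bits of each hash in RAM as a quick filter and consults the full hash table (a set standing in for the SSD d-cache) only when the short prefix matches. SHA-256 is again an assumed stand-in for the hash function.

```python
import hashlib

PREFIX_BYTES = 8  # 64 bits of a 256-bit hash, a quarter of the full width


class PrefixFilter:
    """Toy two-level look-up: 64-bit prefixes in RAM, full hashes elsewhere."""

    def __init__(self):
        self.ram_prefixes = set()  # small in-RAM table of 64-bit prefixes
        self.full_hashes = set()   # stands in for the full table on SSD
        self.full_lookups = 0      # how often the slower table was consulted

    def is_duplicate(self, block: bytes) -> bool:
        """Return True if this block has been seen before; record it if not."""
        digest = hashlib.sha256(block).digest()  # assumed hash function
        prefix = digest[:PREFIX_BYTES]
        if prefix not in self.ram_prefixes:
            # Unknown prefix: the block is certainly new, so it must be
            # written, and no full-table look-up is needed.
            self.ram_prefixes.add(prefix)
            self.full_hashes.add(digest)
            return False
        # Known prefix: probably a duplicate, but confirm with the full hash.
        self.full_lookups += 1
        if digest in self.full_hashes:
            return True
        self.full_hashes.add(digest)  # rare case: prefixes collided
        return False
```

Holding 8 bytes per block in RAM instead of 32 cuts the in-memory table to a quarter of the size, which is consistent with the small-footprint claim, although the actual vIO data structures are not described in the source.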
Greenbytes has protected its dedupe software with many patents and will defend its patents using legal eagles. In fact it has already done so.
Back in 2009, Sun Microsystems, which at the time was being acquired by Oracle, sued Greenbytes for infringing its deduplication patents, after Greenbytes claimed Sun had used Greenbytes' own deduplication scheme.
By 2010 this legal dispute was settled and Greenbytes continued to sell and develop its deduplication technology.
The Oracle/Sun ZFS deduplication technology seems not to have been much developed since then. ®