
Dell's dedupe story still unfolding

Ocarina rocket scientists look to crack the block


Comment Dell's spreading of Ocarina dedupe goodness across its storage platforms has to overcome several obstacles, none of which are show-stoppers.

Ocarina deduplicates files – or more accurately, optimises and compresses files – and is content-aware so that it can work its specific magic on JPEGs and cast different spells on PACS images. It does not dedupe blocks and therein lies the rub.

Dell wants to layer its Ocarina dedupe across its storage estate at both file and block level, and not exclude the Windows server-based DL disk-to-disk backup systems that run either CommVault or Symantec software. Darren Thomas, Dell storage VP and general manager, is confident this is all possible.

File dedupe

The file need is easiest, being Ocarina's home ground so to speak. Dell has already announced its Dell Scalable Filesystem (DSFS) for PowerVault with the NX3500, and for EqualLogic with the FS7500, in the form of NAS heads. One is coming for Compellent and another for the DX6000, the object storage box. Dell will add the Ocarina dedupe technology to that box and it immediately becomes available for all the storage platforms under the DSFS head; the timing is the same.

Thomas said: "Think of it like a RAID feature. It's a dedupe feature for a file system."

That does not include the DL arrays, though. Dell understands that it is desirable that file data being backed up from DSFS-headed arrays should not have to be rehydrated before being backed up to the CommVault or Symantec DL products, but neither supplier's software supports Ocarina.

We understand that CommVault is thinking about how to import Ocarina-ised information and avoid the re-hydration. Symantec, according to our research, understands the desirability of avoiding the rehydration but is not so far advanced as CommVault in talking to Dell about the issue.

Darren Thomas said Dell represents 20 to 25 per cent of CommVault's revenues, and we can imagine that this gives Dell people significant mindshare with, and fast access to, senior people in CommVault. Such access is not so attainable with the much larger Symantec.

Darren Thomas said there is also a need for these two DL software suppliers to be able to export (restore) the backed-up data to third-party systems that are enabled to read the Ocarina-ised data. These systems would need to have Ocarina Reader software, which is a relatively small piece of code and well within Dell's power to distribute.

Block dedupe

Blocks are different. Blocks are difficult. The size of groups of blocks, called chunks or pages, varies between the Dell storage platforms. On EqualLogic systems a page is 15MB, whereas on Compellent's it varies and the 64-bit StorageCenter O/S will track at the block level. A block is not a complete file, although the storage O/S can in principle be queried about which blocks make up which files. Having files striped across drives increases the fragmentation a block-level deduper of primary storage has to deal with.

The larger the page or chunk size, the higher the probability of finding duplicated data within it.

Dell has its Ocarina dedupe rocket scientists working on this. These are the people who developed the original Ocarina algorithms for compressing data other deduplication technologies could not touch, the various image file formats for example. They are developing algorithms to find and remove duplicated data in the pages or chunks, and also to recover the released space. It's no good rewriting the 15MB page with 3MB of empty space in it. Darren Thomas said: "If you compress 15MB of data to 12MB then you have to recover the space. Maybe this will mean concatenating compressed pages."

As we understand it, you would read in pages, dedupe them, and then write them back to the disks as a continuous stream, with the array software breaking this stream up into pages again.
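To make the space-recovery point concrete, here is a minimal Python sketch of that read, shrink, concatenate and re-page cycle. It is an illustration under loud assumptions, not Dell's or Ocarina's method: zlib compression stands in for whatever the Ocarina algorithms do, the page size is a scaled-down stand-in for the 15MB page, and the names `split_pages`, `repack` and `unpack` plus the 4-byte length prefix are invented for the example.

```python
import zlib

# Assumption: a toy page size standing in for the 15MB page, scaled down for the demo.
PAGE_SIZE = 15 * 1024


def split_pages(stream: bytes, page_size: int = PAGE_SIZE) -> list[bytes]:
    """Break a byte stream into fixed-size pages; the last page may be short."""
    return [stream[i:i + page_size] for i in range(0, len(stream), page_size)]


def repack(pages: list[bytes], page_size: int = PAGE_SIZE) -> list[bytes]:
    """Shrink each page, concatenate the results, then cut the stream into pages again."""
    stream = bytearray()
    for page in pages:
        packed = zlib.compress(page)  # stand-in for the real data-reduction step
        # Hypothetical framing: a 4-byte length prefix marks where each packed page ends.
        stream += len(packed).to_bytes(4, "big") + packed
    return split_pages(bytes(stream), page_size)


def unpack(packed_pages: list[bytes]) -> list[bytes]:
    """Reverse repack(): rejoin the stream and expand each length-prefixed page."""
    stream = b"".join(packed_pages)
    pages, pos = [], 0
    while pos < len(stream):
        size = int.from_bytes(stream[pos:pos + 4], "big")
        pos += 4
        pages.append(zlib.decompress(stream[pos:pos + size]))
        pos += size
    return pages


# Highly repetitive sample data, so the shrink step has something to remove.
original = split_pages(b"".join(bytes([i % 7]) * 1000 for i in range(150)))
packed = repack(original)
print(len(original), "pages before,", len(packed), "pages after")
```

Shrinking each page in place would leave every page partly empty without freeing any of them; only the concatenation step lets whole pages be handed back to the array.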

Once the deduplication detection and space recovery algorithms have been created, Thomas said: "We'll build it into the operating systems of EqualLogic and Compellent. At that point they become separate pieces of work."

The time frame for this effort is not clear-cut. Dell is confident that it can achieve the result it wants, and we think we should start seeing results in 12 months or so, if not before. El Reg thinks the file level Ocarina dedupe could start showing up by the end of the year.

You get the feeling Dell is very happy to have put the old days behind it, when people were asking if Dell was really an innovative company. It has a substantial chunk of its own IP and is energetically developing it. Soon, no doubt, we'll be hearing about Dell's patent portfolio, and the research scientists at Ocarina will have contributed their part to it. ®
