Dell's dedupe story still unfolding

Ocarina rocket scientists look to crack the block

Comment Dell's spreading of Ocarina dedupe goodness across its storage platforms has to overcome several obstacles, none of which are show-stoppers.

Ocarina deduplicates files – or more accurately, optimises and compresses files – and is content-aware so that it can work its specific magic on JPEGs and cast different spells on PACS images. It does not dedupe blocks and therein lies the rub.

Dell wants to layer its Ocarina dedupe across its storage estate at both file and block level, and not exclude the Windows server-based DL disk-to-disk backup systems that run either CommVault or Symantec software. Darren Thomas, Dell storage VP and general manager, is confident this is all possible.

File dedupe

The file need is the easiest to meet, being Ocarina's home ground, so to speak. Dell has already announced its Dell Scalable Filesystem (DSFS) in the form of NAS heads: the NX3500 for PowerVault and the FS7500 for EqualLogic. One is coming for Compellent and another for the DX6000, the object storage box. Dell will add the Ocarina dedupe technology to that box and it immediately becomes available for all the storage platforms under the DSFS head; the timing is the same for all of them.

Thomas said: "Think of it like a RAID feature. It's a dedupe feature for a file system."

That does not include the DL arrays, though. Dell understands that it is desirable for file data being backed up from DSFS-headed arrays not to have to be rehydrated before being backed up to the CommVault or Symantec DL products, but neither supplier's software supports Ocarina.

We understand that CommVault is thinking about how to import Ocarina-ised information and avoid the rehydration. Symantec, according to our research, understands the desirability of avoiding the rehydration but is not as far advanced as CommVault in talking to Dell about the issue.

Darren Thomas said Dell represents 20 to 25 per cent of CommVault's revenues, and we can imagine that this gives Dell people significant mindshare among, and fast access to, senior people in CommVault. Such access is not so attainable with the much larger Symantec.

Darren Thomas said there is also a need for these two DL software suppliers to be able to export (restore) the backed-up data to other third-party systems that are enabled to read the Ocarina-ised data. These systems would need to have Ocarina Reader software, which is a relatively small piece of code and well within Dell's power to distribute.

Block dedupe

Blocks are different. Blocks are difficult. The size of groups of blocks (chunks or pages) varies between the Dell storage platforms. On EqualLogic systems a page is 15MB, whereas on Compellent systems it varies and the 64-bit StorageCenter O/S will track at the block level. A block is not a complete file, although the storage O/S can in principle be queried about which blocks make up which files. Having files striped across drives increases the fragmentation a block-level deduper of primary storage has to deal with.

The larger the page or chunk size, the higher the probability of finding duplicated data within it.

Dell has its Ocarina dedupe rocket scientists working on this. These are the people who developed the original Ocarina algorithms for compressing data that other deduplication technologies could not touch, such as the various image file formats. They are developing algorithms to find and remove duplicated data in the pages or chunks, and also to recover the released space. It's no good rewriting the 15MB page with 3MB of empty space in it. Darren Thomas said: "If you compress 15MB of data to 12MB then you have to recover the space. Maybe this will mean concatenating compressed pages."

As we understand it, you would read in pages, dedupe them, and then write them back to the disks as a continuous stream, with the array software breaking this stream up into pages again.
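
To make that cycle concrete, here is a minimal Python sketch of reading pages, shrinking them, concatenating the results and cutting the stream into pages again. It is an illustration under stated assumptions, not Dell's or Ocarina's method: zlib stands in for whatever data reduction algorithm Dell is developing, the function names repack_pages and unpack_pages are hypothetical, and the 4-byte length prefix is just one possible way to find the original page boundaries again on read.

```python
import struct
import zlib


def repack_pages(pages, page_size):
    """Shrink each page, join the results into one stream, then re-page it.

    zlib is only a stand-in for the real data reduction step (assumption).
    """
    stream = bytearray()
    for page in pages:
        blob = zlib.compress(page)
        # Hypothetical framing: a 4-byte big-endian length before each blob.
        stream += struct.pack(">I", len(blob)) + blob
    # Cut the continuous stream into pages; only the last one may be short.
    return [bytes(stream[i:i + page_size])
            for i in range(0, len(stream), page_size)]


def unpack_pages(packed):
    """Rebuild the original pages from the repacked ones."""
    stream = b"".join(packed)
    pages, offset = [], 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        pages.append(zlib.decompress(stream[offset:offset + length]))
        offset += length
    return pages
```

The point of concatenating before re-paging is the one Thomas makes: if each shrunken page were written back into its own page slot, the freed space would be stranded inside that slot. Packing the shrunken pages end to end means the saving shows up as whole pages that the array can reuse.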

Once the deduplication detection and space recovery algorithms have been created, Thomas said: "We'll build it into the operating systems of EqualLogic and Compellent. At that point they become separate pieces of work."

The time frame for this effort is not clear-cut. Dell is confident that it can achieve the result it wants, and we think we should start seeing results in 12 months or so, if not before. El Reg thinks the file level Ocarina dedupe could start showing up by the end of the year.

You get the feeling Dell is very happy to have put the old days behind it, when people were asking if Dell was really an innovative company. It has a substantial chunk of its own IP and is energetically developing it. Soon, no doubt, we'll be hearing about Dell's patent portfolio, and the research scientists at Ocarina will have contributed their part to it. ®
