Dell scrambles onto de-dupe bandwagon
Quantum of revenue solace
Dell has jumped on the Quantum/EMC de-dupe bandwagon and is developing a single block-level de-dupe and replication architecture spanning its own, Quantum and EMC storage arrays.
De-duplication involves detecting repeated block-level patterns of data in a file and replacing them with pointers, thus reducing the amount of disk capacity needed to store the data. It is most effective with highly redundant data, such as backup files.
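As a rough illustration of the idea, here is a minimal Python sketch of block-level de-duplication. It is not Quantum's or Dell's implementation: the fixed 4KB block size, the use of SHA-256 digests as pointers, and the names `dedupe` and `restore` are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each distinct block once."""
    store = {}     # digest -> block contents (the unique blocks actually kept)
    pointers = []  # one digest per original block, in original order
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store the block only on first sight
        pointers.append(digest)          # repeats cost one pointer, not a block
    return store, pointers


def restore(store, pointers) -> bytes:
    """Rebuild the original data by following the pointers."""
    return b"".join(store[digest] for digest in pointers)


if __name__ == "__main__":
    backup = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    store, pointers = dedupe(backup)
    print(len(pointers), "blocks referenced,", len(store), "blocks stored")
```

With highly redundant input such as repeated backups, most blocks resolve to a digest already in the store, which is where the capacity saving comes from.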
Using Quantum's technology, Dell is going to develop a single de-duplication architecture across its PowerVault, EqualLogic and Dell/EMC storage arrays. The same Quantum de-dupe software will let it replicate de-duped data between these arrays and, in theory, between them and Quantum and EMC storage arrays, across both LAN and WAN. Replicating only unique byte-level changes reduces network bandwidth needs, thus lowering disaster recovery costs.
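The bandwidth saving can be sketched in a few lines of Python. This is an assumption-laden toy, not the Quantum replication protocol: it works on fixed 4KB blocks rather than byte-level changes, and the names `chunk` and `replicate` are hypothetical. The point is only that blocks the remote array already holds never cross the wire.

```python
import hashlib


def chunk(data: bytes, block_size: int = 4096):
    """Return (store, pointers) for data split into fixed-size blocks."""
    store, pointers = {}, []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        pointers.append(digest)
    return store, pointers


def replicate(source_store, pointers, target_store) -> int:
    """Send to the target only the blocks it lacks; return payload bytes sent."""
    sent = 0
    for digest in pointers:
        if digest not in target_store:
            target_store[digest] = source_store[digest]
            sent += len(source_store[digest])
    return sent


if __name__ == "__main__":
    remote = {}  # stands in for the array at the disaster recovery site
    monday = b"A" * 8192 + b"B" * 4096
    tuesday = b"A" * 8192 + b"C" * 4096  # one block changed since Monday
    print(replicate(*chunk(monday), remote), "bytes sent on Monday")
    print(replicate(*chunk(tuesday), remote), "bytes sent on Tuesday")
```

In a real system the pointer list itself must also be transferred, so the figures printed here understate the traffic slightly.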
Common management technology across the de-duping Dell, EMC and Quantum products is promised, and Dell will also provide de-duplication services as part of its service offering.
Robin Kuepers, Dell's storage marketing head, blew the Dell simplification trumpet, stating: “We’re going to do de-dupe differently – by putting the customers’ need for simplification up front. Storage has been too complicated for too long.”
Quantum acquired de-duplication software technology when it bought ADIC and has since licensed the software to EMC, which uses it in its DL3D products.
Data Domain's de-dupe occurs when data is ingested. Typically, though, de-duplication takes place after data has been ingested, so-called post-processing de-dupe. A unique aspect of the Quantum de-duplication technology is that it can be run either at ingest time or as a post-process so as to shorten backup duration when time is limited. Dell inherits this. NetApp will add the capability to its VTL de-dupe next year.
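The difference between the two modes can be shown with a small Python sketch. It is a toy model under stated assumptions, not any vendor's design: the `DedupeTarget` class, its method names and the 4KB block size are all invented for the example. Ingest-time de-dupe pays the hashing cost during the backup window, while post-processing lands raw data first and reclaims the space later.

```python
import hashlib


class DedupeTarget:
    """Toy backup target supporting ingest-time and post-process de-dupe."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}   # digest -> unique block
        self.catalog = {}  # backup name -> list of digests
        self.staging = {}  # backup name -> raw bytes awaiting post-processing

    def _dedupe(self, name: str, data: bytes) -> None:
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.catalog[name] = digests

    def ingest_inline(self, name: str, data: bytes) -> None:
        """De-dupe while the backup is being written."""
        self._dedupe(name, data)

    def ingest_deferred(self, name: str, data: bytes) -> None:
        """Land the raw backup quickly; de-dupe happens later."""
        self.staging[name] = data

    def post_process(self) -> None:
        """De-dupe everything in staging, then release the raw copies."""
        for name in list(self.staging):
            self._dedupe(name, self.staging.pop(name))

    def stored_bytes(self) -> int:
        unique = sum(len(block) for block in self.blocks.values())
        raw = sum(len(data) for data in self.staging.values())
        return unique + raw
```

The trade-off the sketch makes visible is that the deferred path temporarily needs capacity for the full, un-de-duped backup until `post_process` has run.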
De-duplication is becoming a standard feature of drive arrays as it spreads out from its disk backup and virtual tape library (VTL) base. NetApp has its ASIS de-dupe built into its mainstream ONTAP operating system, positioning de-dupe as being appropriate for virtually any data except access time-sensitive transaction data. Market leader Data Domain is positioning its de-duplicating drive arrays as being suitable for more than just backup data. However, Dell's release implies that backup data is the main target for its coming de-dupe products.
Dell recently announced its DL2000 disk backup system with less effective file-level de-duplication from CommVault's Simpana. This will get block-level de-duplication some time next year. The logic of a single de-duplication architecture spanning Dell storage would suggest that this will be replaced by Dell's own de-duplication product.
However, Dell refers to its three storage product lines as TierDisk. The DL2000 appears not to be part of this TierDisk offering, implying it could continue using the incompatible Simpana product. Dell was not able to comment on this point at time of publication.
The company expects to begin shipping systems for customers ranging from small-and-medium businesses to large enterprises early next year. ®
COMMENTS
Doubleplusgood
I used to think I was... not stupid... until I read this article. Is this like non-lossy JPG? Surely squashing the data down further and further until it reaches a quantum singularity will drag everyone's data into a binary black hole. I can see the Reg headline now:
Cloud computing disappears up its own backside.
That's it, I'm off to rebuild a PC (faster, stronger, better - we can rebuild you), and make myself feel better.
c:\windows\
Eddie,
"And, conversely, if you want de-duped backup data you just run an incremental backup in the first place, without paying anyone a dime for de-dupe technology."
Where you're backing up dozens or hundreds or thousands of machines/VMs including their OS system files, that still leaves a lot of duplicated data.
I would imagine you'd still want to keep multiple backups, maybe on different sets of disks/different locations.
