Dell and the dedupe appliance conundrum

What's going on?

Comment Dell is announcing a new deduplication appliance on Monday, the DL2000, while simultaneously saying dedupe will move on from a backup and appliance focus to something broader and more pervasive. What's going on?

Let's try to join up some dots between this release and previous Dell statements, along with EMC's statements and activities, and see where it leads us.

The DL2000 release said: "Dell’s deduplication strategy is to foster and encourage the rapid evolution of dedupe technology into a storage environment where the functionality exists everywhere. As deduplication matures quickly, it will move beyond backup storage – where it primarily resides today - to other data types including near-primary, archive, file, and object storage solutions."

The DL2000 is an integrated deduping backup appliance and so represents, we might say, first generation deduplication. The second generation will encompass, in Dell's view, deduplication of near-primary data, archive data, file, and object storage.

Considering that Dell storage, apart from the EqualLogic PS6000 products, comes from EMC, we can note that near-primary storage refers to general, unstructured information stored on primary or tier 1 (Fibre Channel) storage, meaning Clariion CX in Dell/EMC terms. We can also note that the general EMC meaning of file storage is a filer, network-attached storage (NAS). Dell sells EMC's Celerra NX4 NAS product. The general meaning of object storage in EMC terms is Centera, which Dell does not sell.

The implication here is that Dell will provide deduplication for Clariion CX near-primary storage and Celerra NX4 filer storage. But here's a question. How can Dell provide object storage deduplication when it doesn't supply an object storage product? Is there a hint here that Dell is going to take and sell EMC's Centera product?

Let's return to the DL2000 release for a moment and find another telling statement: "Dell believes that incorporating deduplication functionality into ISV, application and storage software can provide significant benefits to customers."

What does this mean? To me it says that Dell will supply deduplication software technology that can be incorporated into ISV software, into application software and into storage software. This seems to be another aspect of second generation deduplication.

Follow the implication here and imagine a piece of application software and its stored data, either on directly-attached storage (DAS) or on a storage array. The application has code that deduplicates its stored data. Okay, where is that done? On the server running the application and storing data on DAS, or on the storage array, if it uses one, using its controller CPU cycles?

We have dedicated deduplication products now because dedupe is CPU- and disk-intensive, and running it on general servers could cripple them. Okay, but Nehalem servers are coming and server virtualisation is here already, so couldn't we find the CPU cycles to do it on the servers?
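To see why dedupe eats CPU and disk cycles, here is a minimal Python sketch of fixed-size block deduplication. It is an illustration only, not how the DL2000 or any Dell or EMC product works: the 4KB chunk size, the choice of SHA-256 and the function names `dedupe` and `restore` are all assumptions made for the example. Every chunk written has to be hashed (CPU) and looked up in an index (memory or disk), which is the overhead that dedicated appliances are built to absorb.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, chosen only for illustration


def dedupe(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and keep one copy of each unique chunk."""
    store = {}   # digest -> chunk bytes (the unique chunks actually stored)
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()  # CPU cost: one hash per chunk
        if digest not in store:                     # index lookup per chunk
            store[digest] = chunk
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[d] for d in recipe)


if __name__ == "__main__":
    # Four 4KB chunks, three of which are identical.
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, recipe = dedupe(data)
    stored = sum(len(c) for c in store.values())
    print(f"logical bytes: {len(data)}, stored bytes: {stored}")
    assert restore(store, recipe) == data
```

Real products typically add variable-size chunking, persistent indexes and compression on top of this basic idea, all of which raise the processing cost further.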

Alternatively, we could parse "storage software" to mean storage array controller software and have the dedupe execute on the array, using spare CPU cycles - they're tending to be multi-core Xeon controllers now - with the ISV and application software telling the array what to dedupe and where to place the deduped data.

That chimes with ideas of multi-tiering inside arrays. So far no one is talking much about multi-tiered DAS, but we can suppose that might come. The situation seems easier to understand with networked storage than with DAS, since DAS implies that application server CPU cycles are used for deduping, which seems a poor use of them.
