Is HP pulling a fast one on deduplication?

EMC demands recount

Is HP pulling the deduplication wool over our eyes by claiming its dedupe box can run at 100TB/hour while EMC's best rate is 31TB/hour? Should a four-pool dedupe system realistically be compared to a single-pool design?

Yesterday at the HP Discover event in Las Vegas, the company announced its Store Once Catalyst software and B6200 disk array combination could ingest data at up to 100TB/hour while Data Domain's 990, announced just two weeks ago, ingests it at 31TB/hour. Ergo, HP is three times faster - and Data Domain sucks.

But the B6200 is actually formed from couplet building blocks: two controllers or nodes in a high-availability configuration with their own storage and a single deduplication index.

It expands to an 8-node system by aggregating four couplets in a cluster, using a 10GigE interconnect and Fusion Manager to control the 8 nodes/4 couplets as a single system with a single namespace, but four separate deduplication indices.

There is no global deduplication across a B6200 cluster.

HP B6200

EMC's Mark Twomey, technical director in the office of the CTO for Backup Recovery Systems, told us: "I don't get how HP can call it scale-out when those are four separate dedupe pools. That [100TB/hour] number is from four 2-node systems, isn't it? Yes they have one manager, but it's still four systems. If I get a manager can I compare four Data Domains?"

The B6200/Store Once's speed per deduplication index, or realm, is 25TB/hour. Take out the Catalyst software, which gets 60TB/hour of dedupe done on the source servers and leaves 40TB/hour for the B6200, and it is 10TB/hour per couplet and 5TB/hour per node.

A Data Domain 990 runs at 15TB/hour when Boost is taken out of the equation. Its raw dedupe speed is faster than that of a B6200 couplet and there is a single dedupe index. Ergo, based on a single dedupe index Data Domain's 990 is 50 per cent faster than a base B6200 configuration. Ergo, HP sucks.
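The arithmetic behind the two competing claims can be laid out in a few lines. This is a rough sketch using only the figures quoted above; the variable names are ours, and it assumes throughput divides evenly across couplets and nodes, which neither vendor has confirmed.

```python
# Figures quoted in the article, all in TB/hour.
HP_HEADLINE = 100      # 8-node B6200 with Catalyst
HP_SOURCE_SIDE = 60    # dedupe work Catalyst does on the source servers
DD990_HEADLINE = 31    # Data Domain 990 with Boost
DD990_NO_BOOST = 15    # Data Domain 990 with Boost taken out

COUPLETS = 4           # one deduplication index per couplet
NODES_PER_COUPLET = 2

# Assumption: throughput splits evenly across couplets and nodes.
hp_per_index = HP_HEADLINE / COUPLETS               # headline rate per dedupe index
hp_array_only = HP_HEADLINE - HP_SOURCE_SIDE        # work left for the array itself
hp_per_couplet = hp_array_only / COUPLETS           # array-only rate per dedupe index
hp_per_node = hp_per_couplet / NODES_PER_COUPLET

headline_ratio = HP_HEADLINE / DD990_HEADLINE
dd_advantage_pct = (DD990_NO_BOOST / hp_per_couplet - 1) * 100

print(f"HP headline vs DD 990 headline: {headline_ratio:.1f}x")
print(f"HP per index, with Catalyst:    {hp_per_index:.0f} TB/hour")
print(f"HP per couplet, array only:     {hp_per_couplet:.0f} TB/hour")
print(f"HP per node, array only:        {hp_per_node:.0f} TB/hour")
print(f"DD 990 lead per single index:   {dd_advantage_pct:.0f} per cent")
```

Which comparison looks fair depends on the unit chosen: whole system with source-side help, or a single dedupe index on its own.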

El Reg suspects this difference is because the B6200 uses an older Intel processor than the newer DD 990.

HP marketing veep Craig Nunes says an 8-node B6200 is a single system because it is managed as one and has a single namespace. The single namespace is segmented into four individual namespaces, one per couplet, and, he says, "next year I could do a firmware update and change that".

Pooling resources

Will the B6200 get a global deduplication pool next year then? Nunes declined to comment.

Interestingly, the Sepaton S2100 ES2 deduplicating system that HP resells is (like the B6200) an 8-node system, supports Symantec's OST interface, and runs at a 43.2TB/hour ingest rate into a global deduplication pool.

That global pool probably means that the ES2 dedupes more effectively than HP Store Once. Also, the ES2 is a bit long in the tooth and is likely to get a speed bump via a processor refresh.
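Why a global pool should dedupe more effectively can be shown with a toy model. The sketch below is illustrative only: it assumes fixed-size chunks identified by SHA-256 hashes and a hypothetical round-robin assignment of backup streams to indices, none of which describes how Store Once or the ES2 actually work.

```python
import hashlib

def chunk_hashes(data: bytes, size: int = 4) -> list[str]:
    """Split data into fixed-size chunks and hash each one."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]

def stored_chunks(streams: list[bytes], pools: int) -> int:
    """Count the unique chunks kept when streams are spread over `pools` indices."""
    indices = [set() for _ in range(pools)]
    for n, stream in enumerate(streams):
        # Hypothetical placement: stream n goes to index n mod pools.
        indices[n % pools].update(chunk_hashes(stream))
    return sum(len(index) for index in indices)

# Four backup streams sharing three chunks, each with one unique chunk.
streams = [b"AAAABBBBCCCC" + bytes([ord("E") + n]) * 4 for n in range(4)]

global_pool = stored_chunks(streams, pools=1)  # one index sees every stream
four_pools = stored_chunks(streams, pools=4)   # each index sees only its own stream

print(f"Chunks stored with one global index: {global_pool}")
print(f"Chunks stored with four separate indices: {four_pools}")
```

In this toy case the shared chunks are stored once under a global index but once per index when the streams land in separate realms. How big the gap is in practice depends on how much data is common across realms.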

This global dedupe capability across an ES2 cluster should ensure the HP Sepaton reselling relationship remains in place, at least until the B6200 gets its own global dedupe capability. If and when that happens, characterising an 8-node B6200 cluster as a single deduplicating system will be more legitimate.

In the meantime it is justifiable to define the B6200 as a single system so far as management and overall namespace are concerned, but HP is stretching the point to call it a single deduplicating system when there are four separate deduplication realms inside. ®
