HP shows off filer and dedupe monsters in Vienna
Powerful combos of scale-out hardware and filer software
HP used its Discover event in Vienna to both broaden and deepen its core storage portfolio, strengthening its file and deduplication offerings to compete better with EMC and NetApp.
The headlines are:
- The high-availability B6200 StoreOnce Backup System, a deduplicating backup-to-disk system for enterprises;
- The Data Protector backup product gets StoreOnce deduplication integrated; and
- The Windows-based X5000 NAS is released for mid-sized businesses.
The company has significantly strengthened both its filer and deduplication offerings and is emphasising scale-out hardware capabilities to cover a wider range of customer needs.
B6200 enterprise deduplicating backup product
The big news is the promotion of StoreOnce to the enterprise big time from the mid-range area with the B6200 product. StoreOnce is an HP Labs-developed deduplication technology.

HP B6200 deduplicating array
The B6200, a virtual tape library or straight backup-to-disk target, has from 48TB to 768TB of raw capacity (32TB to 512TB usable), using 2TB SAS drives. HP contrasts this with the 384TB of EMC Data Domain's 890. The B6200 has a scale-out architecture supporting up to eight nodes, and offers both Ethernet and Fibre Channel host connectivity.
It can deliver a claimed 20:1 deduplication ratio or even higher but, of course, your mileage may vary. HP says "this lowers storage capacity requirements by up to 95 per cent and delivers rapid data recovery". It says its effective storage capacity is up to 10PB.
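The 95 per cent and 10PB figures are consistent with simple arithmetic on the claimed 20:1 ratio. The sketch below is a back-of-the-envelope check, not HP's published method: it assumes the effective capacity is the top-end 512TB usable figure multiplied by the ratio, and that 1PB is 1,000TB. The function names are made up for illustration.

```python
def capacity_saving(dedupe_ratio: float) -> float:
    """Fraction of physical capacity saved when data dedupes at ratio:1."""
    return 1.0 - 1.0 / dedupe_ratio


def effective_capacity_tb(usable_tb: float, dedupe_ratio: float) -> float:
    """Logical (pre-deduplication) data that fits in usable_tb of disk."""
    return usable_tb * dedupe_ratio


if __name__ == "__main__":
    ratio = 20        # HP's claimed 20:1 deduplication ratio
    usable_tb = 512   # top-end usable capacity quoted for the B6200
    print(f"capacity saved: {capacity_saving(ratio):.0%}")
    # Assumption: decimal units, 1PB = 1,000TB
    print(f"effective capacity: {effective_capacity_tb(usable_tb, ratio) / 1000:.2f}PB")
```

On the same assumptions, a site that only achieves 10:1 would save 90 per cent of capacity and see roughly half the effective capacity, which is why the "your mileage may vary" caveat matters.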
HP claims it can back up data 3.5 times faster than Data Domain and restore data at 28TB/hour, which it says is three times faster than Data Domain.
EMC staffer Mark Twomey said: "HP's StoreOnce numbers are for an 8-node system with four dedupe pools versus the two node [Data Domain 890] GDA with one global dedupe pool." Let competition be joined.
B6200 features include PredictiveAcceleration that uses "intelligent container matching technologies to accelerate data analysis." A Rapid Restore feature provides "an optimised data layout that enables files to be restored at 100 per cent of backup throughput, up to six times faster than alternative," according to an Evaluator Group report.
It also has Adaptive Micro-Chunking with which "the industry’s most efficient deduplication engine ... dynamically adjusts the size of matched data blocks averaging 4 kilobytes."
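HP does not say how Adaptive Micro-Chunking works beyond the quote above. As a general illustration of variable-size chunking, here is a minimal sketch of content-defined chunking with a gear-style rolling hash, a generic technique and not HP's algorithm. Every name and parameter here (the gear table, the 1KB minimum, the 16KB maximum, the 12-bit boundary test that yields chunks of a few kilobytes) is an assumption chosen for the example.

```python
import hashlib
import random

# Hypothetical parameters chosen for illustration, not taken from HP.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte
MIN_SIZE, MAX_SIZE = 1024, 16384
BOUNDARY_BITS = 12  # cut when the top 12 hash bits are zero: ~1 in 4,096 bytes


def cdc_chunks(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a gear rolling hash."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The 32-bit hash only "remembers" the most recent 32 bytes.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = length >= MIN_SIZE and (h >> (32 - BOUNDARY_BITS)) == 0
        if at_boundary or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fixed_chunks(data: bytes, size: int = 4096) -> list[bytes]:
    """Split data into fixed-size blocks for comparison."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared_fingerprints(a: list[bytes], b: list[bytes]) -> int:
    """Count distinct chunks (by SHA-256) that appear in both lists."""
    def fingerprints(chunks: list[bytes]) -> set[bytes]:
        return {hashlib.sha256(c).digest() for c in chunks}
    return len(fingerprints(a) & fingerprints(b))


if __name__ == "__main__":
    rng = random.Random(1)
    original = bytes(rng.getrandbits(8) for _ in range(100_000))
    shifted = b"PREFIX!" + original  # same data, pushed along by 7 bytes
    print("content-defined chunks shared:",
          shared_fingerprints(cdc_chunks(original), cdc_chunks(shifted)))
    print("fixed-size chunks shared:",
          shared_fingerprints(fixed_chunks(original), fixed_chunks(shifted)))
```

Because boundaries depend on the bytes themselves and not on their offset, a small insertion near the start of a stream should disturb only the chunks around it, whereas fixed-size blocks all shift and stop matching.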
The B6200 has an autonomic restart feature as part of its high-availability functionality. HP points out that other deduplication systems need a manual reboot if they fail, which can take longer and requires more admin resource.
The high availability features include built-in hardware redundancy with dual path disk arrays, a dual path internal network, dual power supplies, and hardware-based RAID 6. There is also dual fabric support via bonded Ethernet connections and dual Fibre Channel ports per node.
It supports more remote sites than Data Domain's 890: its maximum remote site fan-in is 384. HP says the Data Domain GDA's maximum remote fan-in is 270, which makes the B6200's figure about 40 per cent higher (270 is roughly 30 per cent less than 384).
We could ask where all this leaves Sepaton, whose S2100 product HP OEMs as its high-end deduplication product – and which has, according to the Taneja Group, the only database-optimised deduplication in the industry. It seems fairly clear that the B6200 and the direction of StoreOnce could threaten Sepaton's position in HP's portfolio. Storage supremo David Scott said little to disabuse us of this thought.
COMMENTS
Great example
I never thought of data in this aspect but I think you hit the nail on the head.
It is putting a band-aid on a bigger problem.
I think data manipulation is the hardest of all tasks for a system admin.
"true, working solution" ...
... tends to be what the customer has already.
They're just short of storage. Hence trying to get more storage for less-more money ...
I'm in full agreement with you that there's shortsightedness in this, and that an a-priori approach to application / workload design which structures data and avoids "copy&paste-referencing/subclassing" can easily bring down storage / bandwidth needs by orders of magnitude.
Unfortunately, many software stacks are "working" but are old and rigid; retrofitting a profound architectural change such as this into existing software is, not always but very often, either so daunting or so expensive as to be prohibitive.
Structured data, in that sense, does not necessarily use less storage or dedupe better. XML is a curse, really; copy & paste an XML file into another, shifting it around by a few bytes in the process, and the dedup potential is gone. The usually-identical console logs from a server bootup are preceded by unique timestamps/hostnames and, again, the dedup potential evaporates. Just as examples.
These problems notwithstanding, storage that compresses and/or deduplicates (if only the twenty copies of the renamed CEO powerpoint memo which got stored into the DMS by twenty different departments) provides savings, and therefore has its place.
These savings are not as great as the ones realizable from a "context switch", but they are very tangible and achievable at significantly less risk. Like treating a cold with lots of camomile tea instead of a $1,000/dose, not-yet-FDA-certified breakthrough antiviral medication with as-yet-unknown side effects. Treat symptoms, not cause. One of those cases of "good enough"?
end of life for the mechanical drive
It would be one of the biggest achievements once the technology/money is available to end the life of the mechanical hard-drive.
It is the Achilles heel of computing, being the slowest part of the equation.
With the economy floundering, most companies are not investing much in R&D, unfortunately.
The sustained I/O of solid-state memory is unparalleled in performance and speed. This will eventually revolutionize IT like a new awakening; for now it is as if mechanical drives will never meet an end. Like dial-up Internet service, it is time for the mechanical drive to be eliminated.
