Data Domain gives itself DD Boost
Storage with extra cleavage
EMC's Data Domain has got itself DD Boost, software that pre-processes backup data on a media server to increase deduplication speed by up to 50 per cent.
The software is installed on a media server and is integrated with either Symantec NetBackup or Backup Exec. It is a library that identifies segments in incoming data. The segment IDs are checked against the connected Data Domain array, which reports which ones are new. Only those new segments are compressed on the media server and sent over the wire, reducing both local network traffic and the overall transfer time for the data.
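The handshake described above can be sketched in a few lines. This is a toy model, not the DD Boost library itself: the class and method names (`Target`, `filter_new`, `store`) are invented for illustration, and the real protocol and segmenting scheme are proprietary.

```python
import hashlib
import zlib

class Target:
    """Stand-in for the deduplicating array's segment index (hypothetical API)."""
    def __init__(self):
        self.index = {}

    def filter_new(self, ids):
        # Report which segment IDs the array has not seen before.
        return {i for i in ids if i not in self.index}

    def store(self, seg_id, blob):
        self.index[seg_id] = blob

def backup_segments(segments, target):
    """Source-side dedupe pass: fingerprint each segment on the media server,
    ask the target which fingerprints are new, and ship only those segments,
    compressed. Returns how many segments actually crossed the 'wire'."""
    ids = [hashlib.sha1(s).hexdigest() for s in segments]
    new_ids = target.filter_new(ids)
    sent = 0
    for seg, seg_id in zip(segments, ids):
        if seg_id in new_ids:
            target.store(seg_id, zlib.compress(seg))
            new_ids.discard(seg_id)  # don't resend a duplicate within this run
            sent += 1
    return sent
```

Run the same full backup twice against the same target and the second pass ships nothing, which is where the network-traffic and backup-window savings come from.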
Data Domain says that overall resource use on the media server is reduced by 20 to 40 per cent because of a reduced data copy overhead. It also says the aggregate backup throughput on its arrays increases by up to half as much again. Its DD880 is now rated at 8.8TB/hour because of DD Boost, up from the initial 5.4TB/hour.
EMC will add DD Boost support to its own NetWorker backup software and expects to get the same benefits.
This is bad news for other backup software vendors. Even if they support Data Domain hardware their performance will suck compared to Backup Exec, NetBackup and the coming refreshed NetWorker. There is no mention in the EMC release of an open API for the media server software/DD Boost interface and no mention of a backup software supplier partner program.
Other deduplication array vendors, such as Quantum and Sepaton, now face another hurdle to jump, because their ability to boast of shorter backup times due, for example, to post-process dedupe just got reduced. In-line dedupers face the same problem; Data Domain has pressed the gas pedal and threatens to leave them behind.
Suppliers of disk-to-disk backup and Virtual Tape Libraries are in a similar bind. Life must be sweet for Data Domain boss Frank Slootman with this boost to his product set's appeal which screws the competition. ®
There's no free lunch
I can’t speak for all the competition for Data Domain, but the introduction of DD Boost hasn’t caused anyone at Quantum to jump off a bridge. It essentially amounts to a rebranding of support for OST, as an extension of their Global Deduplication Array. Reducing redundancy at the media server as DD Boost requires could reduce the load on the network and the target, but it means that the media server is doing more. So the change is likely to mean customers will actually need a bigger media server – there’s no free lunch. Quantum’s mid-range DXi6500 family sees a 25 to 40% performance increase when we use the OST interface without changing anything on the media server. Do you suppose Quantum might extend that capability to enterprise products in the near future? Watch.
Customers should also consider the complexity of a distributed solution like DD Boost. It can be time-consuming to configure, may require hardware upgrades, and troubleshooting can become extremely cumbersome. The simplicity of DXi, whether with OST or not, provides increased speed and better economic value without the extra complexity of distributed processing.
EMC’s Brian Biles said in the DD Boost press release: “There has been no significant change in roles between backup clients, backup servers and target storage in the last 20 years of traditional backup software deployment architectures, until right now…DD Boost literally thinks outside the box.” Unfortunately this statement does not acknowledge significant advancements in the backup and storage industry. Symantec OST is one of those advancements, a technology on which DD Boost relies. Another is FalconStor’s Backup Accelerator, a technology that requires no software or load on the backup server, works with all backup applications, and accelerates backup and transfer of data to deduplication targets by 400 percent.
EMC / Data Domain is NOT faster than VTL
"Data Domain says that overall resource use on the media server is reduced by 20 to 40 per cent because of a reduced data copy overhead."
Data copy overhead? This is a patently stupid and nonsensical statement by EMC. There is no such thing as copy overhead - the whole backup process is a data movement process. The data moves through the media server whether it's raw or deduped. Deduplication on the media server creates ADDITIONAL overhead, just as it does with CommVault's deduplication. This is basic backup physics. It's also the very same argument that the Avamar sales team at EMC makes when competing against CommVault's dedupe strategy. (They used to claim that you had to reduce data on the host to gain efficiency.)
Here are the basics:
The whole point of using a solution like this is to reduce your backup window... so why would you add an additional process that INCREASES your backup window?
Inline deduplication is nowhere close to being as efficient as going straight to VTL. Many enterprise customers have tested it, and inline dedupe (whether from EMC/Data Domain or others) loses every time in head-to-head tests. Why? Because the inline process creates REAL overhead and slows down the backup job. EMC has been getting killed in Enterprise accounts for this (and for the lack of global dedupe and lack of FC connectivity... but that's a different story). So what did EMC do to fix the problem? They split up the dedupe process between the media server and an appliance with beefy processors. The improvement they get is not from the design, but from the chips... and even then it's not as fast as writing at full speed to a VTL. The design is still flawed.
Any process that moves data between point A and point B is fastest without ADDITIONAL processes (like dedupe) in the way. If your backup job takes X amount of time to complete when writing directly to a VTL, any additional processes (no matter the speed of the processor) will incrementally add to the time to complete your backup job.
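The commenter's arithmetic can be made concrete with a toy model (all numbers illustrative, not vendor figures). It adopts the commenter's premise that the dedupe pass serializes with the transfer rather than pipelining with it: on that assumption, mostly-new data only lengthens the job, and the extra pass pays off only when most segments are already on the target.

```python
def backup_hours(size_tb, wire_tb_per_hr, dedupe_tb_per_hr=None, new_fraction=1.0):
    """Serial model of a backup job: an optional source-side dedupe pass
    over all the data, then transfer of only the new fraction.
    Stages are assumed to serialize, not pipeline (the commenter's premise)."""
    hours = size_tb / dedupe_tb_per_hr if dedupe_tb_per_hr else 0.0
    hours += size_tb * new_fraction / wire_tb_per_hr
    return hours

# 10 TB job, link sustains 5 TB/hr, dedupe pass runs at 10 TB/hr:
straight_to_vtl = backup_hours(10, 5)                                         # 2.0 hours
first_full = backup_hours(10, 5, dedupe_tb_per_hr=10, new_fraction=1.0)       # 3.0 hours
repeat_full = backup_hours(10, 5, dedupe_tb_per_hr=10, new_fraction=0.1)      # 1.2 hours
```

On an initial full (everything new), the fingerprinting pass is pure added time, which is the commenter's point about testing with real full backup sets; on repeat fulls with heavy duplication the balance shifts.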
It's the very worst of what CommVault does with media server dedupe combined with the very worst of Data Domain - inline dedupe.
This new version is a band-aid to mask the problems of inline dedupe, which is why EMC is touting improved speeds without comparing them to the speed of writing to multiple VTLs with global dedupe. This is how they are losing in the enterprise. It's why they always run these tests for customers using small data sets. When the customer wants to test a real full backup set, they balk.