Original URL: http://www.theregister.co.uk/2010/06/25/emc_viper/

EMC gets busy with dedupe, compression code-base

Team Viper has awesome name, dude

By Chris Mellor

Posted in The Channel, 25th June 2010 14:56 GMT

EMC has been developing its own deduplication and compression code-base, entrusting the effort to a team code-named Viper.

The story, as told by someone familiar with the events, starts with EMC acquiring Avamar in November 2006. In June 2009 it acquired Data Domain. At that point it had Avamar source deduplication, Data Domain target deduplication, and RecoverPoint single instancing and compression.

The Viper team's remit was to deconstruct the code-base of each of these three technologies and then build a single, all-EMC code-base that could run in these products and on the hardware running them, such as CLARiiON and Celerra for RecoverPoint. It would enable a progressive reduction in data size as data moved through its lifecycle and from product to product.

It might be possible to have data stored more or less permanently in a reduced form until its regeneration (rehydration, in dedupe industry-speak) was needed by an application or user.
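
For readers unfamiliar with the jargon, here is a minimal sketch of what "stored in a reduced form until rehydration" can mean. It is an illustration only and not EMC's code: the `ReducedStore` class, the fixed 4KB chunk size, and the choice of SHA-256 and zlib are all assumptions made for the example.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


class ReducedStore:
    """Toy store that keeps each unique chunk once, in compressed form."""

    def __init__(self):
        self.chunks = {}  # chunk digest -> compressed chunk bytes

    def write(self, data):
        """Store data in reduced form and return the recipe to rebuild it."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # deduplicate: keep one copy only
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def rehydrate(self, recipe):
        """Rebuild the original bytes only when a reader asks for them."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)


store = ReducedStore()
original = b"A" * 8192 + b"B" * 4096  # two identical "A" chunks, one "B" chunk
recipe = store.write(original)
restored = store.rehydrate(recipe)
```

The data stays deduplicated and compressed inside `store.chunks`; the full-size bytes exist again only in the value returned by `rehydrate`.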

If a file hit Avamar having come from RecoverPoint, it would already be single-instanced and compressed, so Avamar would have no need to compress it and could focus on deduplication.
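
The hand-off described here can be pictured as data carrying a record of which reductions have already been applied, so that a later product skips the work. The sketch below is an assumption-laden illustration, not a description of how EMC's products exchange data: the `Payload` type, its flags, and both stage functions are hypothetical names, and the deduplication step is only recorded, not performed.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Payload:
    """Data plus hypothetical flags recording reductions already applied."""
    data: bytes
    compressed: bool = False
    single_instanced: bool = False


def upstream_stage(raw):
    """Stand-in for an earlier product that compresses and single-instances."""
    return Payload(zlib.compress(raw), compressed=True, single_instanced=True)


def downstream_stage(payload):
    """Stand-in for a later product: compress only if nobody has yet."""
    steps = []
    data = payload.data
    if not payload.compressed:
        data = zlib.compress(data)
        steps.append("compress")
    steps.append("deduplicate")  # placeholder: dedupe itself is not shown
    return Payload(data, True, payload.single_instanced), steps


raw = b"backup block " * 100
_, steps_from_upstream = downstream_stage(upstream_stage(raw))
_, steps_from_raw = downstream_stage(Payload(raw))
```

With these definitions, data arriving from the upstream stage triggers only the deduplication step, while raw data is compressed first.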

The Viper team was set up in 2009 and is still in operation. We understand from a second source that the team has written its code as a component, which is being used in Celerra for deduplication and in FLARE for compression. Further, this component can be used by any other EMC product team that needs compression or deduplication.

All this implies that EMC would have no interest in taking embeddable deduplication source code from either Ocarina or Permabit. We understand Compellent is not taking code from these vendors either as it develops deduplication. ®