Original URL: https://www.theregister.com/2011/08/16/permabit_albireo_speedup/

Permabit goes into dedupe hyperdrive

'Primary, secondary, any-ary - it rocks'

By Chris Mellor

Posted in Storage, 16th August 2011 08:00 GMT

Permabit has upped its Albireo technology's deduplication speed by 250 per cent, reaching 400GB/sec. That compares to the 77GB/sec recorded in late 2010.

Permabit claims that Albireo is the only technology suitable for deduplicating primary storage, as it has no impact on primary data access performance. The technology is sold as an SDK for OEMs to license and use. So far BlueArc and Xiotech are known to have licensed it, with NetApp suspected of having a licence also, courtesy of its purchase of Engenio.

Permabit says Albireo has a 2-stage indexing technology using both "dense and sparse indexing techniques". The server or disk array controller running the code needs 1GB of RAM for every petabyte of disk storage, and the thing works with 20PB of disk storage in a grid configuration.

The speed claims stand up. Data Domain's DD890 deduplicates at 14.7TB/hour with Boost: that means 4.08GB/sec. A 2-node DD890 cluster runs nearly twice as fast: at 7.31GB/sec. Permabit's fourth generation of its Albireo deduplication technology is almost 100 times faster than a single Data Domain DD890 product.

ExaGrid's EX13000E in a 10-node grid deduplicates at 24TB/hour – or 6.67GB/sec. Single-node Albireo is more efficient than that.
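For readers who want to check the sums, the conversions above can be reproduced with a few lines of Python. This is only a sketch: it assumes decimal units (1TB = 1,000GB) and takes the vendor throughput figures quoted here at face value. The function and constant names are ours, not anything from Permabit, Data Domain or ExaGrid.

```python
# Assumption: decimal units, so 1 TB = 1,000 GB and 1 hour = 3,600 seconds.
GB_PER_TB = 1000
SECONDS_PER_HOUR = 3600

# Figure quoted in the article for Albireo (not independently measured).
ALBIREO_GB_PER_SEC = 400.0


def tb_per_hour_to_gb_per_sec(tb_per_hour: float) -> float:
    """Convert a throughput in TB/hour to GB/sec."""
    return tb_per_hour * GB_PER_TB / SECONDS_PER_HOUR


dd890 = tb_per_hour_to_gb_per_sec(14.7)    # single DD890 with Boost
exagrid = tb_per_hour_to_gb_per_sec(24.0)  # 10-node EX13000E grid

print(f"DD890:   {dd890:.2f} GB/sec")    # 4.08
print(f"ExaGrid: {exagrid:.2f} GB/sec")  # 6.67
print(f"Albireo vs DD890: {ALBIREO_GB_PER_SEC / dd890:.0f}x")  # 98x
```

On those assumptions the claimed 400GB/sec works out at roughly 98 times the single DD890 figure, which is where "almost 100 times" comes from.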

ESG has validated Permabit's claims, with Steve Duplessie, founder and senior analyst, saying: "I don't see any other real alternatives for OEMs to be able to quickly get to market with lightning fast dedupe capabilities for primary, secondary, or really any-dary storage. Albireo rocks."

The surprise is that so few disk storage array OEMs are known to have licensed the Albireo code, given that the performance is so good. With ESG validating Permabit's technology we can believe the claims, and it may be that OEM qualification for primary data applications is just taking a very long time.

Perhaps a secondary consideration is that adopting Albireo for primary data dedupe will cause an incompatibility with existing secondary deduplication technologies used by, for example, EMC Data Domain, Dell, HP, IBM and others.

Albireo is a fascinating and promising technology. We're really looking forward to BlueArc and Xiotech products that use it. Are they going to be significantly more cost-effective in $/GB storage costs compared to non-deduping primary storage arrays? That's the tantalising promise. Come on guys, ship product – we're all in suspense here. ®