Permabit's killer dedupe technology
Component dedupe for primary data storage OEMs
Scaling up and down
The amount of primary data to look at can be enormous, far larger than a single Albireo node can manage. Ivanov said: "We can provide a global index across multiple nodes and make use of memory in each node. It's transparent to the app whether we're a single local instance or distributed grid. We scale because of this multi-node scaling [and] we have self-healing capabilities.
"Petabyte scalability is key. Secondary storage dedupe only scales to a petabyte or more and cannot index very large amounts of unique data. … NetApp's A-SIS has a limit of 16TB per volume."
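To make the idea of a global index spread across nodes concrete, here is a minimal sketch in Python. It is not Permabit's design, which the company has not detailed here. It assumes a simple hash-partitioning scheme, and the `GridIndex` class, its methods and the use of SHA-256 fingerprints are all hypothetical names and choices made for illustration. The point it shows is that the caller gets the same answers whether there is one node or several.

```python
import hashlib


class GridIndex:
    """Toy global dedupe index partitioned across in-memory 'nodes'.

    Hypothetical illustration only; not Albireo's actual mechanism.
    """

    def __init__(self, node_count):
        # Each node is modelled as a plain set of chunk fingerprints.
        self.nodes = [set() for _ in range(node_count)]

    def _node_for(self, fingerprint):
        # Assumed routing rule: first 8 fingerprint bytes modulo node count.
        return int.from_bytes(fingerprint[:8], "big") % len(self.nodes)

    def is_duplicate(self, chunk):
        """Return True if this chunk was seen before; record it otherwise."""
        fingerprint = hashlib.sha256(chunk).digest()
        node = self.nodes[self._node_for(fingerprint)]
        if fingerprint in node:
            return True
        node.add(fingerprint)
        return False


chunks = [b"alpha", b"beta", b"alpha", b"gamma", b"beta"]

single = GridIndex(node_count=1)
grid = GridIndex(node_count=4)

single_answers = [single.is_duplicate(c) for c in chunks]
grid_answers = [grid.is_duplicate(c) for c in chunks]

# The application sees identical results from one node or a grid of four.
print(single_answers == grid_answers)
```

Because each fingerprint always routes to the same node, every node only has to hold its share of the index in memory, which is the property the quote attributes to the grid setup.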
Albireo goes far beyond that. The indexing technology is the key to scaling in Ivanov's view and Albireo needs only 3.5 bytes in memory per index entry. He says it operates effectively in memory-constrained environments including SMB servers and small appliances, as well as in multi-node grid setups.
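A rough back-of-the-envelope calculation shows what 3.5 bytes per entry implies. The sketch below takes the 3.5-byte figure from Ivanov; the 4KiB chunk size is purely a hypothetical assumption, since the article gives no chunk size, and the function names are invented for illustration. Capacities use binary units for simplicity.

```python
BYTES_PER_ENTRY = 3.5    # figure quoted by Ivanov
CHUNK_BYTES = 4 * 1024   # hypothetical chunk size; not stated in the article


def index_entries(unique_bytes, chunk_bytes):
    """Number of index entries needed, one per unique chunk (rounded up)."""
    return -(-unique_bytes // chunk_bytes)  # ceiling division


def index_memory_bytes(unique_bytes, chunk_bytes):
    """Estimated index memory at 3.5 bytes per entry."""
    return index_entries(unique_bytes, chunk_bytes) * BYTES_PER_ENTRY


for label, size in [("16 TiB", 16 * 2**40), ("1 PiB", 2**50)]:
    gib = index_memory_bytes(size, CHUNK_BYTES) / 2**30
    print(f"{label} of unique data: {gib:.0f} GiB of index memory")
```

Under those assumptions, 16TiB of unique data needs about 14GiB of index memory and a pebibyte needs about 896GiB, which is the kind of total that would have to be spread across the memory of several nodes. A larger assumed chunk size shrinks both figures proportionally.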
Ivanov says Albireo is fast: "It's at least ten times more efficient than anyone else. One in a thousand times we have to do a disk seek. Ninety nine per cent of index checks are in memory … We're delivering 140MB/sec per Xeon core or more … We can easily deliver thousands of MB/sec in a grid config." Watch out Data Domain.
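The throughput claims can be sanity-checked with simple arithmetic. This sketch uses only the figures Ivanov quotes (140MB/sec per Xeon core and one disk seek per thousand index checks) and assumes, hypothetically, that throughput scales linearly with core count, which the article does not confirm. The function names are invented for illustration.

```python
import math

MB_PER_SEC_PER_CORE = 140     # quoted per-core throughput
CHECKS_PER_DISK_SEEK = 1000   # quoted: one disk seek per thousand checks


def cores_needed(target_mb_per_sec):
    """Cores required for a target rate, assuming linear scaling (an assumption)."""
    return math.ceil(target_mb_per_sec / MB_PER_SEC_PER_CORE)


def expected_disk_seeks(index_checks):
    """Expected disk seeks for a given number of index checks."""
    return index_checks / CHECKS_PER_DISK_SEEK


print(cores_needed(2000))              # cores for 2,000 MB/sec
print(expected_disk_seeks(1_000_000))  # seeks per million index checks
```

On those numbers, 2,000MB/sec would take 15 cores, and a million index checks would cost roughly 1,000 disk seeks.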
Permabit sees this as OEM technology. In Ivanov's view: "The OEM has to do some development with our library but it ends up being far more attractive to them [than inventing it themselves or having an appliance] because we do not degrade performance or hide other functionality. The only competition is internal development."
Many OEMs have secondary data deduplication but: "Secondary data dedupe technology isn't necessarily best equipped to address primary space … dedupe is not dedupe when we target primary data market."
Ivanov said: "We're partnering up with larger and medium size storage OEMs to bring this to market. We've been working on it for six months and have some design wins. Some of them will possibly announce by the year end. We think there will be wide deployment in 2011. If your storage is not deduping in 2012 you will be at a disadvantage."
The core indexing technology was conceived by Jered Floyd, Permabit's co-founder and chief technology officer. The name Albireo comes from a star that looks like a single star to the naked eye but is revealed by a telescope to be a pair of yellow stars orbiting each other with a third and fainter blue companion. It's an analogy for deduplication: having multiple objects appear as one.
Will Albireo become a star? That depends upon a large number of storage OEMs taking it up, meaning companies such as 3PAR, Dell, EMC, HDS, HP, IBM, LSI, Pillar and so forth. Compellent is using Nexenta's ZFS and NetApp has A-SIS, so they're not on Permabit's list. It appears from our checks that 3PAR has no immediate plans to use Albireo. But it would only take an EMC or HP or IBM to embrace Albireo for the floodgates to open, with Permabit then becoming an effective industry standard.
If OEMs do adopt Albireo then, as well as helping them claw back NetApp's lead in primary data deduplication, it can be used for secondary data deduplication. That represents a threat to stand-alone deduplication vendors such as ExaGrid, FalconStor, Ocarina and Sepaton.
Ocarina has existing OEM deals with BlueArc, HDS and HP. It is also pursuing an embedded strategy, and is launching an embedded product relatively soon. The company has multiple OEM contracts concerning this technology and many new products are under development with these OEMs. Ocarina will be talking about these partnerships and products in the next few weeks.
The prospects are terrific and Permabit is really hyped up over its new technology. Industry stardom, glamour and a future trouser pocket-filling IPO all beckon. Albireo sure is seeing stars, but will it become one? ®