BitTorrent awarded distributed storage patent
Pirate favourite now RAIDing the cloud
BitTorrent has been awarded a patent for something called “Distributed storage of recoverable data”.
Available here, the patent is described as “A system, method, and computer program product replace a failed node storing data relating to a portion of a data file.”
The invention looks an awful lot like RAID storage for resources located on different bits of a wide area network or, if you will, a cloud.
Let's step through the patent, beginning with its explanation of related art, to wit:
“A central problem in storage is how to store data redundantly, so that even if a particular piece of storage fails, the data will be recoverable from other sources. One scheme is to simply store multiple copies of everything. While that works, it requires considerably more storage for a particular level of reliability (or, contrapositively, it provides considerably less reliability for a particular amount of storage).”
Nothing to tax a storage admin's mind there, nor in the next bit:
“To achieve better reliability, erasure codes can be used. An erasure code takes an original piece of data and generates what are called 'shares' from it. Shares are designed so that as long as there are enough shares that their combined size is the same as the size of the original data, the original data can be reconstructed from them.”
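For readers who want to see the idea in miniature, here is a toy sketch of an erasure code. It is not the scheme in the patent: the function names `make_shares` and `reconstruct` are made up for illustration, and the code is a simple XOR parity arrangement (much like single-parity RAID) that splits the data into two halves and adds one parity share. Any two of the three shares, whose combined size equals the original, are enough to rebuild the data.

```python
def xor_bytes(u: bytes, v: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(u, v))


def make_shares(data: bytes) -> dict[str, bytes]:
    """Split data into two halves plus an XOR parity share (toy 2-of-3 code)."""
    if len(data) % 2:
        raise ValueError("toy example needs even-length data")
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return {"a": a, "b": b, "p": xor_bytes(a, b)}


def reconstruct(shares: dict[str, bytes]) -> bytes:
    """Rebuild the original data from any two of the three shares."""
    if "a" in shares and "b" in shares:
        return shares["a"] + shares["b"]
    if "a" in shares and "p" in shares:
        # b = a XOR parity
        return shares["a"] + xor_bytes(shares["a"], shares["p"])
    if "b" in shares and "p" in shares:
        # a = b XOR parity
        return xor_bytes(shares["b"], shares["p"]) + shares["b"]
    raise ValueError("need at least two shares")


original = b"BitTorrent"
shares = make_shares(original)

# Lose any single share and the data still comes back.
for lost in ("a", "b", "p"):
    survivors = {name: s for name, s in shares.items() if name != lost}
    assert reconstruct(survivors) == original
```

This toy code stores 1.5 times the original data and survives the loss of any one share. Keeping two full copies survives one loss too, but costs twice the storage, which is the trade-off the patent's related-art section describes.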
BitTorrent's scheme is to create a “tracker” that knows where each share is stored and, if a share is erased, to copy data from other locations that hold the same data to restore the desired level of distributed redundancy.
“The available storage nodes each contain a plurality of shares generated from a data file,” the patent's abstract says. “These shares may have been generated based on pieces of the data file using erasure coding techniques. A replacement share is generated at each of the plurality of available storage nodes. The replacement shares are generated by creating a linear combination of the shares at each node using random coefficients. The generated replacement shares are then sent from the plurality of storage nodes to the indicated new storage node. These replacement shares may later be used to reconstruct the data file.”
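The "linear combination of the shares at each node using random coefficients" step can be sketched roughly as follows. Everything specific here is an assumption made for illustration rather than something taken from the patent: the small prime field `P = 257`, the representation of a share as a (coefficient vector, payload) pair, and the helper names `encode`, `combine` and `replacement_share`. The point is only that a random mix of a node's existing shares is itself a valid share of the original pieces.

```python
import random

P = 257  # small prime field, an assumption chosen so byte values fit


def encode(pieces, vec):
    """Payload that a share with coefficient vector `vec` should carry."""
    n = len(pieces[0])
    return [sum(v * piece[j] for v, piece in zip(vec, pieces)) % P
            for j in range(n)]


def combine(shares, coeffs):
    """Linear combination of (coeff_vector, payload) shares over GF(P)."""
    k = len(shares[0][0])
    n = len(shares[0][1])
    vec = [sum(c * s[0][i] for c, s in zip(coeffs, shares)) % P
           for i in range(k)]
    payload = [sum(c * s[1][j] for c, s in zip(coeffs, shares)) % P
               for j in range(n)]
    return vec, payload


def replacement_share(node_shares, rng):
    """Mix a node's shares with random non-zero coefficients."""
    coeffs = [rng.randrange(1, P) for _ in node_shares]
    return combine(node_shares, coeffs)


# Two pieces of a tiny "file" (hypothetical data).
pieces = [[66, 105, 116], [84, 111, 114]]
rng = random.Random(0)

# A surviving node that holds two shares of the file.
node = []
for _ in range(2):
    vec = [rng.randrange(P) for _ in pieces]
    node.append((vec, encode(pieces, vec)))

# The node builds one replacement share to send to the new node.
new_vec, new_payload = replacement_share(node, rng)

# The replacement is still a consistent encoding of the original pieces.
assert new_payload == encode(pieces, new_vec)
```

The surviving node never needs the original pieces to do this. It only mixes what it already holds, and whoever later collects enough independent shares can solve the resulting linear system to recover the file.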
There's a lot of detail that goes into the reconstruction, but you probably get the idea by now. You may also be thinking that sounds too good to be true, and you're right, because the patent also says: “The above technique faces limitations when used for distributed storage over the Internet. For Internet storage, the scarce resource is bandwidth, and the storage capacity of the end nodes is essentially infinite (or at least cheap enough to not be a limiting factor), resulting in a situation where the limiting factor on any storage is the amount of bandwidth to send it.”
BitTorrent says it has found a way to overcome that problem with a scheme that behaves an awful lot like, well, BitTorrent.
Just what BitTorrent plans to do with the software is anyone's guess. The company has tried to go “straighter” over the years, with products like secure messaging, “bundles” and a share 'n' sync tool.
Perhaps a storage application is in the works? BitTorrent is occasionally used as a file distribution method by makers of commercial software, but it's hard to see business users queueing up to buy “Backup Software Brand X – Powered By BitTorrent.”
BitTorrent is not the only outfit keen on erasure codes: Singaporean researchers are trying to put them to work, while they're also a key part of RAID 6. ®