This is Frankfurt calling: Scattered outbreaks of hot crunchiness

A mixed grill storage smorgasbord newsfeast


SNW Europe This is the second installment of El Reg's coverage of StorageNetworkWorld Europe, aka Powering The Cloud, bringing you another smorgasbord of storage goodness from the biggest storage show in the old countries. Some of it's hot, and some is even crunchy... so dip in.

BridgeSTOR and DDFS

BridgeSTOR CEO John Matze told us more about DDFS, his Data Deduplication File System for tape and the cloud. A company statement reminds us of tape's centrality in data protection:

According to a December 2010 ESG brief titled: “NERSC – Success with Primary Data on Tape”, nearly half of the world’s data is stored on magnetic tape. Indeed, all 10 of the world’s 10 largest banks rely on tape storage for backup and archive data retention. The same can be said for each of the 10 largest telcos in the world; and eight of the 10 largest pharmaceutical firms.

Prior to BridgeSTOR, Matze wrote a first-generation dedupe product for Exar, which produced deduplication hardware. But: "Exar cut the product and team in February 2012. It lost 40 per cent of its staff overall."

So DDFS is Matze dedupe mk II.

Matze sees his DDFS technology being most useful in backing up virtual machines. He says VMware VMs and VMDKs don't deduplicate well because they are misaligned to disk blocks. DDFS takes a 512K VM header and converts it to 4096 bytes. The 2MB data containers for the VM data are also put into 4096-byte blocks.
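To see why alignment matters for fixed-block deduplication, here is a minimal sketch. It is not BridgeSTOR's code: the 512-byte header length, the `pad_header` helper and the random payload are all assumptions chosen for illustration. It compares a payload with the same payload stored behind a header, first unpadded and then padded up to a 4096-byte boundary.

```python
import hashlib
import os

BLOCK = 4096  # dedupe block size quoted in the article


def fingerprints(image: bytes) -> set:
    """Return the SHA-1 fingerprint of every fixed-size block in an image."""
    return {
        hashlib.sha1(image[i:i + BLOCK]).digest()
        for i in range(0, len(image), BLOCK)
    }


def pad_header(header: bytes) -> bytes:
    """Hypothetical helper: zero-pad a header up to the next block boundary."""
    return header + b"\0" * (-len(header) % BLOCK)


payload = os.urandom(64 * BLOCK)  # stand-in for guest data, 64 blocks long
header = os.urandom(512)          # assumed header length, not a block multiple

reference = fingerprints(payload)
misaligned = fingerprints(header + payload)            # payload shifted by 512 bytes
aligned = fingerprints(pad_header(header) + payload)   # payload starts on a boundary

shared_misaligned = len(reference & misaligned)
shared_aligned = len(reference & aligned)
print(f"shared blocks, misaligned: {shared_misaligned}")
print(f"shared blocks, aligned:    {shared_aligned}")
```

With the header unpadded, every 4096-byte block straddles two of the original blocks, so none of the fingerprints match. Once the header is padded to a block boundary, all 64 payload blocks line up and dedupe against the reference copy.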

The BridgeSTOR technology is delivered as a virtual deduplication appliance in Windows VHD format called CRUNCH. It runs on existing backup servers or on servers where virtual machine images are stored and deduplicates all VM images - VMware, Hyper-V, and so on. VMs are "crunched" into deduplicated containers, the size of LTO tapes, which are then written to tape, with compression and encryption carried out by the tape drive.

A CRUNCH-written tape contains two files or containers essentially - one for data and the other for metadata. The tape has all the dedupe metadata on it.

BridgeSTOR CRUNCH process flow

BridgeSTOR has found a 20.6:1 data reduction ratio with VMs, turning 33GB of raw data into 16GB, although by our arithmetic 33GB shrunk to 16GB works out at nearer 2:1. Your mileage may vary of course.

In DDFS “block level” deduplication, blocks of data are “fingerprinted” using a hashing algorithm (SHA-1) that produces a unique, “shorthand” identifier for each data block. DDFS allows the Hash Table to be memory resident. The amount of memory required to hold the hash table is based on the amount of physical capacity being used and the deduplication block size.
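The memory sizing described above can be put into rough numbers. The sketch below is a back-of-the-envelope estimate, not DDFS's actual table layout: it assumes one entry per block, made up of the 20-byte SHA-1 digest plus a hypothetical 8-byte block locator, and it ignores any hash-table overhead.

```python
SHA1_DIGEST_BYTES = 20  # SHA-1 always produces a 160-bit digest


def hash_table_bytes(capacity_bytes: int,
                     block_size: int = 4096,
                     locator_bytes: int = 8) -> int:
    """Estimate the memory needed to index `capacity_bytes` of unique data.

    Assumes one entry per block: digest plus a fixed-size locator.
    The 8-byte locator is an illustrative assumption.
    """
    blocks = -(-capacity_bytes // block_size)  # ceiling division
    return blocks * (SHA1_DIGEST_BYTES + locator_bytes)


TIB = 1024 ** 4
GIB = 1024 ** 3

for size in (4096, 65536):
    estimate = hash_table_bytes(1 * TIB, block_size=size)
    print(f"{size:>6}-byte blocks: {estimate / GIB:.4f} GiB per TiB stored")
```

Under those assumptions, 1TiB of unique data in 4096-byte blocks needs about 7GiB of table, while 64KiB blocks cut that to under half a GiB, which is the capacity-versus-block-size trade-off the paragraph describes.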

DDFS is like a filter driver for Windows. Incoming data is broken up into consistent-sized chunks and then processed. Metadata is written to one file and physical data is put into a container. Matze said: "I wrote it, just like I wrote the REO software for Overland Storage [where he was CTO]. Now my team is packaging it." BridgeSTOR is privately owned and there is no venture capital funding.
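The chunk-then-store flow can be sketched in a few lines. This is an illustration of the general technique only, assuming fixed 4096-byte chunks and SHA-1 fingerprints. The `DedupeStore` class and its fields are hypothetical names, and both the metadata and the container are held in memory rather than in files.

```python
import hashlib

BLOCK = 4096  # assumed fixed chunk size


class DedupeStore:
    """Toy block-level dedupe store: one data container plus per-file metadata."""

    def __init__(self):
        self.container = bytearray()  # unique chunk data, one slot per chunk
        self.index = {}               # SHA-1 digest -> offset in the container
        self.metadata = {}            # file name -> (length, list of offsets)

    def write(self, name: str, data: bytes) -> None:
        offsets = []
        for i in range(0, len(data), BLOCK):
            chunk = data[i:i + BLOCK]
            digest = hashlib.sha1(chunk).digest()
            if digest not in self.index:
                # First sighting: append the chunk, padded to a full slot.
                self.index[digest] = len(self.container)
                self.container += chunk.ljust(BLOCK, b"\0")
            offsets.append(self.index[digest])
        self.metadata[name] = (len(data), offsets)

    def read(self, name: str) -> bytes:
        length, offsets = self.metadata[name]
        data = b"".join(bytes(self.container[o:o + BLOCK]) for o in offsets)
        return data[:length]  # trim the padding on a short final chunk


store = DedupeStore()
vm1 = b"A" * 8192 + b"B" * 4096   # three chunks, two of them identical
vm2 = b"A" * 4096 + b"C" * 100    # shares its first chunk with vm1
store.write("vm1", vm1)
store.write("vm2", vm2)

print(len(vm1) + len(vm2), "bytes written,", len(store.container), "bytes stored")
```

Restoring a file needs only its metadata entry and the container, which is why a tape carrying both can be read back on its own.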

BridgeSTOR says: "When recovering data, DDFS enables restoring from an unlimited number of 'Recovery Points'. After the 'Initial Data Synchronisation' has been completed, DDFS will build and maintain a recovery 'map' that is based on the frequency of your data deduplication operations.

"For example, running DDFS (in a CRUNCH appliance, for instance) daily will result in the availability of multiple Recovery Points from which to recover data. As in a time machine, you can roll the clock back to a time when the data to be recovered was known to be correct and stable."

Matze's demo of DDFS and LTFS on a MacBook Air with an attached LTO-5 tape drive showed the MacBook Air user mounting LTFS with DDFS. Metadata is then moved to the local disk for caching. You can then peruse tape content without touching the drive. Matze starts the DDFS service, mounts DDFS, and, when a file is scanned, it's read off the tape drive.

DDFS can be used to write deduplicated files to tape or to send them to remote sites, including "the cloud". Matze is talking to a number of NAS vendors about their potential use of DDFS. Apparently vendors like Synology and QNAP could be interested as a lot of customers are buying their boxes for backup: "DDFS natively is a good product for NAS... The CRUNCH product is a plug-in for NAS."

CommVault's Simpana product can also deduplicate data and write it to tape but - at least according to Matze - "CommVault … is way too expensive."

CRUNCH will, in the future, be able to send its deduplicated data to the cloud, Amazon, etc, via a cloud plug-in, thus bypassing tape. Matze says: "There would no longer be a need for Iron Mountain. You would use the cloud provider to keep tape images for the long term."

Matze will also enable CRUNCH to work with Microsoft's DPM, which will treat CRUNCH as if it were a tape drive. Users get an immediate view of data through a CRUNCH network share.

CRUNCH is available with a service-based model for $200/month: "It equals the price of a couple of LTO tapes a month," says Matze.
