This is Frankfurt calling: Scattered outbreaks of hot crunchiness

A mixed grill storage smorgasbord newsfeast

SNW Europe This is the second installment of El Reg's coverage of StorageNetworkWorld Europe, aka Powering The Cloud, bringing you another smorgasbord of storage goodness from the biggest storage show in the old countries. Some of it's hot, and some is even crunchy... so dip in.

BridgeSTOR and DDFS

BridgeSTOR CEO John Matze told us more about his DDFS (Data Deduplication File System) for tape and the cloud. A company statement reminds us of tape's centrality in data protection:

According to a December 2010 ESG brief titled: “NERSC – Success with Primary Data on Tape”, nearly half of the world’s data is stored on magnetic tape. Indeed, all 10 of the world’s 10 largest banks rely on tape storage for backup and archive data retention. The same can be said for each of the 10 largest telcos in the world; and eight of the 10 largest pharmaceutical firms.

Prior to BridgeSTOR, Matze wrote a first-generation dedupe product for Exar, which produced deduplication hardware. But: "Exar cut the product and team in February 2012. It lost 40 per cent of its staff overall."

So DDFS is Matze dedupe mk II.

Matze sees his DDFS technology being most useful in backing up virtual machines. He says VMware VMs and VMDKs don't deduplicate well because they are misaligned to disk blocks. DDFS takes a 512K VM header and converts it to 4096 bytes. The 2MB data containers for the VM data are also put into 4096-byte blocks.
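
To see why alignment matters for fixed-size block dedupe, here is a minimal Python sketch. It is not BridgeSTOR's code: the 512-byte header, the eight-block payload and the helper names are all assumptions made for illustration. Two images hold the same 4096-byte guest blocks in a different order; with a 512-byte header in front, every dedupe block straddles two guest blocks and nothing matches, while padding the header to 4096 bytes lets every guest block match.

```python
import hashlib

BLOCK = 4096  # dedupe block size quoted in the article


def make_block(i: int) -> bytes:
    """Build one deterministic 4096-byte guest block (illustrative data)."""
    seed = hashlib.sha256(str(i).encode()).digest()  # 32 bytes
    return seed * (BLOCK // len(seed))               # 32 * 128 = 4096


def fingerprints(data: bytes, block: int = BLOCK) -> set[str]:
    """SHA-1 fingerprint of each fixed-size block in data."""
    return {
        hashlib.sha1(data[i:i + block]).hexdigest()
        for i in range(0, len(data), block)
    }


def pad_to_block(header: bytes, block: int = BLOCK) -> bytes:
    """Zero-pad a header up to the next block boundary."""
    remainder = len(header) % block
    return header if remainder == 0 else header + b"\0" * (block - remainder)


guest_blocks = [make_block(i) for i in range(8)]
payload_a = b"".join(guest_blocks)            # image A: blocks 0..7
payload_b = b"".join(reversed(guest_blocks))  # image B: same blocks, reversed
header_a = b"A" * 512                         # hypothetical 512-byte headers
header_b = b"B" * 512

# Misaligned: each dedupe block spans the tail of one guest block and the
# head of the next, so identical guest blocks no longer hash identically.
shared_raw = fingerprints(header_a + payload_a) & fingerprints(header_b + payload_b)

# Aligned: the header fills a whole block, so guest blocks line up again.
shared_aligned = (
    fingerprints(pad_to_block(header_a) + payload_a)
    & fingerprints(pad_to_block(header_b) + payload_b)
)

print(len(shared_raw), len(shared_aligned))  # prints "0 8"
```

The padding costs a little space per image, which is the trade the article's 4096-byte conversion appears to be making.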

The BridgeSTOR technology is delivered as a virtual deduplication appliance in Windows VHD format called CRUNCH. It runs on existing backup servers or on servers where virtual machine images are stored and deduplicates all VM images - VMware, Hyper-V, and so on. VMs are "crunched" into deduplicated containers, the size of LTO tapes, which are then written to tape, with compression and encryption carried out by the tape drive.

A CRUNCH-written tape essentially contains two files, or containers: one for data and the other for metadata. The tape has all the dedupe metadata on it.

[Figure: BridgeSTOR CRUNCH process flow]

BridgeSTOR has found a 20.6:1 data reduction ratio with VMs, turning 330GB of raw data into 16GB. Your mileage may vary of course.

In DDFS “block level” deduplication, blocks of data are “fingerprinted” using a hashing algorithm (SHA-1) that produces a unique, “shorthand” identifier for each data block. DDFS allows the Hash Table to be memory resident. The amount of memory required to hold the hash table is based on the amount of physical capacity being used and the deduplication block size.
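
As a rough guide to that memory requirement, the sketch below estimates the table size from capacity and block size. The 20-byte SHA-1 digest is a fact of the algorithm; the 12-byte per-entry overhead (for a block location and reference count) and the function name are assumptions, not published DDFS figures.

```python
SHA1_DIGEST_BYTES = 20  # SHA-1 produces a 160-bit digest


def hash_table_bytes(capacity_bytes: int, block_size: int,
                     per_entry_overhead: int = 12) -> int:
    """Estimate RAM for one table entry per unique stored block.

    per_entry_overhead is a hypothetical allowance for a block location
    and reference count; it is not a published DDFS figure.
    """
    entries = -(-capacity_bytes // block_size)  # ceiling division
    return entries * (SHA1_DIGEST_BYTES + per_entry_overhead)


TIB = 1024 ** 4
GIB = 1024 ** 3
for block_size in (4096, 65536):
    needed = hash_table_bytes(1 * TIB, block_size)
    print(f"{block_size:>6}-byte blocks: {needed / GIB:.2f} GiB per TiB stored")
```

Under these assumptions a 4096-byte block size needs about 8 GiB of table per TiB of unique data, and a 64KiB block size about 0.5 GiB, which is why the block size is the main lever on memory use.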

DDFS is like a filter driver for Windows. Incoming data is broken up into consistently sized chunks and then processed. Metadata is written to one file and physical data is put into a container. Matze said: "I wrote it, just like I wrote the REO software for Overland Storage (where he was CTO). Now my team is packaging it." BridgeSTOR is privately owned and there is no venture capital funding.
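
The chunk, fingerprint and store flow described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the DDFS implementation: the class name `DedupeStore`, the in-memory "container" and the shape of the metadata are all hypothetical.

```python
import hashlib


class DedupeStore:
    """Toy store: a 'container' of unique blocks plus per-file metadata."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.container = bytearray()        # physical data, unique blocks only
        self.index: dict[bytes, int] = {}   # SHA-1 digest -> offset in container
        # file name -> ordered list of (offset, length) into the container
        self.metadata: dict[str, list[tuple[int, int]]] = {}

    def write(self, name: str, data: bytes) -> None:
        """Split data into fixed-size chunks and store only unseen ones."""
        recipe = []
        for i in range(0, len(data), self.block_size):
            chunk = data[i:i + self.block_size]
            digest = hashlib.sha1(chunk).digest()
            offset = self.index.get(digest)
            if offset is None:              # first time this block is seen
                offset = len(self.container)
                self.container += chunk
                self.index[digest] = offset
            recipe.append((offset, len(chunk)))
        self.metadata[name] = recipe

    def read(self, name: str) -> bytes:
        """Rebuild a file from its metadata and the shared container."""
        return b"".join(
            bytes(self.container[off:off + length])
            for off, length in self.metadata[name]
        )


store = DedupeStore()
vm1 = b"\x01" * 4096 + b"\x02" * 4096
vm2 = b"\x02" * 4096 + b"\x03" * 4096   # shares one block with vm1
store.write("vm1", vm1)
store.write("vm2", vm2)
print(len(vm1) + len(vm2), "bytes in,", len(store.container), "bytes stored")
```

Four blocks go in and three are stored, because the block the two images share is kept once and referenced twice in the metadata.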

BridgeSTOR says: "When recovering data, DDFS enables restoring from an unlimited number of 'Recovery Points'. After the 'Initial Data Synchronisation' has been completed, DDFS will build and maintain a recovery 'map' that is based on the frequency of your data deduplication operations.

"For example, running DDFS (in a CRUNCH appliance, for instance) daily will result in the availability of multiple Recovery Points from which to recover data. As in a time machine, you can roll the clock back to a time when the data to be recovered was known to be correct and stable."

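One way to picture recovery points is as a series of per-run maps that all point into the same pool of deduplicated blocks. The Python sketch below illustrates that idea only; the labels, function names and in-memory dictionaries are assumptions and do not describe how DDFS stores its recovery map.

```python
import hashlib

BLOCK = 4096
blocks: dict[bytes, bytes] = {}               # digest -> block data, stored once
recovery_points: dict[str, list[bytes]] = {}  # label -> ordered block digests


def take_recovery_point(label: str, data: bytes) -> int:
    """Record one recovery point; return how many new blocks were stored."""
    new_blocks = 0
    recipe = []
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        digest = hashlib.sha1(chunk).digest()
        if digest not in blocks:
            blocks[digest] = chunk
            new_blocks += 1
        recipe.append(digest)
    recovery_points[label] = recipe
    return new_blocks


def restore(label: str) -> bytes:
    """Rebuild the data exactly as it was at the given recovery point."""
    return b"".join(blocks[d] for d in recovery_points[label])


monday = b"a" * BLOCK + b"b" * BLOCK + b"c" * BLOCK
tuesday = b"a" * BLOCK + b"X" * BLOCK + b"c" * BLOCK  # one block changed

stored_monday = take_recovery_point("monday", monday)
stored_tuesday = take_recovery_point("tuesday", tuesday)
print(stored_monday, stored_tuesday)  # prints "3 1"
```

Each extra recovery point costs only the blocks that changed since an earlier run, plus its map, yet either day can still be restored in full.
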
Matze's demo of DDFS and LTFS on a MacBook Air with an attached LTO-5 tape drive showed the MacBook Air user mounting LTFS with DDFS. Metadata is then moved to the local disk for caching. You can then peruse tape content without touching the drive. Matze starts the DDFS service, mounts DDFS, and, when a file is scanned, it's read off the tape drive.

DDFS can be used to write deduplicated files to tape or to send them to remote sites, including "the cloud". Matze is talking to a number of NAS vendors about their potential use of DDFS. Apparently vendors like Synology and QNAP could be interested as a lot of customers are buying their boxes for backup: "DDFS natively is a good product for NAS... The CRUNCH product is a plug-in for NAS."

CommVault's Simpana product can also deduplicate data and write it to tape but - at least according to Matze - "CommVault … is way too expensive."

CRUNCH will, in the future, be able to send its deduplicated data to the cloud, Amazon, etc, via a cloud plug-in, thus bypassing tape. Matze says: "There would no longer be a need for Iron Mountain. You would use the cloud provider to keep tape images for the long term."

Matze will also enable CRUNCH to work with Microsoft's DPM, which will treat CRUNCH as if it were a tape drive. Users get an immediate view of data through a CRUNCH network share.

CRUNCH is available with a service-based model for $200/month: "It equals the price of a couple of LTO tapes a month," says Matze.
