
This is Frankfurt calling: Scattered outbreaks of hot crunchiness

A mixed grill storage smorgasbord newsfeast


SNW Europe This is the second installment of El Reg's coverage of StorageNetworkWorld Europe, aka Powering The Cloud, bringing you another smorgasbord of storage goodness from the biggest storage show in the old countries. Some of it's hot, and some is even crunchy... so dip in.

BridgeSTOR and DDFS

BridgeSTOR CEO John Matze told us more about his DDFS - Data Deduplication File System - for tape and the cloud. A company statement reminds us of tape's centrality in data protection:

According to a December 2010 ESG brief titled: “NERSC – Success with Primary Data on Tape”, nearly half of the world’s data is stored on magnetic tape. Indeed, all 10 of the world’s 10 largest banks rely on tape storage for backup and archive data retention. The same can be said for each of the 10 largest telcos in the world; and eight of the 10 largest pharmaceutical firms.

Prior to BridgeSTOR, Matze wrote a first-generation dedupe product for Exar, which produced deduplication hardware. But: "Exar cut the product and team in February 2012. It lost 40 per cent of its staff overall."

So DDFS is Matze dedupe mk II.

Matze sees his DDFS technology being most useful in backing up virtual machines. He says VMware VMs and VMDKs don't deduplicate well because they are misaligned to disk blocks. DDFS takes a 512K VM header and converts it to 4096 bytes. The 2MB data containers for the VM data are also put into 4096-byte blocks.
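
To see why alignment matters for fixed-block deduplication, here is a minimal Python sketch. It is not BridgeSTOR's code: the 512-byte header, the zero-padding step and every name in it are assumptions made for illustration. It counts how many 4096-byte block fingerprints a raw payload shares with a VM image, before and after the image's header is padded out to a full block.

```python
import hashlib

BLOCK = 4096  # dedupe block size quoted in the article


def fingerprints(data: bytes, block: int = BLOCK) -> set[bytes]:
    """SHA-1 digest of every fixed-size block in data."""
    return {hashlib.sha1(data[i:i + block]).digest()
            for i in range(0, len(data), block)}


def shared_blocks(a: bytes, b: bytes) -> int:
    """Number of distinct block fingerprints present in both inputs."""
    return len(fingerprints(a) & fingerprints(b))


# Hypothetical VM payload: 64 distinct 4096-byte blocks.
payload = b"".join(i.to_bytes(2, "big") * (BLOCK // 2) for i in range(1, 65))
header = b"\xff" * 512  # assumed header size, for illustration only

misaligned = header + payload                         # payload starts mid-block
realigned = header.ljust(BLOCK, b"\x00") + payload    # header padded to one block

print(shared_blocks(payload, misaligned))  # 0: every block boundary is shifted
print(shared_blocks(payload, realigned))   # 64: payload blocks line up again
```

With the header left at its odd size, every block straddles two payload blocks and nothing matches. Padding the header to the block size restores all 64 matches.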

The BridgeSTOR technology is delivered as a virtual deduplication appliance in Windows VHD format called CRUNCH. It runs on existing backup servers or on servers where virtual machine images are stored and deduplicates all VM images - VMware, Hyper-V, and so on. VMs are "crunched" into deduplicated containers, the size of LTO tapes, which are then written to tape, with compression and encryption carried out by the tape drive.

A CRUNCH-written tape essentially contains two files, or containers: one for data and the other for metadata. The tape has all the dedupe metadata on it.

BridgeSTOR CRUNCH process flow

BridgeSTOR has found a 20.6:1 data reduction ratio with VMs, turning 33GB of raw data into 16GB. Your mileage may vary of course.

In DDFS “block level” deduplication, blocks of data are “fingerprinted” using a hashing algorithm (SHA-1) that produces a unique, “shorthand” identifier for each data block. DDFS allows the Hash Table to be memory resident. The amount of memory required to hold the hash table is based on the amount of physical capacity being used and the deduplication block size.
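
As a rough illustration of how the hash table's size follows from capacity and block size, here is a hedged Python sketch. The 8-byte block location per entry is an assumption, not a BridgeSTOR figure, and real tables carry extra overhead that this ignores.

```python
import hashlib

SHA1_BYTES = 20  # a SHA-1 digest is 160 bits


def fingerprint(block: bytes) -> bytes:
    """Return the SHA-1 fingerprint of one data block."""
    return hashlib.sha1(block).digest()


def hash_table_bytes(physical_capacity: int, block_size: int,
                     location_bytes: int = 8) -> int:
    """Estimate table size: one entry per stored block.

    Each entry is a 20-byte digest plus an assumed location pointer.
    """
    entries = -(-physical_capacity // block_size)  # ceiling division
    return entries * (SHA1_BYTES + location_bytes)


TIB = 2**40
for block_size in (4096, 65536):
    gib = hash_table_bytes(TIB, block_size) / 2**30
    print(f"{block_size:>6}-byte blocks: {gib:.2f} GiB of table per TiB stored")
```

Under these assumptions, 1TiB of physical data at 4096-byte blocks needs about 7GiB of table, and larger blocks shrink that proportionally at the cost of coarser matching.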

DDFS is like a filter driver for Windows. Incoming data is broken up into consistently sized chunks and then processed. Metadata is written to one file and physical data is put into a container. Matze said: "I wrote it, just like I wrote the REO software for Overland Storage [where he was CTO]. Now my team is packaging it." BridgeSTOR is privately owned and there is no venture capital funding.
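
The split between a metadata file and a data container can be sketched as follows. This is a toy layout under stated assumptions, not the DDFS on-disk format: the JSON metadata, the `crunch` and `restore` names and the 4096-byte chunk size are all hypothetical.

```python
import hashlib
import json

CHUNK = 4096  # assumed chunk size


def crunch(data: bytes, chunk: int = CHUNK) -> tuple[bytes, bytes]:
    """Split data into fixed-size chunks and store each unique chunk once.

    Returns (metadata, container): metadata records where each logical
    chunk lives in the container; the container holds unique chunks only.
    """
    container = bytearray()
    seen = {}    # fingerprint -> (offset, length) inside the container
    layout = []  # one (offset, length) entry per logical chunk, in order
    for i in range(0, len(data), chunk):
        piece = data[i:i + chunk]
        fp = hashlib.sha1(piece).hexdigest()
        if fp not in seen:
            seen[fp] = (len(container), len(piece))
            container += piece
        layout.append(seen[fp])
    metadata = json.dumps({"chunk": chunk, "layout": layout}).encode()
    return metadata, bytes(container)


def restore(metadata: bytes, container: bytes) -> bytes:
    """Rebuild the original data from the metadata and the container."""
    layout = json.loads(metadata)["layout"]
    return b"".join(container[off:off + length] for off, length in layout)


data = b"X" * 8192 + b"Y" * 4096 + b"X" * 4096 + b"tail"
metadata, container = crunch(data)
print(len(data), len(container))  # 16388 8196
```

Because the metadata alone describes the logical layout, it can be read or cached separately from the bulk data, which is the property the two-file arrangement relies on.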

BridgeSTOR says: "When recovering data, DDFS enables restoring from an unlimited number of 'Recovery Points'. After the 'Initial Data Synchronisation' has been completed, DDFS will build and maintain a recovery 'map' that is based on the frequency of your data deduplication operations.

"For example, running DDFS (in a CRUNCH appliance, for instance) daily will result in the availability of multiple Recovery Points from which to recover data. As in a time machine, you can roll the clock back to a time when the data to be recovered was known to be correct and stable."

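One way to picture recovery points is as per-run maps over a shared pool of deduplicated blocks. The sketch below is an assumption-laden illustration, not BridgeSTOR's recovery map: the `RecoveryStore` class and its method names are hypothetical.

```python
import hashlib


class RecoveryStore:
    """Toy store: shared unique blocks plus one block map per recovery point."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes, shared by all points
        self.points = {}  # label -> ordered list of fingerprints

    def snapshot(self, label: str, data: bytes) -> None:
        """Record a recovery point, storing only blocks not already held."""
        fps = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            fp = hashlib.sha1(block).digest()
            self.blocks.setdefault(fp, block)
            fps.append(fp)
        self.points[label] = fps

    def recover(self, label: str) -> bytes:
        """Rebuild the data exactly as it was at the named recovery point."""
        return b"".join(self.blocks[fp] for fp in self.points[label])


store = RecoveryStore()
monday = b"a" * 4096 + b"b" * 4096
tuesday = b"a" * 4096 + b"c" * 4096  # only the second block changed
store.snapshot("monday", monday)
store.snapshot("tuesday", tuesday)
print(len(store.blocks))  # 3 unique blocks back two full recovery points
```

Each daily run adds only the blocks that changed, so rolling back to an earlier day means replaying that day's map rather than keeping a full copy.
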
Matze's demo of DDFS and LTFS on a MacBook Air with an attached LTO-5 tape drive showed the MacBook Air user mounting LTFS with DDFS. Metadata is then moved to the local disk for caching. You can then peruse tape content without touching the drive. Matze starts the DDFS service, mounts DDFS, and, when a file is scanned, it's read off the tape drive.

DDFS can be used to write deduplicated files to tape or to send them to remote sites, including "the cloud". Matze is talking to a number of NAS vendors about their potential use of DDFS. Apparently vendors like Synology and QNAP could be interested as a lot of customers are buying their boxes for backup: "DDFS natively is a good product for NAS... The CRUNCH product is a plug-in for NAS."

CommVault's Simpana product can also deduplicate data and write it to tape but - at least according to Matze - "CommVault … is way too expensive."

CRUNCH will, in the future, be able to send its deduplicated data to the cloud, Amazon, etc, via a cloud plug-in, thus bypassing tape. Matze says: "There would no longer be a need for Iron Mountain. You would use the cloud provider to keep tape images for the long term."

Matze will also enable CRUNCH to work with Microsoft's DPM, which will treat CRUNCH as if it were a tape drive. Users get an immediate view of data through a CRUNCH network share.

CRUNCH is available with a service-based model for $200/month: "It equals the price of a couple of LTO tapes a month," says Matze.
