Original URL: http://www.theregister.co.uk/2012/11/09/snw_europe_2/
This is Frankfurt calling: Scattered outbreaks of hot crunchiness
A mixed grill storage smorgasbord newsfeast
SNW Europe This is the second installment of El Reg's coverage of StorageNetworkWorld Europe, aka Powering The Cloud, bringing you another smorgasbord of storage goodness from the biggest storage show in the old countries. Some of it's hot, and some is even crunchy... so dip in.
BridgeSTOR and DDFS
BridgeSTOR CEO John Matze told us more about his DDFS - Data Deduplication File System - (background here) for tape and the cloud. A company statement reminds us of tape's centrality in data protection:
According to a December 2010 ESG brief titled: “NERSC – Success with Primary Data on Tape”, nearly half of the world’s data is stored on magnetic tape. Indeed, all 10 of the world’s 10 largest banks rely on tape storage for backup and archive data retention. The same can be said for each of the 10 largest telcos in the world; and eight of the 10 largest pharmaceutical firms.
Prior to BridgeSTOR, Matze wrote a first-generation dedupe product for Exar, which produced deduplication hardware. But: "Exar cut the product and team in February 2012. It lost 40 per cent of its staff overall."
So DDFS is Matze dedupe mk II.
Matze sees his DDFS technology being most useful in backing up virtual machines. He says VMware VMs and VMDKs don't deduplicate well because they are misaligned to disk blocks. DDFS takes a 512K VM header and converts it to 4096 bytes. The 2MB data containers for the VM data are also put into 4096 byte blocks.
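Why alignment matters to fixed-block dedupe can be shown with a small sketch. This is not BridgeSTOR's code: only the 4,096-byte block size comes from the description above, while the 512-byte header, the helper names and the sample data are assumptions picked for illustration. It fingerprints the same payload three ways: on its own, shifted by an odd-sized header, and with that header padded out to a full block.

```python
import hashlib

BLOCK = 4096  # block size quoted in the article


def sample_bytes(n: int) -> bytes:
    """Deterministic, non-repeating filler standing in for guest data."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(str(counter).encode()).digest()
        counter += 1
    return bytes(out[:n])


def fingerprints(data: bytes, block: int = BLOCK) -> set[str]:
    """SHA-1 fingerprint of every fixed-size block in the data."""
    return {
        hashlib.sha1(data[i:i + block]).hexdigest()
        for i in range(0, len(data), block)
    }


payload = sample_bytes(BLOCK * 8)  # eight blocks of "VM data"
header = b"\x01" * 512             # hypothetical header, not a multiple of 4,096

baseline = fingerprints(payload)
shifted = fingerprints(header + payload)                       # misaligned
padded = fingerprints(header.ljust(BLOCK, b"\x00") + payload)  # realigned

print("blocks matched when shifted:", len(baseline & shifted))
print("blocks matched when padded: ", len(baseline & padded))
```

In this toy case the shifted copy shares no fingerprints with the original, because every block boundary has moved, while the padded copy matches all eight payload blocks.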
The BridgeSTOR technology is delivered as a virtual deduplication appliance in Windows VHD format called CRUNCH. It runs on existing backup servers or on servers where virtual machine images are stored and deduplicates all VM images - VMware, Hyper-V, and so on. VMs are "crunched" into deduplicated containers, the size of LTO tapes, which are then written to tape, with compression and encryption carried out by the tape drive.
A CRUNCH-written tape contains two files or containers essentially - one for data and the other for metadata. The tape has all the dedupe metadata on it.
BridgeSTOR CRUNCH process flow
BridgeSTOR has found a 20.6:1 data reduction ratio with VMs, turning 330GB of raw data into 16GB. Your mileage may vary of course.
In DDFS “block level” deduplication, blocks of data are “fingerprinted” using a hashing algorithm (SHA-1) that produces a unique, “shorthand” identifier for each data block. DDFS allows the Hash Table to be memory resident. The amount of memory required to hold the hash table is based on the amount of physical capacity being used and the deduplication block size.
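The sizing relationship can be roughed out on the back of an envelope. The sketch below is an estimate under stated assumptions, not a figure from BridgeSTOR: a SHA-1 digest is 20 bytes, and the 8 bytes of per-entry overhead (for a block location, say) is a guess; the real DDFS table layout is not described here.

```python
SHA1_DIGEST_BYTES = 20  # SHA-1 produces a 160-bit digest


def hash_table_bytes(capacity_bytes: int, block_size: int,
                     overhead_per_entry: int = 8) -> int:
    """Estimate table size: one entry per stored block.

    overhead_per_entry is a hypothetical allowance for a block location.
    """
    blocks = -(-capacity_bytes // block_size)  # ceiling division
    return blocks * (SHA1_DIGEST_BYTES + overhead_per_entry)


TIB = 1024 ** 4
GIB = 1024 ** 3

for block_size in (4096, 65536):
    size = hash_table_bytes(TIB, block_size)
    print(f"{block_size}-byte blocks: {size / GIB:.2f} GiB of table per TiB stored")
```

Under these assumptions a tebibyte of unique 4,096-byte blocks needs about 7GiB of table, and a block size 16 times larger needs a table 16 times smaller, which is the trade-off between dedupe granularity and memory.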
DDFS is like a filter driver for Windows. Incoming data is broken up into consistent sized chunks and then processed. Metadata is written to one file and physical data is put into a container. Matze said: "I wrote it, just like I wrote the REO software for Overland Storage (where he was its CTO). Now my team is packaging it." BridgeSTOR is privately owned and there is no venture capital funding.
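The chunk, fingerprint and store flow can be sketched in a few lines. This is a generic fixed-block dedupe illustration, not Matze's implementation: the function names, the in-memory dict standing in for the metadata file and the bytearray standing in for the data container are all assumptions.

```python
import hashlib


def dedupe(data: bytes, block_size: int,
           index: dict[str, tuple[int, int]],
           container: bytearray) -> list[str]:
    """Store unseen blocks in the container and return the file's recipe.

    index maps a SHA-1 fingerprint to (offset, length) in the container.
    The recipe is the ordered list of fingerprints needed to rebuild the file.
    """
    recipe = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        fp = hashlib.sha1(block).hexdigest()
        if fp not in index:
            index[fp] = (len(container), len(block))
            container.extend(block)
        recipe.append(fp)
    return recipe


def restore(recipe: list[str],
            index: dict[str, tuple[int, int]],
            container: bytearray) -> bytes:
    """Rebuild a file by looking up each fingerprint in the container."""
    parts = []
    for fp in recipe:
        offset, length = index[fp]
        parts.append(bytes(container[offset:offset + length]))
    return b"".join(parts)


index: dict[str, tuple[int, int]] = {}
container = bytearray()

# Two made-up "VM images" that share their first two blocks.
vm_a = b"A" * 8192 + b"B" * 4096
vm_b = b"A" * 8192 + b"C" * 4096

recipe_a = dedupe(vm_a, 4096, index, container)
recipe_b = dedupe(vm_b, 4096, index, container)

print("raw bytes:   ", len(vm_a) + len(vm_b))
print("stored bytes:", len(container))
```

Here the recipes and the index play the part of the metadata file, and the container holds each unique block once, so both are needed to get the data back.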
BridgeSTOR says: "When recovering data, DDFS enables restoring from an unlimited number of 'Recovery Points'. After the 'Initial Data Synchronisation' has been completed, DDFS will build and maintain a recovery 'map' that is based on the frequency of your data deduplication operations.
"For example, running DDFS (in a CRUNCH appliance, for instance) daily will result in the availability of multiple Recovery Points from which to recover data. As in a time machine, you can roll the clock back to a time when the data to be recovered was known to be correct and stable."
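One way to picture such a recovery map is as a dated list of block recipes, with a restore picking the latest run on or before the chosen date. The sketch below is purely illustrative: the RecoveryMap class, its methods and the sample fingerprints are hypothetical, and BridgeSTOR has not said how its map is built.

```python
import bisect
from datetime import date


class RecoveryMap:
    """Hypothetical map from run date to the block recipe captured that day."""

    def __init__(self) -> None:
        self._dates: list[date] = []
        self._recipes: list[list[str]] = []

    def record(self, run_date: date, recipe: list[str]) -> None:
        # Runs are assumed to be recorded in chronological order.
        self._dates.append(run_date)
        self._recipes.append(list(recipe))

    def recipe_as_of(self, when: date) -> list[str]:
        """Return the recipe from the latest run on or before the given date."""
        pos = bisect.bisect_right(self._dates, when)
        if pos == 0:
            raise LookupError("no recovery point on or before that date")
        return self._recipes[pos - 1]


points = RecoveryMap()
points.record(date(2012, 11, 5), ["fp1", "fp2"])
points.record(date(2012, 11, 6), ["fp1", "fp3"])
points.record(date(2012, 11, 8), ["fp4", "fp3"])

# Roll the clock back to 7 November: the run from the 6th is the one to use.
print(points.recipe_as_of(date(2012, 11, 7)))
```

Because each recovery point is only a list of fingerprints, keeping many of them costs little as long as the underlying blocks are shared.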
Matze's demo of DDFS and LTFS on a MacBook Air with an attached LTO-5 tape drive showed the MacBook Air user mounting LTFS with DDFS. Metadata is then moved to the local disk for caching. You can then peruse tape content without touching the drive. Matze starts the DDFS service, mounts DDFS, and, when a file is scanned, it's read off the tape drive.
DDFS can be used to write deduplicated files to tape or to send them to remote sites, including "the cloud". Matze is talking to a number of NAS vendors about their potential use of DDFS. Apparently vendors like Synology and QNAP could be interested as a lot of customers are buying their boxes for backup: "DDFS natively is a good product for NAS... The CRUNCH product is a plug-in for NAS."
CommVault's Simpana product can also deduplicate data and write it to tape but - at least according to Matze - "CommVault … is way too expensive."
CRUNCH will, in the future, be able to send its deduplicated data to the cloud, Amazon, etc, via a cloud plug-in, thus bypassing tape. Matze says: "There would no longer be a need for Iron Mountain. You would use the cloud provider to keep tape images for the long term."
Matze will also enable CRUNCH to work with Microsoft's DPM, which will treat CRUNCH as if it were a tape drive. Users get an immediate view of data through a CRUNCH network share.
CRUNCH is available with a service-based model for $200/month: "It equals the price of a couple of LTO tapes a month," says Matze.
Sysadmins use storage too...
It is not necessary to have leading edge technology products across the storage spectrum. Sometimes just having products that work and do things that no other vendor does is enough. So it is with cross-vendor, end-to-end system management tools.
SolarWinds software provides network, systems, virtualisation and storage resource management, but is typically bought by end users rather than CIOs. Our impression is that a general IT sysadmin can use the software to troubleshoot problems without having to become a Fibre Channel specialist or a VMware expert.
The suite includes network management (performance monitoring, traffic analysis, configuration management, user device tracking, etc), a storage manager, a log and event manager, server and application management, a virtualisation manager, patch management, mobile admin and a web help desk. Many of the parts came through acquisition, and they work together.
We're told by SolarWinds that shared CPU, memory, network and storage resources create contention, and that storage problems are difficult to diagnose from a system level. The visibility provided by vendors' own tools is poor and limited. If you don't know whether a physical server, an ESXi host or a virtual machine is taking up most of your storage bandwidth you can use SolarWinds' tools to drill down to the physical storage, map it upstream to physical and virtual servers and see where the bandwidth hogging is taking place.
SolarWinds sells lots of low-cost products via a try-and-buy web download scheme, and over a million people have downloaded its free tools.
Virtual Instruments (VI) takes the opposite tack to SolarWinds. Where SolarWinds is broad and relatively shallow, Virtual Instruments is narrow and deep. It produces a leading edge technology product that is tightly focused on Fibre Channel (FC) storage fabrics and provides deep inspection capabilities looking into what's happening across a fabric. There's background info here.
Skip Bacon, Virtual Instruments' CTO
At a briefing in Frankfurt given by VI chief technology officer Skip Bacon, we were told VI revenues were doubling annually and closer to trebling annually in Europe.
He said VI had both big blue chip customers with FC SANs and lots of small and medium enterprises (SMEs) with high criticality apps dependent on their FC SANs: "SAN performance and availability criticality is key for a VI customer," he said.
It's all about measuring performance but not just of the FC part of a SAN-to-end user link, says Bacon. In the storage world VI looks at I/Os through the system, and not just IOPS. It measures FC data traffic at fine granularity, using a 2.5ns clock, not hourly - that's far too slow. Bacon says: "You have to measure true performance and measure it at very high frequency and very low latency. But we also talk to switches and other devices to correlate what's happening there."
VI may support AIX in 2013 and Hyper-V support could come if there is enough demand from its customers.
VI works with virtually all the vendors active in FC SANs except one: Brocade. That relationship may improve once Brocade CEO Michael Klayko leaves. Bacon says that the "anti-VI feelings" had surfaced at a high level inside Brocade.
Sort out that back-end
We were briefed by Sepaton's Tim Butchart, VP for EMEA, and talked about Sepaton's latest S2100-ES3 deduplicating array product, one with up to eight compute heads with back-end HDS HUS storage. The ES3 adds 25 per cent more capacity than Sepaton's previous top-end product.
Sepaton is winning business, Butchart said, because existing in-place dedupe systems can't scale enough as the amount of data to be backed up rises and rises. Its systems are needed in HP's product roster because the HP in-house StoreOnce deduplication backup to disk product cannot scale up as Sepaton's systems do and can't perform as well either. StoreOnce can't replace Sepaton.
A Database Extreme feature enables databases to multi-stream data to the Sepaton boxes, shortening the database backup window. This can be turned on by a database admin person. Butchart kindly told us that an EMC Best Practices guide for its deduplicating arrays says such database multi-streaming should be turned off.
He mentioned roadmap items, such as:
- NFS, CIFS and NDMP interface support.
- A hybrid dedupe environment is coming in which post-process deduplication may feature. This allows the fastest possible data ingest speed.
- It will be possible to upload deduped data from a data centre ES3 to an ES3 located in the cloud.
- Improved manageability.
We came across some HDS development nuggets. The high-end VSP has hardware assistance from seven ASICs (Application-Specific Integrated Circuits). The recent HUS VM has a single ASIC combining the functions of the VSP's seven and delivering around 50 per cent of the performance of a low-end VSP array.
HUS VM can be clustered. The VSP's high-availability controller is used to federate two HUS VMs. This will be enhanced to support four HUS VMs with full IO distribution between the nodes. The VSP itself will be clustered or federated as well.
Lastly, HUS VM means that common technology is being used between the high-end VSP array and mid-range HUS VM. We might expect a single technology range to emerge, one offering combined file, block and object storage.
Intel and Nevex
Intel has bought flash caching software supplier Nevex, which had a relationship with TMS, before IBM bought TMS. Nevex's logo has sprouted "an Intel Company" message, not that Nevex has released this factoid. The deal first became known in August and the 451 Research Group has a report about it.
We asked a Nevex contact about this and got no reply. Why would Intel want a flash caching software product unless it has a flash cache hardware product to sell? Expect Intel to be more aggressive in this space in the months ahead.
You know what Big Data needs? Tape
We were briefed by two Active Archive Alliance people. David Cerf, EVP for corporate and business development at Crossroads, and Vint Cerf's sibling, said: "Big Data adoption at scale will be seriously inhibited unless it adopts tape."
Peter Faulhaber, president at Fujifilm Recording Media USA, said: "It's not either tape or disk. Archiving has to include both disk and tape, and sometimes SSD in an optimised infrastructure and that's what the Active Archive Alliance provides."
The topics we've covered here are the broad and effective problem-solving system management approach of SolarWinds, and the deep and technical Fibre Channel fabric inspection and storage management of Virtual Instruments - with this pair of companies being two sides of roughly the same coin.
We looked at two examples of deduplication: BridgeSTOR, which is trying to establish itself, and Sepaton, intent on remaining top of the HP and general high-end enterprise dedupe heap.
An insight into HDS's VSP-HUS VM intentions has been garnered and Intel has quietly moved to get its own software flash caching technology. What it will do with that remains to be seen, and will hopefully become clearer by SNW Europe 2013, which will once again be held in Frankfurt. ®