EMC: Backup is broken, do you hear me? Now buy this other thing
It's copy data management, dude, not data protection
Comment EMC says backup is broken, and that infrastructure should now be able to protect itself using intelligence it possesses. It is hoping users will move towards copy data management and away from leaving idle data in silos waiting for something to happen.
The realisation that EMC's backup thinking had changed started with a deeper take on EMC's recent re-org from its Virtual Geek, Chad Sakac, SVP for global pre-sales. Then we heard from one of its CTOs, Stephen Manley, and then finally perused a blog by Actifio chief marketing mouthpiece Michael Troiano.
So ... EMC recently combined its VMAX and VNX product lines in a new Enterprise and Midrange Systems Division (EMSD). It took VPLEX and RecoverPoint out of EMSD and put them in with the BRS group of products in a new Data Protection and Availability Division (DPAD).
We, El Reg's storage desk, thought this was just to rebalance the relative sizes of EMSD and DPAD. Wrong: it reflects changed thinking in EMC about having multiple copies of data in different silos.
Here's what Chad Sakac has to say about the formation of DPAD:
The organisational change and name change reflects the fact that backup, recovery, continuous availability, snaps, replicas - and their application integration - are all reflective of a common continuum of copy management of the things stored on primary storage. ...
The fundamental driver behind backup, recovery, availability, snapshots, protection (all wrapped up into the applications themselves) - these are all fundamentally about "copy management" - managing point-in-time copies and representations of information that primary systems are serving up in a transactional way.
There are two classes of data here: production data generated by primary transactional systems and all the copies of it used for analysis, backup, disaster recovery, availability, snapshots, clones, replicas, archives, test and development - nine separate categories.
Keep this in mind and watch a video interview with Stephen Manley, the then BRS chief tech officer, and now, we suppose, DPAD's CTO.
Manley says backup is broken because it can no longer cope with the vast amounts of data it has to protect. He identifies three waves of backup thinking:
- Server-centric backup - the traditional paradigm,
- Infrastructure-centric data protection - what we are switching to,
- Cloud service-centric data management - the coming wave.
Stephen Manley's three data protection waves
Manley pinpoints two issues with backup. Firstly, it is slow and doesn't scale well and, secondly, the data is held in a proprietary format, largely idle, and cannot be used for anything else.
With snaps, clones and replicas the IT infrastructure can start protecting itself, using intelligence in virtualised servers, applications and storage arrays: "Let the infrastructure protect itself and that way I can do backup, recovery and archive because I've got the data in the format where it's usable," he says.
This is better but not enough, with Manley saying:
These are all just versions ... copies of my data ... They shouldn't just be idle, waiting for the disaster, waiting for the problem to happen. I could use these for test and development. I could use these for analytics and data mining. I could use these for data distribution because I've got sites all around the world. These copies are viable for something more than just waiting for something to go wrong.
He has blogged about the general problem here.
Put the Manley and Sakac views together and we have EMC embracing the idea of copy data management.
Copy data management
Having different copies of data in different silos is a great way of selling hardware silos, and EMC has a lot of data storage silo products to sell. But it can read the runes, sniff out what startup competitors are doing and see if it makes sense for customers down the road. Copy data management is going to happen, if you go by what these various EMC execs have said.
If that is right, then every supplier of backup, archive, analysis, availability, disaster recovery, test and development data copies will be affected, with their data copy-generating function potentially taken over by some centralised, more space-efficient and optimised product technology that sprays out copies to users wherever they are.
There is one startup offering such copy data management already: Actifio. It takes a single copy of production data and then distributes virtual copies as needed for analysis, archiving, availability, backup, disaster recovery, distribution, test and development. Since these are virtual copies they take up much less space than physical copies, so a customer's spending on storage arrays and copy-generation software, such as backup applications, goes down.
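For readers who want to see why virtual copies are so much smaller, here is a minimal Python sketch of the general copy-on-write idea. It is not Actifio's implementation, which the company has not described here; the class names `GoldenCopy` and `VirtualCopy`, the 4KB block size and the 1,000-block data set are all assumptions made up for illustration.

```python
# Illustrative only: a toy copy-on-write model, not any vendor's actual design.

class GoldenCopy:
    """One physical copy of production data, held as fixed-size blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)


class VirtualCopy:
    """A writable view of a golden copy that stores only the blocks it changes."""

    def __init__(self, golden):
        self.golden = golden
        self.overrides = {}  # block index -> this copy's private replacement block

    def read(self, index):
        # Serve a private block if this copy has changed it, else the shared one.
        return self.overrides.get(index, self.golden.blocks[index])

    def write(self, index, data):
        # Writes never touch the golden copy; they land in private storage.
        self.overrides[index] = data

    def private_bytes(self):
        # Extra space consumed by this copy beyond the shared golden copy.
        return sum(len(block) for block in self.overrides.values())


# Hypothetical data set: 1,000 blocks of 4KB each.
golden = GoldenCopy([b"A" * 4096 for _ in range(1000)])
physical_bytes = sum(len(block) for block in golden.blocks)

test_dev = VirtualCopy(golden)   # a copy for test and development
analytics = VirtualCopy(golden)  # a copy for analysis

test_dev.write(7, b"B" * 4096)   # test/dev changes a single block
```

In this toy model the test and development copy costs one block of extra space and the analytics copy costs none, whereas two physical copies would each need the full data set again. Real products have to deal with much more, such as application consistency and refreshing the golden copy.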
Actifio is currently on v6.0 of its main product technology and says it is growing very fast. EMC has grouped its data copy-related resources in the new DPAD organisation and will, we think, commence the development of a product strategy designed to bring copy data management functionality to fruition, possibly as a data service plugged in to ViPR.
El Reg storage desk thinks it will take at least six months for any product to appear, if not longer, and the development-scoping exercise may conclude that acquiring technology is preferable to developing it in-house. ®