We are entering the data-aware infrastructure era
Storage systems analyse and operate on their own data
Last week, during SFD8, I got to meet with two incredibly interesting storage startups: Cohesity and Coho Data.
They do different jobs, since they target two different markets (primary and secondary storage) but both have developed one of the most compelling features you can find today on the market… And I’m sure they’ll be followed by others.
More than storing data
I usually describe modern storage systems with this slide:
Differentiation and value come from analytics, while data services are table stakes. Everyone has data services.
Now, after these two presentations, I’ll have to make a change because things are evolving. In practice, these vendors can run code inside their systems to operate on stored data. At the end of the day this could be considered a smart data service, one that opens the door to an endless number of opportunities.
All these next-generation storage systems have extensive API sets, and now they can be programmed from the inside too. Part of their resources can be used for operations that are usually performed by other components of the stack, but with more efficiency and more control over both the data and the behavior of the storage system.
The real advantage
Having access to programmable storage has multiple advantages. Analytics can be taken to the next level while system administration can be hugely simplified as well!
For example, think about an automatic search for sensitive data that raises a flag (and takes an action) every time a particular kind of new file lands in your system (e.g., checking for credit card numbers not stored in an encrypted format). Or think about converting data on the fly (e.g., normalizing all the log files that are sent to a specific directory). The only limit here is your imagination.
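To make the first example more concrete, here is a minimal Python sketch of what such a check might look like. It is not taken from either vendor's product: the `on_file_landed` hook, its arguments, and the returned dictionary are hypothetical names invented for illustration, and I'm assuming the storage system hands the hook the file's path and text content. The check itself uses a regular expression to find candidate numbers and the Luhn checksum to discard random digit strings.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return the digit strings in text that look like valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def on_file_landed(path: str, content: str) -> dict:
    """Hypothetical hook: assumed to be called by the storage system for each new file."""
    hits = find_card_numbers(content)
    return {"path": path, "flagged": bool(hits), "matches": len(hits)}
```

A real deployment would also need to decide what "an action is taken" means (quarantine, alert, encrypt), and a plain-text scan like this one says nothing about files that are already encrypted or stored in binary formats.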
The demos shown at the event were impressive (see the videos here and here) and they put storage administration and management in a totally different light. Potentially, you can build almost any functionality into the storage itself, which can easily bring cost savings and better agility to operations as well as to the business.
It’s time to awaken the Dev in your Ops
Every storage admin on this earth has written scripts during their working life (and many of them still use Excel to manage their storage, don’t they… but that is another story). Now is the time to take those coding skills to the next level.
Closing the circle
I’m thrilled. The potential is huge, especially because it’s not just the data but the integration between the storage system and the data that will make the difference. Programmable means customizable or, even better, personalized.
Since most of the systems out there are now based on Linux (or some sort of Unix-like OS), I don’t think enabling such a feature will be very difficult (if your vendor has already exposed a decent set of APIs). And it will change the way we interact with storage and data… allowing us to get the most out of them with minimal effort.
We are already used to app-aware storage (with systems which are well aware of the OS, hypervisors or apps they are dealing with), but now we are heading towards data-aware storage where efficiency, agility and pro-activeness are taken to the next level.
Now, the problem is re-designing my slide… how about this?