We trust computers to fly jets... why not trust them with our petabytes?
Wait, hold on, software-defined storage ain't so crazy
Storagebod blog
Listening to The Register's latest Speaking In Tech podcast got me thinking a bit more about the craze for software-defined networking, storage and whatever comes next. I wondered whether it is a real thing or just a load of hype.
For the time being I’ve decided to treat software-defined stuff as a real thing, or at least as something that may become a real thing.
So, software-defined storage?
The role of the storage array is changing; in fact, it's simplifying. That box of drives will store stuff that you need to have around just in case or for future reference. It's for data that needs to persist. And that may sound silly to have to spell out, but basically what I mean is that the storage array is not where you are going to process transactions. Your transactional storage will be as close to the compute nodes as possible, or at least this appears to be the current direction of travel.
But there is also a certain amount of discussion and debate about ensuring quality of service from storage systems to guarantee performance and how we implement it in a software-defined manner: how can we hand off administration of the data to autonomous programs?
This all comes down to services, discovery and a subscription model. Storage devices will have to publish their capabilities via some kind of software interface; applications will use this to find out what services and capabilities an array has and then subscribe to them.
So a storage device may publish its available capacity, IOPS and latency, but it could also reveal that it can do snapshots, replication, and thick and thin provisioning. It could also publish a cost associated with these.
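To make the publish-and-discover idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Capabilities` record, the `Registry` class, the field names and the sample figures are assumptions made for illustration, not any vendor's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """What a (hypothetical) storage device advertises about itself."""
    name: str
    capacity_gb: int
    iops: int
    latency_ms: float
    features: frozenset      # e.g. {"snapshots", "replication"}
    cost_per_gb: float       # illustrative pricing unit


class Registry:
    """In-memory stand-in for whatever discovery service would exist."""

    def __init__(self):
        self._devices = {}

    def publish(self, caps):
        # A device (re)announces its capabilities under its own name.
        self._devices[caps.name] = caps

    def discover(self, min_capacity_gb=0, min_iops=0,
                 max_latency_ms=float("inf"), features=()):
        # Return devices that meet every requirement, cheapest first.
        wanted = frozenset(features)
        matches = [
            c for c in self._devices.values()
            if c.capacity_gb >= min_capacity_gb
            and c.iops >= min_iops
            and c.latency_ms <= max_latency_ms
            and wanted <= c.features
        ]
        return sorted(matches, key=lambda c: c.cost_per_gb)


registry = Registry()
registry.publish(Capabilities("array-a", 10_000, 50_000, 2.0,
                              frozenset({"snapshots", "replication"}), 0.10))
registry.publish(Capabilities("array-b", 5_000, 200_000, 0.5,
                              frozenset({"snapshots"}), 0.40))

# An application asking for snapshot-capable storage with at least 100k IOPS.
candidates = registry.discover(min_iops=100_000, features=["snapshots"])
```

The point of the sketch is only that the application queries advertised capabilities and costs rather than being hand-allocated a LUN by an administrator.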
Applications, application developers and support teams will then decide which services to subscribe to: perhaps a required capacity and IOPS level, perhaps the array-based snapshots, while doing the replication at the application layer.
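That pick-and-mix subscription might look something like the following sketch. The `subscribe` function, the `Subscription` record and the dictionary keys are all invented for illustration; the only idea taken from the text is that some services are taken from the array while others are deliberately handled by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    capacity_gb: int
    iops: int
    array_services: frozenset   # services the array will provide
    app_services: frozenset     # services the application handles itself


def subscribe(published, capacity_gb, iops, wanted, in_app=()):
    """Build a subscription against a device's published capabilities.

    `published` is a hypothetical dict with the keys "capacity_gb",
    "iops" and "features"; `in_app` names the wanted services the team
    has chosen to implement at the application layer instead.
    """
    wanted = frozenset(wanted)
    in_app = frozenset(in_app) & wanted
    from_array = wanted - in_app

    if capacity_gb > published["capacity_gb"] or iops > published["iops"]:
        raise ValueError("device cannot meet the requested capacity or IOPS")
    missing = from_array - published["features"]
    if missing:
        raise ValueError(f"device does not offer: {sorted(missing)}")

    return Subscription(capacity_gb, iops, from_array, in_app)


published = {
    "capacity_gb": 10_000,
    "iops": 50_000,
    "features": frozenset({"snapshots", "replication", "thin"}),
}

# Take snapshots from the array, but replicate at the application layer.
sub = subscribe(published, capacity_gb=500, iops=2_000,
                wanted={"snapshots", "replication"},
                in_app={"replication"})
```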
Applications will have much more control over the storage they use. They will decide whether certain data is pinned to local solid-state drives or never gets anywhere near flash, and whether it needs storage that is brilliant at sequential access or at random access. An application may also carry requirements for recovery time objectives (RTO) and recovery point objectives (RPO), allowing it to decide which transactions can be lost and which need to be committed now.
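As a rough sketch of the kind of decision an application could make for itself, the Python below maps a per-dataset policy to a storage tier and a commit behaviour. The `DataPolicy` fields, the two tiers and the rules are deliberately simplistic assumptions for illustration; real placement logic would weigh far more than this.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOCAL_SSD = "local-ssd"   # flash next to the compute node
    ARRAY = "array"           # the shared box of drives


@dataclass(frozen=True)
class DataPolicy:
    random_access: bool       # False means mostly sequential I/O
    latency_sensitive: bool
    rpo_seconds: float        # how much recent data may be lost


def choose_tier(policy):
    # Assumed rule: only latency-sensitive, random-access data earns flash.
    if policy.latency_sensitive and policy.random_access:
        return Tier.LOCAL_SSD
    return Tier.ARRAY


def must_commit_now(policy):
    # Assumed rule: an RPO of zero means no transaction may be lost,
    # so each write has to be committed before it is acknowledged.
    return policy.rpo_seconds == 0


orders = DataPolicy(random_access=True, latency_sensitive=True, rpo_seconds=0)
logs = DataPolicy(random_access=False, latency_sensitive=False, rpo_seconds=300)
```

Under these assumed rules the order data would be pinned to local flash and committed synchronously, while the logs would go to the array and could be flushed in batches.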
And as this happens, the data centre becomes something that is managed as a whole, as opposed to a brainless silo of components. I think this is a topic that I’m going to keep coming back to over the coming months. ®