Why storage needs Quality of Service
Makes shared storage play nicely
The idea is that once total demand for storage performance exceeds the system's ability to deliver IOPS or Mbps, the system stops granting I/O requests on a first-come, first-served basis and instead ensures each server gets its minimum number of IOPS. After that, it uses any remaining headroom to deliver additional IOPS to high-priority workloads.
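As a rough illustration of that two-step allocation, here is a minimal Python sketch. It is not any vendor's algorithm: the `Workload` fields, the `allocate_iops` function and the convention that a higher `priority` number means a more important workload are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    min_iops: int     # guaranteed floor (hypothetical field)
    demand_iops: int  # what the workload is currently asking for
    priority: int     # assumption: higher number = more important


def allocate_iops(workloads, capacity_iops):
    """Return {name: granted IOPS} for one scheduling interval."""
    # Step 1: give every workload its guaranteed minimum
    # (but never more than it is actually asking for).
    grants = {w.name: min(w.min_iops, w.demand_iops) for w in workloads}
    headroom = capacity_iops - sum(grants.values())
    if headroom < 0:
        raise ValueError("guaranteed minimums exceed system capacity")

    # Step 2: hand out whatever is left, highest priority first.
    for w in sorted(workloads, key=lambda w: w.priority, reverse=True):
        extra = min(w.demand_iops - grants[w.name], headroom)
        grants[w.name] += extra
        headroom -= extra
    return grants


workloads = [
    Workload("db", min_iops=4000, demand_iops=8000, priority=3),
    Workload("mail", min_iops=2000, demand_iops=3000, priority=2),
    Workload("backup", min_iops=1000, demand_iops=5000, priority=1),
]
grants = allocate_iops(workloads, capacity_iops=10_000)
print(grants)  # {'db': 7000, 'mail': 2000, 'backup': 1000}
```

In this made-up case total demand (16,000 IOPS) exceeds capacity (10,000), so every workload gets its floor and the database, as the highest-priority workload, takes all 3,000 IOPS of headroom.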
The incorporation of Flash is also important in making storage QoS feasible, because caching items in Flash makes many more IOPS available for prioritisation.
The final key element is automation, according to Reichart. "Actually setting QoS parameters is quite another matter," he says.
"Typically the metrics would be response time, but to get, say, sub-5ms for a database is a very complex task. You could have to play with 20 different parameters. Even then it's a moving target because once you have set up the QoS you want, another application could come in and you have to start again from scratch.
"What you want is automated or semi-automated systems that are self-optimising so the administrators can just define the requirements and let the system do the rest. It also needs more reporting and monitoring – which LUNs use which storage, which application is on which storage tier.
"The alternative is that people may even de-consolidate or look for point solutions, which is clearly inefficient and leads to over-provisioning."
That means capacity planning and modelling, with careful attention to all the performance data that your storage systems are already generating.
Start by modelling the physical workload capacity, then model the hosts; for example, you can say a database server needs X number of I/Os.
Then you can begin to define monitoring policies and migrate high-performance hosts onto a high-performance storage tier or low-performance hosts onto low-performance storage.
A good technique can be to divide your primary storage into tiers, typically for high, medium and low performance. Next, you define service-level agreements for each tier: how many I/Os the storage can handle, what limit it should have on latency, what availability levels it should offer and so on.
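To make the tiering idea concrete, the following Python sketch models three tiers with per-tier service levels and places hosts on the cheapest tier that satisfies them. Every number, field name and the `place_hosts` function itself are illustrative assumptions, not figures or tools from any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierSLA:
    name: str
    max_iops: int            # I/Os per second the tier can handle
    max_latency_ms: float    # latency limit promised by the tier
    availability_pct: float  # availability level promised by the tier


@dataclass(frozen=True)
class Host:
    name: str
    required_iops: int       # "a database server needs X I/Os"
    max_latency_ms: float    # slowest response the host can tolerate


# Hypothetical tiers, ordered cheapest first.
TIERS = [
    TierSLA("low", 5_000, 20.0, 99.0),
    TierSLA("medium", 20_000, 10.0, 99.9),
    TierSLA("high", 100_000, 5.0, 99.99),
]


def place_hosts(hosts, tiers):
    """Map each host to the cheapest tier that meets its latency
    requirement and still has enough IOPS budget left."""
    remaining = {t.name: t.max_iops for t in tiers}
    placement = {}
    # Place the most latency-sensitive hosts first.
    for h in sorted(hosts, key=lambda h: h.max_latency_ms):
        for t in tiers:
            fits_latency = t.max_latency_ms <= h.max_latency_ms
            fits_iops = remaining[t.name] >= h.required_iops
            if fits_latency and fits_iops:
                placement[h.name] = t.name
                remaining[t.name] -= h.required_iops
                break
        else:
            placement[h.name] = None  # nothing fits: flag for a human
    return placement


hosts = [
    Host("db", required_iops=15_000, max_latency_ms=5.0),
    Host("web", required_iops=3_000, max_latency_ms=15.0),
    Host("archive", required_iops=500, max_latency_ms=50.0),
]
print(place_hosts(hosts, TIERS))
```

With these invented numbers the database lands on the high tier, the web server on medium and the archive host on low. A host that no tier can satisfy is mapped to `None` rather than being forced somewhere, leaving that decision to an administrator.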
Ideally, you then want to have the system do as much as possible of the repetitive grunt-work of data movement for you – and thankfully, the tools to automate pretty much all of it already exist.
Matthiesen warns, however, that although some of these processes can and should be automated, the tools involved are powerful and can be dangerous. It is a case of great power coming with great responsibility – and great risk.
For example, moving a host from one tier to another can involve hundreds of megabytes and require lots of I/O, and just moving that amount of data around is going to affect your other systems.
"You have to go with caution because this is the core of your company," Matthiesen says. "You can do horrible things moving data around. It needs great care.
"Major configuration changes still need human knowledge because it takes business understanding as well as technical insight." ®