Why storage needs Quality of Service
Makes shared storage play nicely
Storage consolidation looks great on paper. Direct-attached storage is notorious for its inefficiency, with some arrays being just 40 per cent occupied or even less.
Providing an Oracle database with 10,000 IOPS could mean aggregating dozens of 15,000 RPM drives, and unless the database is several terabytes in size that is a lot of wasted space.
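To put rough numbers on that, here is a back-of-the-envelope sketch. The per-drive figures are assumptions for illustration, not figures from the article: a 15,000 RPM drive is taken to deliver roughly 180 random IOPS, and the 300 GB capacity and the 2 TB database are hypothetical.

```python
import math

# Assumed figures, not from the article: roughly 180 random IOPS per
# 15,000 RPM drive, and a hypothetical 300 GB of capacity per drive.
IOPS_PER_DRIVE = 180
CAPACITY_PER_DRIVE_GB = 300


def drives_for_iops(target_iops, iops_per_drive=IOPS_PER_DRIVE):
    """Drives needed to reach target_iops, ignoring RAID and cache effects."""
    return math.ceil(target_iops / iops_per_drive)


def utilisation(db_size_gb, drive_count, capacity_gb=CAPACITY_PER_DRIVE_GB):
    """Fraction of the raw capacity that the database actually occupies."""
    return db_size_gb / (drive_count * capacity_gb)


drives = drives_for_iops(10_000)
raw_gb = drives * CAPACITY_PER_DRIVE_GB
print(f"{drives} drives, {raw_gb} GB raw")  # 56 drives, 16800 GB raw
print(f"A 2,000 GB database fills {utilisation(2_000, drives):.0%} of it")  # 12%
```

Under those assumptions the spindles are bought for performance, and nearly nine-tenths of the capacity that comes with them sits empty.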
The alternative is shared storage, probably with virtualisation and thin provisioning to allocate physical disk capacity more efficiently, and perhaps with a Flash tier or cache to boost performance.
As well as reducing wastage, shared storage can also bring other advantages, not least a reduction in the number of points of management.
But what happens if one of your clients or applications doesn't play well with its fellows – if it is badly behaved and greedy and doesn't realise that in shared storage “shared” is the operative word?
In many systems, it is all too easy for one application to become the bully in the playground, grabbing too much for itself and leaving the other children crying in the dust.
“Quality of service is more crucial than people give it credit for. It's a small thing but without it a lot of the value propositions of shared storage go away,” says John Rollason, NetApp's director of product, solutions and alliances marketing.
“Essentially, if you don't have QoS on a shared storage platform, you can't guarantee overall QoS when users move to a virtualised environment. Virtualisation also makes the I/O a lot more random.”
A common example is when applications were not designed to share and have different access patterns, says Alex D’Anna, director of solutions consulting EMEA at Virtual Instruments.
“A really interesting use-case is service-hungry applications,” he says. He cites the example of a customer which had problems with its crucial SAP installation, despite apparently having plenty of SAN capacity to hand.
“SAP is there to help you manufacture, but you also need data warehousing and business analytics for forecasting. The amazing thing to us was that the customer had completely different read/write patterns and the data warehousing was completely eating up its 8Gbps Fibre Channel SAN,” D’Anna says.
He adds that the challenge is magnified once you move into the cloud. “With cloud storage, people are looking for ease of provisioning and so on. We work on the assumption that a share-everything philosophy will eventually dominate. On that platform you need a picture of what is happening,” he says.
“For example, when there are performance problems people ask to be put back onto dedicated storage. But in the cloud you can't do that any more.”
Feed the hungry
Frank Reichart, senior director of product marketing at Fujitsu, agrees. “QoS is necessary for storage consolidation. There is no way around that,” he says.
“If you do nothing, the server that demands the most performance will get it – and if that's your business intelligence system, then response times for the more time-critical production system will suffer. It also impedes the service level agreement-driven organisation, and if you cannot set QoS, you punish the user who has simple applications.”
The business intelligence (BI) problem is a big one because more and more BI users want to run their queries against the production data, not least because of the cost of setting up a dedicated data warehouse and the time needed to copy data there.
It is not the only example, though. A heavy database query could also easily soak up all the I/O available, starving the web and email servers that are sharing the same storage. As for the impact of a VDI bootstorm on shared storage, anyone else trying to use that storage might as well take a coffee break because they are not going to get a lot done.
All of this is especially true for public-cloud operators, whose very existence and profitability are predicated on being able to share resources such as storage across multiple customers or tenants.
Increasingly this also applies to IT departments, as they too must service multiple internal clients – and typically for less and less money.
So what are the storage developers doing to deal with the issue and ensure equitable and appropriate access, without forcing you to solve the problem by expensively throwing storage at it?
The first thing is obviously to add QoS mechanisms, assigning priorities to applications. Stopping rogue applications or clients requires other approaches. One of the simpler ways to do it is to apply I/O rate limits to badly behaved applications so they don't grab everything available.
That can be too simple, advises Jesper Matthiesen, the CTO at Debriefing Software. “I don't consider bandwidth throttling to be a good thing because if the capacity is there you should use it,” he says.
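A common way to implement an I/O rate limit of the kind described above is a token bucket. The sketch below is a generic illustration, not any vendor's implementation; the class name `IopsLimiter` and its parameters are hypothetical.

```python
class IopsLimiter:
    """Token bucket: each I/O spends one token; tokens refill at max_iops per second."""

    def __init__(self, max_iops, burst=None):
        self.rate = float(max_iops)
        # Bucket size bounds how many I/Os can be issued back to back.
        self.capacity = float(burst if burst is not None else max_iops)
        self.tokens = self.capacity
        self.last = 0.0

    def allow(self, now):
        """Return True if one I/O may be issued at time `now` (in seconds)."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# A greedy client fires 50 requests at the same instant.
limiter = IopsLimiter(max_iops=100, burst=10)
admitted = sum(limiter.allow(now=0.0) for _ in range(50))
print(admitted)  # 10: the other 40 requests have to wait for tokens
```

Note that the cap applies whether or not anyone else is using the array, which is exactly Matthiesen's objection: the limited application is held back even when spare capacity is sitting idle.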
Another route, and the one chosen by a number of leading-edge developers such as Fujitsu, NetApp and NexGen (now part of Fusion-io), is to enforce minimum application data throughput levels rather than maximums.
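How the vendors implement this is not described here, but the general idea of a guaranteed floor can be sketched as follows. The function `allocate`, the application names and all the IOPS figures are hypothetical; the sketch simply assumes each application is first given its guaranteed minimum and any spare capacity is then shared out in proportion to unmet demand.

```python
def allocate(capacity, demand, minimum):
    """Split `capacity` IOPS among applications.

    Each application first receives min(demand, guaranteed minimum);
    whatever is left is shared in proportion to still-unmet demand.
    """
    grant = {app: min(demand[app], minimum.get(app, 0)) for app in demand}
    if sum(grant.values()) > capacity:
        raise ValueError("guaranteed minimums exceed array capacity")

    leftover = capacity - sum(grant.values())
    unmet = {app: demand[app] - grant[app] for app in demand}
    total_unmet = sum(unmet.values())
    if total_unmet > 0:
        # Never hand out more than an application actually asked for.
        scale = min(1.0, leftover / total_unmet)
        for app in grant:
            grant[app] += unmet[app] * scale
    return grant


# Hypothetical figures: a 10,000 IOPS array shared by three applications.
demand = {"sap": 4_000, "warehouse": 20_000, "mail": 500}
minimum = {"sap": 4_000, "warehouse": 1_000, "mail": 500}
for app, iops in allocate(10_000, demand, minimum).items():
    print(f"{app}: {iops:.0f} IOPS")
```

In this example the production system and the mail server get everything they ask for, and the data warehouse still uses all the remaining capacity rather than being throttled to a fixed ceiling.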