It pays to fake it: Test your flash SAN with a good simulation

How to measure your storage performance

Flash Gordon

It is pretty obvious that storage systems vary. You could reply, with some justification: “No shit, Sherlock!”

What is less obvious and more useful to know, however, is how and why they vary and how the variation – not just between all-disk, hybrid and all-flash arrays but even between different arrays of the same class – can affect your applications.

Increasingly the answers to those questions are being found in simulations, whether of entire SANs or simply of existing workloads, designed to stress-test both the SAN and the attached storage systems.

And as all-flash and hybrid arrays, with their drastically different performance characteristics, enter the mainstream market, those questions – and the simulations – have become important to all of us, not just to high-end organisations.

Of course, power users of SANs have needed to load-test their networks for years now, if not decades. Just as with any other type of network, bottlenecks or hotspots can emerge, especially as the network load ramps up.

Some have also performance-profiled their storage systems to find out what workloads each vendor is good at and assess the benefits of tiering.

What is changing today, though, is that the variation can be even wider, thanks to the eager adoption of seductive new technologies, most notably flash memory and software-defined storage.

“Flash and SDS are the two stochastic shocks to the industry, and of those flash is the biggest,” says Len Rosenthal, marketing vice president at storage testing specialist Load Dynamix.

“In the true enterprise space, performance matters 90 per cent of the time. In storage systems they are all using similar Intel hardware, but the configuration – the amount of flash versus the amount of disk – will differ.

“Mostly though the difference comes down to the software architecture, for example how they handle block sizes. Some deal with fixed block sizes, some with variable block sizes.”

Big squeeze

However, Rosenthal says that what most differentiates arrays that include flash is their use of two data reduction technologies: data compression and data de-duplication. These matter because flash is the first storage generation with latency low enough to make it practicable to do data reduction on primary storage.

Vendors have seized upon this to cut the effective per-gigabyte cost of flash because it can make a gigabyte of flash worth maybe 4GB of spinning disk. It is the same argument that tape vendors deploy at the other end of the storage spectrum, using data compression to claim a nominal capacity for their tapes that is about 2.5 times their raw physical capacity.
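The arithmetic behind that claim is simple enough to sketch. The snippet below is illustrative only: the prices are made-up placeholders, and the function names are ours rather than anything a vendor publishes. The only figures taken from the text are the 4:1 flash ratio and the 2.5:1 tape ratio.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per GB of data actually stored, after compression/de-duplication."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


def nominal_capacity_gb(raw_capacity_gb: float, reduction_ratio: float) -> float:
    """The same sum run the other way: how much data fits in a given raw capacity."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_capacity_gb * reduction_ratio


# Hypothetical prices, for illustration only (not real quotes).
FLASH_RAW_COST = 2.00  # currency units per raw GB of flash (assumed)
DISK_RAW_COST = 0.50   # currency units per raw GB of disk (assumed)

flash_effective = effective_cost_per_gb(FLASH_RAW_COST, 4.0)  # 4:1, as in the text
disk_effective = effective_cost_per_gb(DISK_RAW_COST, 1.0)    # no reduction assumed
print(f"flash: {flash_effective:.2f}/GB effective, disk: {disk_effective:.2f}/GB")

# A tape with 1,000GB raw at the 2.5:1 ratio tape vendors quote.
print(f"tape nominal capacity: {nominal_capacity_gb(1000, 2.5):.0f}GB")
```

The catch is that the ratio is a property of your data, not of the array: a workload that reduces at 1.5:1 instead of 4:1 changes the answer considerably.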

“De-duplication and compression are very important for flash because they are what makes it affordable,” says Rosenthal.

“However, they also have a huge performance variation by workload because they are all proprietary technologies and they all work differently – different block sizes, different metadata mixes, different directory structures and so on.

“For example, in an all-flash or hybrid array with de-duplication and compression we have seen differences of 5x on identical workloads, based on different compression algorithms. They all do garbage collection and wear-levelling differently too, and there can be a big impact on virtualised applications from queue depth.

“The key message to users of performance-sensitive applications is that algorithms and software architectures have a huge impact.”

One option is to do what HP-3PAR does and enable or disable de-duplication volume by volume.

“There is no point turning it on if it's not appropriate for that workload,” says Craig Nunes, vice president of marketing for HP storage.

“The problem is that some arrays offer less granular control than others and just de-duplicate everything. People have got the idea that it's just part of the stack and it's all or nothing, but it doesn't have to be.”

He says this is also why HP built its Flash Advisor toolset into its systems: you run this against a workload and it characterises the applications, telling you what storage best suits each one.

“So it might say run this application on all-flash, this on a thin tier with flash caching, and this on a hybrid tier with adaptive optimisation,” he says, adding that you can then deploy those multiple storage classes all within the same array.

It is also good to remember that for many users, the general advantage of flash over all-disk should outweigh any variation between arrays.

“The benefits of flash aren't cost, they are being able to run enormous numbers of snapshots without the snapshot overhead. They are cost savings, they are productivity gains,” says Nunes.

“Don't just think about virtual servers, VDI and application acceleration. Think wider: are there organisational benefits? Can you offer services that you couldn't offer before?

“For example we have a healthcare customer that flipped its clinics to flash for its virtual desktops. The doctors need to log in and out as they move between rooms and have greatly increased their productivity just by being able to log in and out far faster.

“Another customer couldn't open new stores in different time zones because it interfered with its four to five hour inventory consolidation run overnight. Flash took that job to an hour and now the company can expand.”

Loaded question

Even if your applications are not going to hammer the storage, it is essential to understand what sort of load each will place on its storage system, notes Nick Slater, storage product manager at Lenovo UK.

“Flash storage has great performance benefits for random workloads over spinning disks, but it’s still relatively expensive and has no performance value when it comes to sequential write workloads,” he says.

He adds that this random/sequential divide is a large part of what is driving the uptake of hybrid arrays.

“For example, a storage system that is five to 10 per cent flash can handle and deliver about 80 per cent of the random I/O performance, leaving the spinning disks free to handle the sequential write workloads that are toxic to flash,” he says.

Rosenthal agrees. “There are some applications clearly flash is not right for, but most transaction-oriented applications will see an advantage,” he says.

“For example video streaming is a sequential workload, so spinning disk is generally fine. Transaction processing is very random in data access, so flash is good. Most applications are in the middle, so hybrid arrays can handle the majority of applications.

“The question is what is the right mix of flash and disk, and how do you determine the amount of flash needed? You always need a little headroom but how much?

“The easiest thing is always to over-provision, and if you have the budget to spend two or three times as much on flash, that's fine. But most people want to control their costs. Or an organisation might want all-flash because it's the newest thing but can't afford to replace its whole infrastructure in one go, so then you have to ask which applications to put on flash.”
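One rough way to approach the "how much flash?" question is to look at how concentrated your I/O is. The sketch below is a back-of-envelope estimate, not any vendor's sizing method: it assumes you already have per-extent access counts from monitoring, that all extents are the same size, and that past access patterns predict future ones. The function name and the sample counts are hypothetical.

```python
def flash_fraction_for_hit_rate(access_counts, target=0.80):
    """Smallest fraction of (equal-sized) extents that covers `target` of all I/O.

    access_counts: one I/O count per extent, in any order.
    target: share of total I/O the flash tier should absorb, 0 < target <= 1.
    """
    if not access_counts:
        raise ValueError("need at least one extent")
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    total = sum(access_counts)
    if total == 0:
        return 0.0  # nothing is being accessed, so nothing needs flash
    covered = 0
    # Place the hottest extents on flash first.
    for n, count in enumerate(sorted(access_counts, reverse=True), start=1):
        covered += count
        if covered >= target * total:
            return n / len(access_counts)
    return 1.0


# Ten extents with a skewed, made-up access pattern.
counts = [60, 25, 5, 3, 2, 2, 1, 1, 1, 0]
print(flash_fraction_for_hit_rate(counts))  # fraction of capacity to put on flash
```

Whatever number comes out, add headroom on top: the hot set moves around, and a tier sized exactly to last month's peak has nowhere to go.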

That means testing both your applications and your current and planned storage, suggests Gavin McLaughlin, strategy and communications vice president at storage array developer X-IO Technologies.

“We always say to customers, 'Go and test it in your real environment.' The best test ever will be your own data – if you have the facility to replicate your own environment, that's great,” he says.

“We are starting to see much better performance measurement tools out there, and we are seeing more resellers and system integrators investing in those to guide their customers.”

He adds that it is an opportunity for them to start adding value once again, after years of being pushed towards becoming low-margin distribution and delivery channels.

SANity testing

This kind of performance measurement is where Load Dynamix comes in, says Rosenthal. Its software can characterise and model an application or workload, allowing it to be played back at extreme scale to test how the SAN and storage systems respond to increasing loads, or to the periodic spikes in demand that are often found in the real world.

“The company was founded by people from the network performance testing world, and started out emulating storage protocols. We emulate Fibre Channel, NFS, SMB, CIFS, Amazon S3, OpenStack, object storage protocols and so on, so we could apply those techniques and do performance testing on storage networks,” he says.

“Then vendors started bringing us into customer accounts to prove their proposals. For example, we would characterise their workload and run it as a simulation.

“And then user organisations started buying it, big companies that are constantly changing their infrastructure around – AT&T has multiple devices of ours, for example. It was doing performance profiling on storage systems to see what workloads each vendor is good at, finding network hotspots, identifying the benefits of tiering and so on.”
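The general idea of replaying a characterised workload at increasing scale, with periodic spikes, can be sketched in a few lines. This is not how Load Dynamix's product works internally, which the article does not describe. It is a minimal illustration of a ramp-plus-spike load plan, with hypothetical function names and made-up latency figures, and it issues no I/O itself.

```python
def load_schedule(base_iops, steps, ramp_step=0.5, spike_every=0, spike_factor=2.0):
    """Target IOPS per test interval: a linear ramp with optional periodic spikes.

    base_iops:    IOPS measured for the real workload (the starting point).
    ramp_step:    each interval adds this fraction of base_iops to the target.
    spike_every:  every Nth interval is multiplied by spike_factor (0 = no spikes).
    """
    if base_iops <= 0 or steps <= 0:
        raise ValueError("base_iops and steps must be positive")
    schedule = []
    for i in range(steps):
        target = base_iops * (1 + i * ramp_step)
        if spike_every and (i + 1) % spike_every == 0:
            target *= spike_factor
        schedule.append(round(target))
    return schedule


def first_breach(schedule, latencies_ms, limit_ms):
    """Return the first target IOPS at which measured latency exceeded the limit."""
    for target, latency in zip(schedule, latencies_ms):
        if latency > limit_ms:
            return target
    return None  # the array stayed within the limit for the whole run


plan = load_schedule(1000, steps=6, ramp_step=0.5, spike_every=3)
print(plan)

# Latencies you would record per interval during the run (invented numbers here).
measured = [0.8, 0.9, 2.4, 1.1, 1.3, 9.5]
print(first_breach(plan, measured, limit_ms=5.0))
```

The point of the spikes is that an array which copes with a steady ramp can still fall over when demand doubles for one interval, which is closer to what production looks like.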

But even as cheaper flash arrays are broadening the market for this kind of technology, a performance testing tool such as Load Dynamix Enterprise still starts at around £60,000, and although the return on investment can be huge, there is often a shortage of budget for this kind of work.

Whereas a large organisation's server and network teams typically have funding allocated for measurement and testing tools, the storage team is too often the Cinderella of the family.

For those on a tight purchasing budget, a possible alternative simulation scheme is to create a few hundred virtual machines and then write Iometer scripts to throw loads at them. Another is, as McLaughlin suggests, to pull in consultants or a system integrator who have both the test tools and the necessary skills.

Essential knowledge

A third option is to do what the other teams have done and pitch the business benefits of investing in the tools and skills needed.

“One way for a business to drive down cost is to set up its own storage management practice, instead of relying on third-party advice,” suggests Rosenthal.

“As for the skillset needed, well, you need to understand the I/O profiles of your storage system and of the applications hitting it. That's the random/sequential mix, block sizes, those basic metrics.”
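Those basic metrics are not hard to pull out of an I/O trace once you have one. The sketch below assumes a simplified, hypothetical trace format (operation, byte offset, size) from a single stream. Real trace tools emit richer records, and interleaved streams need per-stream tracking before the sequential test means anything.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class IORecord(NamedTuple):
    op: str      # "R" for read, "W" for write
    offset: int  # starting byte offset on the volume
    size: int    # request size in bytes


def characterise(trace: Iterable[IORecord]) -> dict:
    """Summarise read/write mix, sequential share and block-size distribution.

    A request counts as sequential if it starts exactly where the previous
    request ended; everything else, including the first request, is random.
    """
    records = list(trace)
    if not records:
        raise ValueError("trace is empty")
    sequential = 0
    next_offset = None
    for rec in records:
        if rec.offset == next_offset:
            sequential += 1
        next_offset = rec.offset + rec.size
    reads = sum(1 for rec in records if rec.op == "R")
    n = len(records)
    return {
        "read_pct": 100 * reads / n,
        "sequential_pct": 100 * sequential / n,
        "block_sizes": Counter(rec.size for rec in records),
    }


# A tiny made-up trace: two back-to-back 4KB reads, then two scattered requests.
sample = [
    IORecord("R", 0, 4096),
    IORecord("R", 4096, 4096),
    IORecord("W", 1_000_000, 8192),
    IORecord("R", 8192, 4096),
]
profile = characterise(sample)
print(profile["read_pct"], profile["sequential_pct"], dict(profile["block_sizes"]))
```

A profile like this is what you take to the vendor's technical people: it turns "is your array fast?" into "is your array fast at 75 per cent reads, mostly random, mostly 4KB?"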

“Look under the covers, and ask the vendor's technical people where their algorithm works best and where it doesn't,” says Nunes.

“The more you can understand performance, characterise applications and understand the fit for your business, the better you can decide. You don't have to become a storage engineer; it's more like buying a car and knowing the questions to ask.”

Rosenthal adds: “You also need to understand the metrics that the vendors give you, whereas they haven’t mattered so much in the past. With the storage industry going through huge change at the moment, you need the skills to evaluate new products.

“SNIA webinars are a great source for this kind of data. The vendors also do useful webinars, particularly on new technologies, but take the vendors with a grain of salt: trust, but verify.

“There are also a number of storage conferences to go to, such as Powering the Cloud and Tech Unplugged, plus independent storage education seminars. And of course it is worth following the online storage community and the top storage bloggers.”

Whichever route you choose, Rosenthal concludes: “We are going through major change in the storage industry. It has been somewhat static for ten years, but a combination of new technologies is changing that. The business is shifting, with startups growing like crazy, even while budgets are largely flat.

“The move to SDS running on commodity storage means a radically different architecture from NAS and SAN. How do you know your applications will perform on Ceph, say?

“In my view, Ceph's not ready for production yet, though it will be in two years. So it might cut your storage bills by two-thirds, but which applications could realistically go on it?”

The simple answer is that characterisation and testing, for real or in a simulation, is the only way to be sure. ®

Biting the hand that feeds IT © 1998–2017