Why should storage arrays manage server flash?

One butt to kick if it all goes, erm, pear-shaped

EMC's Project Lightning has a storage array managing flash cache in servers networked to the storage array. Dell is thinking along similar lines. This is supposed to provide better storage service to the servers. Really? How?

An enterprise infrastructure architect working in the insurance area got in touch to give me a use case in which it makes perfect sense.

His scenario envisages an ESXi server farm, masses of virtual machines (VMs), and a NetApp filer with cache in its controller (PAM) running NetApp's A-SIS deduplication. This is what he wrote:

ESXi Farm and a filer

"Take an ESXi farm which is connected to a NetApp filer. The volume on the filer has an A-SIS job run on it every night which consolidates the identical blocks down to a single instance. This pays big dividends as the space utilisation doesn't grow linearly with the number of VMs you provision.

"You can deploy PAM read cache in the filers and cache the actual blocks on disk rather than the dehydrated blocks served up, so yes – you get a high cache hit rate, meaning also you don't need to grow the number of physical spindles for performance with the number of VMs you provision.

"The problems lie in the scalability of the filer heads and the latency incurred by the network stack. The drawback is that the filers need lots of CPU to service the number of requests coming from the hundreds of VMs residing in the same blocks. This limits the number of VMs you can provision on 31xx and 60xx filers to around 300-500 before the CPU in the filers gets really hot, and limits the performance of the VMs themselves due to the 5-10ms latency of a typical storage request incurred by the network stack.

"You can upgrade your filers to the latest and greatest and spend life-changing sums of money – bang – CPU problem solved for another three years, until the same issue occurs again and you fork out another life-changing sum because the filers can't keep up with the growth of your sprawling VM estate. This doesn't fix the network stack latency, however."

Put PAM contents into server flash cache

"If you could take the contents of the PAM card in the filer and replicate this into flash cache in the ESX host, it serves two purposes. First, it reduces the network stack latency back to the filers, which improves VM performance and subsequent consolidation ratios on each ESXi host. Second, it increases the length of time before the next big spend cycle when you need to upgrade your storage and spend another huge sum to fix the CPU issue. Some people need shared storage, but want the performance of local SSD. It is also much easier to sign off a couple of grand per ESXi host as you purchase them than to spend huge sums of money every few years on new storage controllers."

Server agents for rehydration

At this point I thought that rehydration might need agent software in this use case, with my thinking going like this:

1) A deduplicated file equals unique data segments plus pointers to master copies of duplicated data segments.

2) In this use case example, we have the master copies in the ESXi server's flash cache and the unique data segments in the storage array, so the file has parts in two locations. The I/O request for a VM then involves the storage array-held data and the server flash cache data being combined as the deduplicated file is rehydrated. Where is the rehydration done?

3) I'm guessing it is executed by the storage array, but wouldn't it need an agent in the ESXi server to combine the cached data with the data coming from the array to build the rehydrated file?
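
The three points above can be made concrete with a rough sketch. It is an illustration only and not how NetApp's A-SIS works internally: the 4KB block size, the SHA-256 fingerprint and the function names dedupe and rehydrate are all assumptions made for the example. Each file is reduced to an ordered list of pointers, each shared block is stored once, and rehydration follows the pointers to rebuild the original bytes.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def dedupe(files):
    """Store each distinct block once; describe each file as a list of pointers."""
    store = {}    # fingerprint -> block bytes (the single master copy)
    layouts = {}  # file name -> ordered list of fingerprints (the pointers)
    for name, data in files.items():
        refs = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            store.setdefault(fp, block)  # keep only the first copy seen
            refs.append(fp)
        layouts[name] = refs
    return store, layouts


def rehydrate(name, store, layouts):
    """Rebuild a file by following its pointers back to the stored blocks."""
    return b"".join(store[fp] for fp in layouts[name])


# Ten hypothetical VM images: eight shared OS blocks plus one unique block each.
os_image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
volumes = {f"vm{i}": os_image + bytes([100 + i]) * BLOCK_SIZE for i in range(10)}

store, layouts = dedupe(volumes)
logical_blocks = sum(len(refs) for refs in layouts.values())
physical_blocks = len(store)
print(f"logical blocks: {logical_blocks}, physical blocks: {physical_blocks}")
```

In this toy data set the shared OS blocks are stored once however many VMs point at them, which is why space does not grow linearly with the VM count. The open question in point 3 is simply where rehydrate runs when the master copies sit in the host and the rest sits in the array.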

Our correspondent said: "Yes, the cache in the hosts would require some intelligence or an agent to do the rehydration – ie, a table which says, when I access this reference, actually go and get it from this other reference. If I don't have it, get it from the shared storage and cache it for next time. Some component of the NetApp filer's caching algorithm [would] need to exist in the host."
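
A minimal sketch of the table our correspondent describes might look like the following. Everything named here is hypothetical: the HostFlashCache class, the pointer table layout and the fetch_from_array callback stand in for whatever the array vendor's agent would actually do, and the least-recently-used eviction is an assumption, since he does not say how the cache would decide what to drop.

```python
from collections import OrderedDict


class HostFlashCache:
    """Hypothetical host-side read cache that resolves dedupe pointers."""

    def __init__(self, capacity, pointer_table, fetch_from_array):
        self.capacity = capacity                  # max number of cached blocks
        self.pointer_table = pointer_table        # logical ref -> master ref
        self.fetch_from_array = fetch_from_array  # called on a cache miss
        self.blocks = OrderedDict()               # master ref -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, logical_ref):
        # "When I access this reference, actually go and get it from this other one."
        master_ref = self.pointer_table.get(logical_ref, logical_ref)
        if master_ref in self.blocks:
            self.blocks.move_to_end(master_ref)   # mark as recently used
            self.hits += 1
            return self.blocks[master_ref]
        # "If I don't have it, get it from the shared storage and cache it."
        self.misses += 1
        data = self.fetch_from_array(master_ref)
        self.blocks[master_ref] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)       # evict least recently used (assumed policy)
        return data


# Two hypothetical VMs whose first block is the same deduplicated OS block.
array = {"os-block-1": b"kernel", "vm2-unique": b"vm2 data"}
pointers = {"vm1:0": "os-block-1", "vm2:0": "os-block-1", "vm2:1": "vm2-unique"}
cache = HostFlashCache(capacity=2, pointer_table=pointers,
                       fetch_from_array=array.__getitem__)

first = cache.read("vm1:0")   # miss: fetched from the array
second = cache.read("vm2:0")  # hit: different VM, same master block
third = cache.read("vm2:1")   # miss: unique to vm2
print(cache.hits, cache.misses)
```

Because the cache is keyed on the master reference rather than the per-VM reference, a block fetched for one VM is a hit for every other VM that points at it. The sketch ignores writes and the problem of keeping the pointer table in step with the array's nightly dedupe run, which is presumably where the single-supplier argument comes in.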

He emphasises that the storage array and flash cache should come from a single supplier, so you should have one throat to choke if problems occur.

It doesn't have to be a NetApp filer. This use case will work in principle with an EMC VNX array controlling server-located flash cache: that is Project Lightning as EMC describes it.
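
Whichever supplier provides it, the latency argument comes down to the hit rate in the host's flash. A back-of-the-envelope estimate is a weighted average of the two paths. The 7.5ms figure is the midpoint of our correspondent's 5-10ms range; the 0.2ms local flash latency and the hit rates are assumptions picked purely for illustration, not measurements.

```python
def effective_latency_ms(hit_rate, local_ms, remote_ms):
    """Average read latency when a fraction hit_rate of reads is served locally."""
    return hit_rate * local_ms + (1.0 - hit_rate) * remote_ms


REMOTE_MS = 7.5  # midpoint of the 5-10ms network-stack figure quoted above
LOCAL_MS = 0.2   # assumed host flash read latency, for illustration only

for hit_rate in (0.0, 0.5, 0.9):
    latency = effective_latency_ms(hit_rate, LOCAL_MS, REMOTE_MS)
    print(f"hit rate {hit_rate:.0%}: about {latency:.2f} ms per read")
```

The same arithmetic also shows the load taken off the filer: every read served from host flash is a request its CPU never sees.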

A second use case

Our correspondent devised a second use case:

"Another application is something like a Data Warehouse (DWH) system. This requires vast amounts of disk performance and very low latency. I know of a DWH team that use servers with locally attached SAS disk as this gives them a solution that cannot be affected by other tasks, meaning they get predictable (but not optimal) results.

"The system copies an entire database dump from a very large line of business system every day and runs huge amounts of number crunching on it.

"The speed at which this system can complete the process has a direct impact on the profitability and competitiveness of their organisation. The trade-off of the current DWH model is that they spend a lot of time copying working set data around over the network, when they could take advantage of snapshot technology in shared storage to get data where it needs to be instantly. They don't want to incur the extra latency and unpredictable performance of shared storage, however.

"Having local flash cache could be the way forward if it could keep a large proportion of the working set in cache and smooth out the load on the shared storage. They would not use deduplication as in the ESX case above as it would be of little benefit, but they would only sign up to this if it was 100 per cent supported by one vendor with one butt to kick if it all went pear-shaped."

All this makes good sense to me. Does it to you? Can you see the sense in storage arrays managing server-located flash cache and loading them with data? ®
