Are you crying out for virtualised storage tiering?
Tiers before bedtime
You can virtualise pretty much any technology these days, so the thinking goes, and that includes storage. This means hiding what's going on behind a virtualisation layer - including tiering. But why tier?
Remember the old equation that you can have any two out of faster-cheaper-better but not all three? It's no secret that the faster your storage, the more you pay. Driven by the cost of enterprise storage and by data growth rates of around 50 percent, enterprises are adopting tiering as one part of the answer to the problem, two of the others being deduplication and thin provisioning.
The concept of tiering is essentially a simple one, and entails storing data on the type of storage that's most appropriate on a cost-benefit basis. In other words, the more valuable a piece of data is, the faster - and more expensive - the storage infrastructure on which it should be stored. The converse is also true.
So instead of storing everything on one storage medium, you put data to which the fastest access is required on the fastest-performing storage system, while data for which long access times are not a problem lives on the slowest, cheapest tier. In practice, this usually means that, for example, mission-critical databases live on high-speed 15k rpm SAS disks, or even SSDs, while end users' Windows shares sit on SATA disks. Long-term archives are held on tape (or MAID - massive arrays of idle disks), where it doesn't matter that access times can be measured in minutes or even hours.
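To make the placement rule concrete, here is a minimal sketch that picks the cheapest tier still fast enough for a given workload. The tier names follow the examples above, but the access times and relative costs are illustrative assumptions, not measured figures, and `cheapest_tier_for` is a hypothetical helper rather than any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    typical_access_seconds: float  # assumed, illustrative only
    relative_cost: int             # assumed ranking, 1 = cheapest


# Ordered fastest (and most expensive) first.
TIERS = [
    Tier("SSD / 15k rpm SAS", 0.005, 3),
    Tier("SATA", 0.02, 2),
    Tier("tape / MAID", 3600.0, 1),
]


def cheapest_tier_for(max_tolerable_access_seconds: float) -> Tier:
    """Return the cheapest tier whose typical access time is acceptable."""
    candidates = [
        t for t in TIERS
        if t.typical_access_seconds <= max_tolerable_access_seconds
    ]
    if not candidates:
        # Nothing is fast enough, so fall back to the fastest tier available.
        return TIERS[0]
    return min(candidates, key=lambda t: t.relative_cost)
```

Under these assumed numbers, a database that tolerates only 10 ms lands on the fastest tier, a file share that tolerates a second lands on SATA, and an archive that can wait a day lands on tape or MAID.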
The alternative is to leave things as they are, with all data on the same storage system – a single-tier configuration – which in most cases is not an option. Given today's data growth rates, it would mean simply adding more storage every couple of years and then having to re-organise it to fit the new capacity: a very expensive, disruptive and time-consuming exercise.
The question is how you get from here to there. It isn't cost-effective to migrate data manually, so those vendors who implement a form of automated tiering - and that's most of them - do so with policies. Compellent was the first out of the blocks with its data progression feature, which provides policy-driven, block-level automation. This means that it detects when a piece of data has been accessed and moves it up a tier. After a while, if it has not been accessed, the data is marked as aged and can be moved down a tier.
In theory, you set the policies for how aggressive you want this process to be, while the software figures out how to do it while leaving some disk space. In practice, it's not as simple as that, as you will still want to allocate some types of data as suitable for various tiers based on business-related or other criteria rather than just access time.
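The promote-on-access, demote-on-ageing behaviour described above, together with the business-rule override, can be sketched roughly as follows. This is not Compellent's implementation; the `Block` type, the three-tier layout, the `pinned_tier` field and the idle threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

TIER_COUNT = 3  # assumed layout: 0 = fastest, 2 = slowest


@dataclass
class Block:
    block_id: int
    tier: int
    last_access: float                 # timestamp in seconds
    pinned_tier: Optional[int] = None  # hypothetical business-rule override


def record_access(block: Block, now: float) -> None:
    """Note the access and move an unpinned block up one tier."""
    block.last_access = now
    if block.pinned_tier is None and block.tier > 0:
        block.tier -= 1


def age_blocks(blocks: List[Block], now: float, max_idle_seconds: float) -> None:
    """Move unpinned blocks that have sat idle too long down one tier."""
    for block in blocks:
        if block.pinned_tier is not None:
            # Business criteria win over access patterns.
            block.tier = block.pinned_tier
            continue
        idle = now - block.last_access
        if idle > max_idle_seconds and block.tier < TIER_COUNT - 1:
            block.tier += 1
```

Here `max_idle_seconds` stands in for the "how aggressive" policy knob: a smaller value pushes data down to cheaper tiers sooner, while pinned blocks stay where the business rules put them regardless of access pattern.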
Other systems work at the file or even the LUN level. Even if your storage system doesn't offer this feature, you can set up a tiering regime by adding a controller that virtualises the underlying storage, allowing you to allocate tier levels to pools of heterogeneous storage. While automating migration, this technique can't be described as a truly tiered system but it can help you move in that direction while filling an immediate need. IBM and FalconStor are among those who sell such controllers.
Storage consultant Marc Staimer, of Dragon Slayer Consulting, described data migration as "a very stressful, manually-intensive task, so tiering is only practical when it's policy-based."
So the key is to aim to automate tiering and migration as much as possible, which can involve a lot of upfront work to ensure that data is correctly categorised. ®
I think you're missing the point.
Tiering is about efficiency and cost - making sure that data is on the right type of disk at the right time. Access time is not a relevant metric for tiering; it is generally based on frequency of access, so that idle/stale data can sit on cheaper disk until it becomes relevant again.
Irrespective of the tier used, all disk should be resilient so data integrity is never compromised.
What you should be focusing on is complexity. As much as most vendors have some form of tiering technology, very few have an implementation that is easy to use and sufficiently granular to not compromise performance.
SSD (for many vendors) is a great marketing tool but does little to make tiering relevant or to improve performance. In a well-configured system very few applications will benefit from the reduced latency SSD offers. In fact, in high-end arrays, many cache algorithms prevent getting the full SSD benefit anyway. The real benefit of SSD is being able to get many IOPS from relatively fewer disks. Due to cost and size constraints this only works if you can tier data on a sub-LUN basis.
In my experience, very, very few products out there can deliver this in a meaningful and sustainable manner.
Why worry about policy tiering?
Surely most storage requirements can be covered by technologies like Oracle's Sun ZFS storage appliance. ZFS Read/Write (Memory+SSD) caching takes care of dealing with hot data with cheap backend SATA storage for all data. Only sustained high intensity write operations may not suit.