Part flash, all flash or dedicated flash?
Location, location, location
We're now seeing a proliferation of flash across the enterprise, from PCI-E solid state drives (SSDs) in servers to solid state arrays. The options seem overwhelming and the boundaries between the different storage array options are becoming blurred. Here's a quick overview of what is on the market today and how the options fit together.
There are two ways to implement a part-flash array today. The first is to use solid-state drives as part of a tiering strategy. SSDs were slow to be adopted into a tiering model because they were expensive and it wasn't easy to determine for a particular host which LUNs would be the hot ones and which would be relatively inactive. The initial solution was to partition up the host or assign it all flash LUNs, techniques that were both time-consuming and expensive.
Block-level tiering changed that; this feature – now available from vendors like IBM, EMC, Hitachi, HP/3PAR and Dell/Compellent – breaks down a LUN into blocks, placing each block on a tier of storage based on its activity. In this way, only the "hot", active blocks of data sit on SSD.
This is a more efficient solution than standard tiering but is not well optimised for variable workloads where the hot blocks change frequently; the time the array takes to rebalance may negate any performance benefit. (Note: without going into technical detail, solutions such as Compellent can circumvent some of these issues.)
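To make the idea concrete, here is a minimal Python sketch of block-level placement and of the lag problem. It is an illustration only, not any vendor's algorithm: the function names, the I/O traces and the two-block SSD tier are all assumptions made for the example.

```python
from collections import Counter


def place_blocks(access_counts, ssd_capacity_blocks):
    """Put the most-accessed blocks on SSD, the rest on HDD.

    access_counts maps block number -> accesses in the last sampling window.
    Ties are broken by block number so the result is deterministic.
    """
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    ssd = set(ranked[:ssd_capacity_blocks])
    hdd = set(ranked[ssd_capacity_blocks:])
    return ssd, hdd


def ssd_hit_ratio(trace, ssd_blocks):
    """Fraction of I/Os in the trace that land on the SSD tier."""
    hits = sum(1 for block in trace if block in ssd_blocks)
    return hits / len(trace)


# Hypothetical trace: block numbers touched within one LUN in one window.
trace = [7, 7, 7, 3, 3, 9, 7, 3, 1, 7]
ssd, hdd = place_blocks(Counter(trace), ssd_capacity_blocks=2)

# Hypothetical next window: the hot set has moved to different blocks.
next_trace = [1, 1, 9, 1, 9, 2, 1, 9, 1, 2]

ratio_same_window = ssd_hit_ratio(trace, ssd)
ratio_after_shift = ssd_hit_ratio(next_trace, ssd)
```

With placement based on the first window, blocks 7 and 3 sit on SSD and 80 per cent of that window's I/O is served from flash. Once the hot set moves, the same placement serves none of the new window from flash until the array rebalances, which is the lag described above.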
The second option is to use flash as a cache for writes. This is the solution used by arrays that make use of the ZFS file system from Oracle. Writes are committed to mirrored SSDs using a logging mechanism called the ZIL (ZFS Intent Log). They are then de-staged to main spinning disk asynchronously.
Reads are cached in memory, and that memory cache is extended by an SSD read cache called L2ARC (level 2 Adaptive Replacement Cache). Write performance on the ZFS solution is closely tied to the amount of cache available. This results in a tradeoff between the cost of cache and the level of write activity on the array.
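The log-then-de-stage pattern, and the tradeoff it creates, can be sketched in a few lines of Python. This is a toy model of the behaviour described above, not ZFS's actual implementation; the class name, its methods and the two-entry log are hypothetical.

```python
class LoggedWriteCache:
    """Toy write cache: acknowledge writes once logged, de-stage later."""

    def __init__(self, log_capacity):
        self.log_capacity = log_capacity
        self.log = []    # stands in for the mirrored SSD log
        self.disk = {}   # stands in for the spinning disk pool

    def write(self, block, data):
        """Return True if the write was logged, False if the log is full."""
        if len(self.log) >= self.log_capacity:
            return False  # the writer has to wait for a de-stage
        self.log.append((block, data))
        return True

    def destage(self):
        """Copy logged writes to disk in arrival order, then empty the log."""
        for block, data in self.log:
            self.disk[block] = data
        flushed = len(self.log)
        self.log.clear()
        return flushed


cache = LoggedWriteCache(log_capacity=2)
first = cache.write(1, "a")    # logged
second = cache.write(2, "b")   # logged, log is now full
third = cache.write(3, "c")    # rejected until the log is drained
flushed = cache.destage()      # two writes reach disk
retry = cache.write(3, "c")    # logged now that there is room
```

A larger log absorbs longer write bursts before writers stall, but costs more flash; that is the cache cost versus write activity tradeoff in miniature.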
This leads nicely into discussing solutions that potentially offer better performance but at a greater cost.
Today's solid-state drives have been designed to emulate, and to be plug compatible with, standard hard drives. It is therefore possible to create an array based purely on SSDs. Such an array would be extremely expensive (EMC has announced it will provide these devices to customers), but would it really guarantee better performance? In the 20 years since the first Symmetrix came to market, all storage array vendors have worked tirelessly to improve I/O performance in order to mitigate the shortcomings of the main component, the hard drive.
HDD I/O incurs both rotational latency and seek time delays, which can vary depending on the workload type. Techniques such as striping, mirroring and caching have all been developed and honed to squeeze the best out of a relatively slow storage medium. If we now add solid state into the mix, response times from these devices will be many times faster than those of traditional drives and it is likely that this performance will exceed the array's ability to cope with the I/O load.
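A rough back-of-the-envelope calculation shows why hard drives are the slow component. The Python sketch below uses the standard approximation that average rotational latency is half a revolution; the 15,000 RPM spindle speed and 3.5 ms average seek are illustrative assumptions, not measurements of any particular drive, and transfer time is ignored.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the sector to come round: half a revolution, in ms."""
    return 0.5 * 60_000 / rpm


def random_iops(rpm, avg_seek_ms):
    """Approximate random IOPS for one drive, ignoring data transfer time."""
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000 / service_time_ms


latency_ms = avg_rotational_latency_ms(15_000)         # 2.0 ms
drive_iops = random_iops(15_000, avg_seek_ms=3.5)      # roughly 180 IOPS
```

Under these assumptions a single drive manages only around 180 random I/Os per second, which is why arrays lean so heavily on striping and caching.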
Where HDD response time was once the bottleneck, we could see I/O paths, cache and processor performance becoming the limiting factors. The cost/benefit calculation becomes more complicated to perform, as the limit of performance isn't as easy to predict. Of course, if any vendors out there want to send me a solid state array to test, feel free and I'll answer the question for you. To my knowledge EMC is the only vendor certifying and offering full-flash arrays for its traditional VNX & VMAX ranges.
Not all flash arrays on the market are the same. We have seen a new breed of products from Violin Memory, Texas Memory Systems and Kaminario that use flash and DRAM to create storage arrays specially tuned to work with SSD. This means being able to cope with the throughput solid state drives can offer but also being aware of the difference in technology between solid state and traditional hard disks. For example, solid-state drives (and especially MLC drives) have a limited lifetime.
Techniques such as wear levelling are used to mitigate this and improve what is known as write endurance. Violin Memory takes this a step further and wear levels across the entire array, ensuring SSDs last as long as possible. There's no doubt that dedicated flash arrays will have the fastest performance. They have been purpose-built to make use of SSD and have a price point that is only justified by the extreme performance they offer.
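As a simplified illustration of wear levelling, the Python sketch below always directs the next write to the block with the fewest erases so far. Real flash controllers are considerably more sophisticated, and this is not a description of Violin Memory's method; the function name and the four-block pool are assumptions for the example.

```python
import heapq


def wear_levelled_writes(num_blocks, num_writes):
    """Spread writes by always choosing the least-erased block.

    Returns a dict mapping block number -> erase count.
    """
    # Min-heap of (erase_count, block); the least-worn block is always on top.
    heap = [(0, block) for block in range(num_blocks)]
    heapq.heapify(heap)
    for _ in range(num_writes):
        erases, block = heapq.heappop(heap)
        heapq.heappush(heap, (erases + 1, block))
    return {block: erases for erases, block in heap}


wear = wear_levelled_writes(num_blocks=4, num_writes=10)
spread = max(wear.values()) - min(wear.values())
```

Ten writes over four blocks leave no block more than one erase ahead of any other. Widening the pool from a single drive to the whole array applies the same principle across more flash.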
So what's the best approach? Well, as usual, "it depends". Understanding workload profiles and demands is a must; determining whether I/O latency is a restricting factor on application performance is tricky; but most important is determining whether reduced I/O times will result in significant improvements at the application layer. The chances are an SSD solution will provide improvements. The question is: is the increased cost justified? ®
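One way to frame the application-layer question is with an Amdahl's law style estimate: faster storage only helps the share of response time that is actually spent waiting on I/O. The Python sketch below is a hedged illustration; the I/O fractions and the tenfold storage speed-up are made-up inputs, not measurements.

```python
def application_speedup(io_fraction, storage_speedup):
    """Overall speed-up when only the I/O share of response time gets faster.

    io_fraction: share of application response time spent waiting on I/O (0..1).
    storage_speedup: how many times faster the new storage services that I/O.
    """
    return 1 / ((1 - io_fraction) + io_fraction / storage_speedup)


# Hypothetical workloads: one lightly I/O-bound, one heavily I/O-bound.
light = application_speedup(io_fraction=0.2, storage_speedup=10)   # about 1.22x
heavy = application_speedup(io_fraction=0.8, storage_speedup=10)   # about 3.57x
```

On these assumed figures, storage that is ten times faster yields only about a 1.2x gain for an application that spends a fifth of its time on I/O, but about 3.6x for one that spends four-fifths of its time there.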
Chris M Evans is a founding director of Langton Blue Ltd. He has over 22 years' experience in IT, mostly as an independent consultant to large organisations. Chris's blogged musings on storage and virtualisation can be found at www.thestoragearchitect.com.