When the SSD came to storage land: How flashy upstarts got their break
Startups rush in where vendor giants fear to tread
Storagebod Of all the changes in the storage landscape over the past five years, the most dramatic is the arrival of flash-based storage devices.
Half a decade ago, we were talking about general purpose, multi-tier arrays, automated tiering and provisioning – all coming together in a single monolithic device.
The multi-protocol filer was going to become the dominant model; this was going to allow us to break down silos in the data centre and to simplify the estate.
Arrays were getting bigger, as were disks; I/O density was a real problem, and the slowest part of any system was generally the back-end storage.
And then came SSDs. While everyone knows that flash-based/memory-based arrays have been around for a long time, until 2008 or thereabouts, they were very much specialist devices and their manufacturers were catering to a niche market. But the arrival of solid-state disk (SSD) – flash in a familiar form factor at a slightly less eye-watering price – was a real game-changer.
If at first you don't succeed, flash, flash and flash again
EMC and others scrambled to make use of this technology: treating SSDs as a faster disk tier in the existing arrays was the order of the day. Automated storage tiering was the must-have feature for many array manufacturers, though few customers could afford to run all of their workloads on an entirely SSD-based infrastructure.
Yet if you talk to the early adopters of SSDs in these arrays, you will soon hear some horror stories: the legacy arrays were simply not architected to make best use of the SSDs in them. And, arguably, they still aren’t. While they’ll run faster than your 15k spinning rust tier, you are likely not getting the full value from them.
I think that all the legacy array manufacturers knew that there were going to be bottlenecks and problems; the different approaches that the vendors took almost point to this. Most vendors tried several approaches over the years – from using flash as a cache to utilising it simply as a faster disk. And soon many moved from using it as an extension of the read cache to using it as both a read and write cache.
Many of the vendors claimed they had the one true answer, but none of them did.
The rise of the upstarts: KerCHING
This gap in the market enabled a bunch of startups to burgeon; where confusion reigns, there is opportunity for disruption.
And the open-sourcing of ZFS soon created a massive opportunity for smaller startups, because the cost of entry into the market dropped. However, if you examine many of the startups' offerings, they are really a familiar architecture aimed at a different price point and market from those of the larger storage vendors.
And we have seen a veritable snow-storm of cash, both in the form of VC money and in acquisitions, as the traditional vendors realise that they simply cannot innovate quickly enough within their own confines.
While all this has been going on, there has been an incredible rise in the amount of data being captured and stored.
The more traditional architectures struggled: scale-up has its limits in many cases, and techniques from the HPC marketplace began to become mainstream. Scale-out architectures began to appear, first in the HPC market, then in the media space, and now, with the massive data demands of traditional enterprises, we see them across the board.
Throw in SSDs and scale-out together with virtualisation, and you have created a perfect opportunity for all in the storage market to come up with new ways of providing value to their customers.
The more things stay the same, the more the terminology changes
How do you get these newly siloed data-stores to work in a harmonious and easy-to-manage way? How do we meet the demands of businesses that are growing ever faster? Of course we invent a new acronym: "SDS" or "software defined storage".
Funnily enough, the whole SDS movement takes me right back to the beginning: many of my early blogs were focused on the awfulness of ECC as a tool to manage storage. Much of that was due to frustration that it was both truly awful and trying to do too much.
It needed to be simpler. The administration tools were getting better but the umbrella tools just seemed to collapse under their own weight. Getting information out of them was hard work. There was no real API and it was easier to interrogate the database directly.
But even then it struck me that it should have been simple to code something which sat on top of the various arrays (from all vendors), queried them and pulled back useful information. Most of them already had fully featured command-line interfaces; it should not have been beyond the vendors to code a layer that sat above the CLIs, took simple operations such as "allocate 10x10GB LUNs to host 'x'" and turned them into the appropriate array commands – no matter which array.
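For illustration only, here is a minimal Python sketch of the kind of layer described above. Everything in it is an assumption made for the example: the `LunRequest` type, the `vendor_a` and `vendor_b` array types, and the `arraycli-a` and `arraycli-b` command strings are all hypothetical and do not correspond to any real vendor's CLI. The sketch only builds command strings; it does not run them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LunRequest:
    """Vendor-neutral request: create `count` LUNs of `size_gb` each for `host`."""
    count: int
    size_gb: int
    host: str


# Hypothetical command templates. These are NOT real vendor CLI syntax;
# they stand in for whatever each array's command line actually expects.
def vendor_a_commands(req: LunRequest) -> list[str]:
    # Imaginary array that needs separate "create" and "map" steps per LUN.
    cmds = []
    for i in range(req.count):
        name = f"{req.host}_lun{i:02d}"
        cmds.append(f"arraycli-a create-lun --name {name} --size {req.size_gb}G")
        cmds.append(f"arraycli-a map-lun --name {name} --host {req.host}")
    return cmds


def vendor_b_commands(req: LunRequest) -> list[str]:
    # Imaginary array that creates and maps in one step, sized in megabytes.
    return [
        f"arraycli-b lun add host={req.host} size_mb={req.size_gb * 1024} id={i}"
        for i in range(req.count)
    ]


# One translator per array type; supporting a new array means adding an entry.
TRANSLATORS = {"vendor_a": vendor_a_commands, "vendor_b": vendor_b_commands}


def allocate(array_type: str, req: LunRequest) -> list[str]:
    """Translate one neutral request into commands for the named array type."""
    try:
        translate = TRANSLATORS[array_type]
    except KeyError:
        raise ValueError(f"no translator for array type {array_type!r}") from None
    return translate(req)


if __name__ == "__main__":
    # "allocate 10x10GB LUNs to host 'x'", expressed once for every array type.
    request = LunRequest(count=10, size_gb=10, host="x")
    for array_type in TRANSLATORS:
        print(array_type)
        for cmd in allocate(array_type, request):
            print("  " + cmd)
```

The point of the design is that the caller states the operation once, and the per-array differences (two steps versus one, gigabytes versus megabytes) stay inside the translators.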
I think this is the promise of SDS. I hope the next five years will see it developed, and that storage within a data centre becomes more standardised from a programmatic point of view.
I have hopes, but I’m sure we’ll see many of the vendors trying to push their own standards, and we’ll probably still be in a world of storage silos and ponds... not a unified Sea of Storage. ®