Why solid-state disks are winning the argument
Count the reasons
Perhaps the most perplexing question I have been posed this year is: "Why should I use SSDs?"
On the face of it, it is a reasonable question. When it was put to me, however, I just sat there staring at the wall, trying to form a coherent thought. Where to begin?
As it was late at night, I decided that starting with a brief history of storage types and then diverging into a full-blown "why spinning rust was sent from hell to make me miserable" rant was probably the wrong approach.
So I flipped the question on its head. What is the business case for the traditional magnetic hard drive?
Magnetic media is cheaper per gigabyte than SSDs. Under the right conditions, some magnetic hard disks can do sequential writes faster than some SSDs.
If you accidentally delete everything from a traditional magnetic disk, you can probably recover it. Oh, and the write life on a magnetic disk is significantly better than that of an SSD.
As you can see, there aren't many reasons to buy traditional magnetic disk. It is cheaper than SSD – significantly so, even at the consumer level. Okey dokey, I'm down with that argument; but that's what tiered storage is all about.
I can buy LSI RAID controllers that make great hobo hybrid arrays for the smaller business, and once you are beyond what LSI can deliver, go knock on the door of Tintri, Tegile, Maxta, SimpliVity, Nutanix, or a hundred other storage providers that will help you moosh NAND and rust into a yummy storage sandwich. Tiering is not exactly rocket surgery.
I can think of exactly two very specific workloads in which sequential write speeds matter. The first is "single source, massive capture". This starts at recording high-def video at the low end and scales up to capturing all the data coming in from the Square Kilometre Array at the high end.
Massive amounts of data are coming in and need to be written, and nothing else (such as another user or application) will interfere with that drive during the writing process.
Here, I don't think SSDs are worth it because we don't have good, cheap, fast write once read many (Worm) SSDs yet. Worm is exactly what these scenarios call for – a sort of digital negative, but hopefully with a longer shelf life than film. For now, traditional magnetics are the storage for this type of task.
Archival backup is the second workload where sequential write matters. Using SSDs for backups is stupid for a number of reasons. For instance, they are miserable to extract data from if something goes awry with the filesystem.
That doesn't make traditional magnetic disks the winners here, however; they still have to fight this battle with tape.
For everything else I can conceive of, sequential anything means nothing. I have spent a fair amount of time analysing the disk I/O patterns of users as part of my virtual desktop infrastructure research and I have discovered that even modern desktops spend a lot of time cranking out random IOPS.
Users have dozens, if not hundreds, of applications on their desktops, all doing crappy little reads and writes in the background, and very rarely is anything read or written in great big strips.
Servers are worse. Everything is virtualised today and that means that by the time things hit the actual storage medium it doesn't matter if the original request from the server operating system was sequential.
The storage device is handling hundreds of simultaneous requests so it is all a bunch of random I/O in the end.
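To make the blending effect concrete, here is a minimal sketch in Python. It assumes an idealised model that is not from my measurements: every virtual machine writes perfectly sequential blocks into its own region of the disk, and the hypervisor services the machines in strict round-robin order. The names blended_stream and sequential_fraction, and all the numbers, are made up for illustration.

```python
def blended_stream(num_vms, requests_per_vm, region_size):
    """Interleave per-VM sequential block requests in round-robin order.

    Each VM (hypothetical) owns its own region of the disk and asks for
    block 0, 1, 2... of that region, so every VM is perfectly sequential
    from its own point of view.
    """
    stream = []
    for i in range(requests_per_vm):
        for vm in range(num_vms):
            stream.append(vm * region_size + i)
    return stream


def sequential_fraction(stream):
    """Share of requests that land on the block right after the previous one."""
    if len(stream) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(stream, stream[1:]) if cur == prev + 1)
    return hits / (len(stream) - 1)


# One machine on its own: the disk sees a purely sequential stream.
print(sequential_fraction(blended_stream(1, 1000, 1_000_000)))    # 1.0

# A hundred machines, each sequential: the disk sees no sequential I/O at all.
print(sequential_fraction(blended_stream(100, 1000, 1_000_000)))  # 0.0
```

Real hypervisors do not interleave this neatly, so real numbers will sit somewhere between the two extremes, but the direction of travel is the same: the more tenants share the device, the less sequential the stream it actually sees.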
The meaning of write life
This brings us to the write life debate. Let me start by talking a bit about failure rates. These are difficult to get a handle on across the whole industry because not all retailers release their numbers. Those that do release them sell more of some products than others, so sample sizes are not equal.
You can't, of course, trust any numbers that come from any of the manufacturers, so we all make do with what we can.
I know that I certainly have my blacklists of products and vendors. I imagine anyone who sells components or whitebox systems does as well.
Fortunately, Marc Prieur at Hardware.fr has been tracking and publishing the component return rates at a French retailer for some time. This gives us a good idea of what return rates look like in the real world.
Traditional magnetic disks and SSDs show very similar statistics for rate of return. A good model shows a rate of return of less than one per cent while a questionable product has a rate of return of about two per cent. Anything above four per cent is considered flat out appalling.
Before we move further on the SSD versus hard drive discussion, we need to throw OCZ right out the window. Let's look at how it has done on a company-wide level since Hardware.fr started tracking SSD returns. Article 3, 2.93 per cent; Article 4, 3.5 per cent; Article 5, 4.2 per cent; Article 6, 7.03 per cent; Article 7, 5.02 per cent; Article 8, 6.64 per cent; Article 9, 2.27 per cent; Article 10, 5.66 per cent.
OCZ had an individual product – the OCZ Octane SATA 2 128GB – with a 52 per cent return rate and numerous others above 40 per cent. Prieur calls OCZ's return rates "catastrophic" and I have to agree*.
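For anyone who wants to see the company-wide figures lined up against the rule of thumb, here is a rough sketch in Python. The thresholds are the ones quoted above (under one per cent is good, over four per cent is appalling); lumping everything from one to four per cent into "questionable" is my own simplification, and the function name classify is made up for the example.

```python
# Company-wide OCZ return rates (per cent), keyed by Hardware.fr article number.
OCZ_RETURN_RATES = {
    3: 2.93, 4: 3.5, 5: 4.2, 6: 7.03,
    7: 5.02, 8: 6.64, 9: 2.27, 10: 5.66,
}


def classify(rate_percent):
    """Label a return rate using the rule-of-thumb thresholds.

    Assumption: anything from one to four per cent is treated as
    "questionable", since the rule of thumb leaves that band undefined.
    """
    if rate_percent < 1.0:
        return "good"
    if rate_percent <= 4.0:
        return "questionable"
    return "appalling"


labels = {article: classify(rate) for article, rate in OCZ_RETURN_RATES.items()}
appalling = [article for article, label in labels.items() if label == "appalling"]

print(appalling)                                 # [5, 6, 7, 8, 10]
print("good" in labels.values())                 # False
```

Five of the eight reports land in the appalling band and none of them makes it into the good one.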
Now we can find similar examples in the hard drive world if we try hard enough**. The unreliability of the IBM Deskstar 60GXP and 75GXP hard drives earned that line the nickname Deathstar.
It is a nickname that hung around for more than a decade and even made it into the Wikipedia disambiguation page for Death Star.
On the whole, however, the statistics tend to confirm that after an initial rough patch, SSDs have about the same reliability as traditional magnetic disks. There are, of course, exceptions.
If you sit there and hammer a consumer SSD with high-transaction data loads all day long, you will burn it out well before the warranty expires.
Similarly, SSDs are a terrible place to send a bunch of log file writes; eleventy squillion crappy little sub-K writes will burn out the SSDs in no time.
You need to get the right SSD for the job, but that is true of hard disks as well. No matter what the cloud providers tell you, it is a really dumb idea to take the cheapest desktop hard drives you can find and slap them into your NAS or onto a RAID card.
A matter of faith
Consumer SSDs can handle far fewer writes than enterprise MLC datacentre-class SSDs and far, far fewer than SLC drives. By the same token, consumer magnetic disks often have appalling failure rates north of five per cent and crappy one-year warranty periods.
Ultimately, the warranty period is the real measurement of the product. The warranty is the length of time the manufacturer believes it can simply replace the few defective disks that crop up and maintain profitability for that line. It is an expression of confidence in the reliability of that model expressed in cold, hard accounting.
For the most part, I am finding it easier to get three- and even five-year warranties on SSDs than on hard drives today. Have you seen the warranty periods on the helium drives?
Unless your workload is very specifically single source, massive capture, then you should be running SSDs. Even if you are not running pure SSD, the case for tiered or hybrid storage makes itself.
SSDs are faster. They have way lower latency. They consume less power. They take up less space.
Most importantly, so long as you follow the instructions on the tin when selecting the right SSD for the job, there is absolutely no reason not to buy one. All the negatives that people throw at them have turned out to be nothing more than FUD. ®
*My own experience is a 240 per cent return rate with OCZ. The return rate is more than 100 per cent because not only did many of these SSDs fail, but the disks sent back as replacements failed too, and then the second, third and sometimes even fourth disks failed. I eventually pulled the whole line from every client and replaced them with those of other vendors at my expense.
**I won't mention the 300GB Velociraptors. I won't mention the… Damn it!
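For the sceptics wondering how the return rate in the first footnote can exceed 100 per cent, here is the arithmetic as a minimal Python sketch. The per-drive figures are entirely hypothetical and picked only so that they add up to the 240 per cent quoted; they are not my actual records. The point is that returns are counted against drives originally bought, so every failed replacement adds another return without adding another sale.

```python
def return_rate_percent(units_bought, units_returned):
    """Returns as a percentage of the units originally bought."""
    if units_bought <= 0:
        raise ValueError("units_bought must be positive")
    return 100.0 * units_returned / units_bought


# Hypothetical: ten drives bought. Each entry counts how many disks were
# sent back for that one purchase (the original plus any failed replacements).
returns_per_purchase = [3, 2, 4, 1, 2, 3, 2, 3, 2, 2]

rate = return_rate_percent(len(returns_per_purchase), sum(returns_per_purchase))
print(rate)  # 240.0
```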