From spinning rust to SSD: What to wear when things are looking worn
We've come a long way from mechanical drives
My proper storage friends have a term: “spinning rust”. It’s used to refer to traditional hard disks, which store data by magnetising cells on ferromagnetic layers on rotating discs (“platters”). “Ferro” = iron; iron oxide is the proper name for rust. Geddit?
The traditional hard disk dates back more than 60 years: in technology terms that’s properly old. Early hard disks had a capacity of sub-5MB; today you can put a 12TB disk in your desktop machine – that’s a couple of million times as much as the distant ancestor.
There are two basic problems with mechanical devices. First, they break: look at where your desktop PCs and servers generally fail and it’s the fans, the hard disks and the CD/DVD drives – mechanical equals friction equals wear equals failure. Second, they’re relatively slow: take a 7,200rpm drive and if you’re unlucky it’ll take about 8ms for the bit you need to whizz under the head – which is a lifetime in computer speeds.
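For anyone who wants to check that 8ms figure, here is a minimal sketch of the sums. It covers rotational delay only (head seek time is ignored), and the function names and the spindle speeds other than 7,200rpm are illustrative assumptions rather than anything from a vendor datasheet.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds (the unlucky case)."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the wanted sector is half a rotation away from the head."""
    return rotation_time_ms(rpm) / 2


if __name__ == "__main__":
    # 7,200rpm is the article's example; the other speeds are assumed for comparison.
    for rpm in (5_400, 7_200, 10_000, 15_000):
        print(
            f"{rpm:>6} rpm: worst case {rotation_time_ms(rpm):.2f} ms, "
            f"average {average_rotational_latency_ms(rpm):.2f} ms"
        )
```

At 7,200rpm a full turn takes a little over 8ms, so the average wait is roughly half that. Spinning faster helps, but only linearly.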
These aren’t exactly the kinds of speed you want in a business relying on its data for massively parallel, high-end applications requiring random access to data all over the storage. Now, you can be constructive about mitigating the delay – by having many platters and a head for each and then by having multiple disks. But you still have to wait for the spin.
Sixty years on, vendors are making technological advances in spinning drive technology but mechanical disks are still, relatively speaking, slow.
What do we mean by slow? Well, the typical access time for a solid-state disk (SSD) – now the not-so-new technology kid on the block – is generally sub-10µs. That’s about one four-hundredth of the old hard disk’s average rotational delay (roughly 4ms, half that 8ms worst case): which means that it’s the interface connecting the thing to the host machine that becomes the bottleneck.
So, there’s got to be a reason why we don’t all just buy SSD instead of traditional hard disk drives (HDDs).
One of the potential issues is that the individual storage cells of an SSD have a finite lifetime, which is measured in the number of write (program/erase) cycles each cell undergoes – reads don’t wear the cells in the same way. In reality, though, the vendors are increasingly able to offset this with clever controller technology that ensures the cells are used evenly and that duff cells are removed from use.
And there really is only one thing that puts us off: price. Quick comparison: a 1TB SATA HDD picked at random from a well-known online shop named after a river rings up the till at £40. A 1TB SSD? £200 – that’s five times the cost.
But even if you bite the bullet and embark on an SSD love-fest, your legacy HDD infrastructure that you’ve had for ages can exist quite happily next to it for now. But that’s not because it’s the best thing for the job: it’s because you’ve had it for ages and while it’s still good enough (after all, 8ms isn’t exactly sloth-like), stable enough and supported by the vendor there’s no point getting rid of it. And it’s bound to be good enough for a while yet, for modest-performance tasks – which means that even if you’re buying the fastest storage you can find for your new apps, you don’t just chuck the old stuff in the bin. After all, HDDs are inefficient if you’re using them for wildly varied data access that makes the read/write heads leap all over the disk: but the less random your access, the more the performance gap closes between HDD and SSD because you’re reducing the use of the mechanical (=slow) features of the disks.
Would you buy new HDD when it’s time to expand your storage world, though? Well, you should think quite hard before going for something other than SSD. Why? Because SSD is becoming more affordable, and because before long it’s going to stop being optional.
Manual systems are rapidly becoming a thing of the past. Everything’s moving to a technological platform (so-called “digitalisation” – someone please introduce me to whoever coined that term so I can kick them in the shins). Data-centric (and again please) operation is becoming the norm. AI and machine learning rely on vast volumes of data. Applications are demanding more processing power, more memory, and more storage with faster access times. There’s only one way the technology and applications will go, and that’s not in a direction where they can live with the storage subsystem getting slower.
And yes, it’s going to cost you money: but it’s increasingly affordable money. I have an SSD in my laptop because it gives me benefits over an HDD but is still within my price range. And that’s the point: SSD is more expensive, but it’s affordable.
Not only that but it’s super-fast, it’s reliable, the form factor is small (equals fewer cabinets to rent in the data centre) and the power consumption is minuscule relative to HDD. So, something that initially seems to be four or five times more expensive than its legacy equivalent suddenly nets off to being more like two or three times steeper instead. Which brings it further into the realms of affordability.
Well, in the average case it does, but not in every case. Going back to the point about cost, an acquaintance of mine sells storage kit for a living: many of his customers measure their purchases in petabytes, rather than the terabytes most of us are used to. At the relative price difference that I mentioned a few sentences ago, the cost increment becomes a five-figure sum, even if you do have a few more drive failures because of the mechanical nature of the kit: HDD is well and truly back on the menu in such cases if performance isn’t the dominant requirement.
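To put rough numbers on that, here is a back-of-envelope sketch. It takes the £40-per-terabyte HDD price quoted earlier and treats SSD as a simple multiple of it: two or three times for the netted-off figure, five times for the sticker price. Linear scaling with no volume discount is an assumption, as are the function and constant names, and real enterprise pricing will differ.

```python
HDD_GBP_PER_TB = 40   # the article's 1TB SATA HDD price
TB_PER_PB = 1_000     # decimal petabyte


def ssd_premium_gbp(capacity_tb: float, ssd_multiple: float,
                    hdd_gbp_per_tb: float = HDD_GBP_PER_TB) -> float:
    """Extra spend on SSD over HDD when SSD costs `ssd_multiple` times as much per TB."""
    return capacity_tb * hdd_gbp_per_tb * (ssd_multiple - 1)


if __name__ == "__main__":
    # Assumed purchase sizes, purely for illustration.
    for petabytes in (1, 5):
        for multiple in (2, 3, 5):
            gap = ssd_premium_gbp(petabytes * TB_PER_PB, multiple)
            print(f"{petabytes} PB at {multiple}x: £{gap:,.0f} extra for SSD")
```

The point of the sketch is only that a multiple which looks tolerable on one drive turns into serious money once the capacity has three more zeroes on it.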
SSD has, then, very much become the default choice – in my world, anyway. While there are instances where spinning rust is still the way to go, these are becoming fewer and farther between. And before very long it’ll become very hard to justify buying traditional disks, because the drawbacks in an environmentally aware world of data-hungry apps will outweigh the up-front cost savings.
That doesn’t mean, however, that we won’t still have HDD in our data centres for many years to come: tending toward buying SSD for the new stuff doesn’t mean we’re going to throw the HDD setup in the bin until we’ve had the value from it. It’s therefore worth assuming we’ll live in a hybrid world for some time to come, and ensuring that the way we design our apps and architect our infrastructure caters for this fact.
And hybrid’s a perfectly fine way to go, so long as you can manage it coherently. If your HDD infrastructure can deliver enough IOPS for the applications it serves, that’s good enough: and it means you can dedicate the pricier SSD to the hungry apps that eat storage for breakfast.
And if the management layer provides you with a decent – and fast – capability to shuffle file stores between one storage type and another then even better, because that’s one of the biggest pains with wrangling storage and it lets you deal with changing needs of individual apps.
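The placement idea here (leave the modest apps on HDD for as long as it can feed them, and keep SSD for the hungry ones) can be sketched as a simple greedy pass. This is a toy under stated assumptions, not how any particular storage management product does it: the `App` class, the `place_apps` function, the app names and all the IOPS figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    required_iops: int


def place_apps(apps: list[App], hdd_iops_budget: int) -> dict[str, str]:
    """Fill the HDD tier with the least demanding apps first; the rest go to SSD."""
    placement: dict[str, str] = {}
    remaining = hdd_iops_budget
    for app in sorted(apps, key=lambda a: a.required_iops):
        if app.required_iops <= remaining:
            placement[app.name] = "hdd"
            remaining -= app.required_iops
        else:
            placement[app.name] = "ssd"
    return placement


if __name__ == "__main__":
    # Hypothetical workloads and a hypothetical 3,000 IOPS HDD pool.
    apps = [
        App("archive", 200),
        App("file-share", 800),
        App("reporting", 1_500),
        App("oltp-db", 20_000),
        App("ml-training", 50_000),
    ]
    for name, tier in place_apps(apps, hdd_iops_budget=3_000).items():
        print(f"{name}: {tier}")
```

Re-running the placement as the numbers change is the easy bit; actually moving the file stores between tiers is where a decent management layer earns its keep.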
From a technology perspective, SSD is the future. Thanks to its footprint and penetration, however, HDD – and therefore hybrid data storage – are the present.
And that might not be a bad thing. Yes, SSD brings performance and reliability thanks to the fact it doesn’t contain moving parts, but – for now – you can still live with HDD. Not every workload requires AI-levels of read, write or retrieval thrown at it. Plus the tools exist to manage hybrid.
Just remember to keep a weather eye fixed firmly on that future – and have a plan to get there.