Commodity flash just as good as enterprise drives, Google finds

Flash failures put drives on a slippery slope, ad giant finds after crunching SSD stats

If you're loading up a heap of flash drives for your data centre, don't bother with “enterprise-class” SLC (single-level cell) technology, because cheaper MLC (multi-level cell) drives will do the job just as well.

However, the data centre biz needs new techniques to predict drive failures, because the uncorrectable bit error rate (UBER), the metric sysadmins watch to spot spinning rust on its way out, is useless for flash media.

Those are two conclusions in Google-backed research from the University of Toronto, Flash Reliability in Production: The Expected and the Unexpected, presented to last week's Usenix FAST 16 conference.

Bianca Schroeder worked with Googlers Raghav Lagisetty and Arif Merchant to slice and dice more than six years' worth of production data on “many millions of drive days, ten different drive models, different Flash technologies”, gathered from Google's vast fleet of computing devices.

Their paper (PDF), published last Friday, finds that while flash storage has a lower replacement rate in the field, conventional hard drives develop far fewer unrecoverable errors.

“More than 20 per cent of flash drives develop uncorrectable errors in a four year period”, they write, and up to 80 per cent will develop bad blocks in the same period. By comparison only 3.5 per cent of hard disk drives (HDDs) will develop bad sectors in 32 months, which on a linear scale would equate to 5.25 per cent over four years.
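For anyone wanting to check the sums, the four-year HDD figure is a straight-line scaling of the 32-month number. The sketch below is illustrative only: the function name is made up for this example, and it assumes bad sectors accumulate at a constant rate, which real drives may well not do.

```python
def extrapolate_linear(rate_pct, observed_months, target_months):
    """Scale a cumulative percentage linearly from one time window to another.

    Assumes a constant rate over time, which is a simplification.
    """
    return rate_pct * target_months / observed_months


# 3.5 per cent of HDDs with bad sectors after 32 months, scaled to 48 months
hdd_four_year = extrapolate_linear(3.5, 32, 48)
print(f"{hdd_four_year:.2f} per cent")  # 5.25 per cent
```

Even on that generous assumption, the HDD figure sits well below the up-to-80 per cent of flash drives that develop bad blocks over the same four years.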

The group say they're working on a failure-prediction model suited to flash drives, noting that previous errors and drive age are much better predictors of future failure than UBER.

Data centre sysadmins could also benefit by pretesting drives before they go into production, because “a drive with a large number of factory bad blocks has a higher chance of developing more bad blocks in the field, as well as certain types of errors”.

Schroeder's previous work with Google includes a 2009 paper in which she found that heat-stress tests on memory components don't offer much insight into how they perform in the field. Also last week, Google called for disk-makers to develop new classes of product, partly because HDDs remain more reliable than SSDs. ®
