HPE blames solid state drive failure for outages at Australian Tax Office
'Rare issue under a set of circumstances that have never previously been encountered'
HPE has blamed a problem with solid state drives for its dual and very disruptive outages at the Australian Taxation Office (ATO).
A spokesperson for the company told The Register that “We believe the disruption started when a solid state drive used by major storage vendors failed. HPE and the drive vendor have determined that the condition was triggered by a rare issue under a set of circumstances that have never previously been encountered.”
The spokesperson added that “it would not be appropriate to further speculate on the architectural changes that might be made or additional redundancy solutions that might be deployed to improve the inherent reliability of the ATO systems.”
Neither HPE nor the tax office has previously mentioned a solid state drive problem, so let's parse things a bit here.
To your correspondent's mind the phrase “a solid state drive used by major storage vendors” could mean two things. Firstly, it could be HPE pointing out that it shops from reputable sources. Or it could be HPE trying to throw a vendor under a bus, or rather apportion blame, for the outages.
It's hard to say who provided the dodgy drive. What we can say is that Samsung is the world's leading enterprise SSD supplier: it trumpeted the fact in August 2016, claiming between 32 per cent and 45 per cent market share based on different analysts' estimates. And we know that Samsung almost certainly works with HPE, because the latter has advertised it can support 15TB SSDs and those mostly come from Samsung. We also know that solid state drives are in short supply and that the ATO stores more than a petabyte of data in its HPE kit. One more nugget: HPE storage boxen can run SSDs from 400GB up to the new 15TB monsters. Combine a shortage, a big user and the chance to use disks of different sizes, and it could be anyone's drive that went bung. The Register is approaching leading SSD suppliers to ask if they're aware of the ATO situation, or helping out in any way.
Lastly, let's consider the mention of “architectural changes that might be made or additional redundancy solutions that might be deployed to improve the inherent reliability of the ATO systems”. Might that suggest HPE's kit has missed a trick? Or that the ATO's rig could have been configured differently to guard against even exotic failures?
We'll know more in March, when the PwC report into the incident emerges. ®