Don't be shy, vendors: Let's see those gorgeous figures
None of that five-nines crap... your actual real-life downtime numbers
Storagebod One of the frustrations when dealing with vendors is actually getting real availability figures for their kit. You will mostly get generalisations, such as "it is designed to be 99.999 per cent available" or perhaps "99.9999 per cent available". But what do those figures really mean to you and how significant are they?
Well, 99.999 per cent available equates to a bit over five minutes of downtime a year, and 99.9999 per cent equates to a bit over 30 seconds. In the scheme of things, that sounds pretty good.
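For anyone who wants to check that arithmetic, or try it with a less flattering number of nines, here is a minimal sketch. The function name and the choice of a 365.25-day average year are my own assumptions for illustration, not anything a vendor publishes.

```python
# Assumption: an "average" year of 365.25 days, to account for leap years.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


for label, pct in [("five nines", 99.999), ("six nines", 99.9999)]:
    seconds = downtime_seconds_per_year(pct)
    print(f"{label} ({pct}%): {seconds:.1f} s/year, or {seconds / 60:.2f} min/year")
```

Five nines works out at roughly 5.26 minutes a year and six nines at roughly 31.6 seconds, which matches the figures above.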
However, these are design criteria and aims - what are the real world figures? Vendors, you will find, are very coy about this; in fact, every presentation I have had with regard to availability has been under very strict NDA, and sometimes not even notes are allowed to be taken. Presentations are never allowed to be taken away.
Yet, there’s a funny thing ... I’ve never known a presentation where the design criteria were not met or even significantly exceeded. So why are the vendors so coy about their figures? I have never been entirely sure; it may be that their "mid-range" arrays display very similar real world availability figures to their more "enterprise" arrays ... or it might be that once you have real world availability figures, you might start to ask some harder questions.
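Sample size: Raw availability figures are not especially useful if you don’t know the sample size. Availability figures are almost always quoted as an average across all the arrays sampled and, unless you have a really bad design, the more arrays there are in that sample, the more individual outages are diluted and the better the figures look.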
Sample characteristics: I’ve seen vendors do some really sneaky things when backed into a corner to provide figures. For example, they may provide figures for a specific model and software release. This is often done to hide a bad release. You should always try to ask for the figures for the entire life of a product - this will allow you to judge the quality of the code. If possible, ask for a breakdown on a month-by-month basis annotated with the code release schedule.
There are many tricks that vendors try to pull to hide causes of downtime and non-availability, but instead of focusing on the availability figures, as a customer it is sometimes better to ask different, more specific questions.
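Returning to the sample-size point, a quick sketch shows how much an average can hide. Everything here is hypothetical: the fleet sizes, the single eight-hour outage and the function name are made up for illustration and are not real vendor data.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year including leap days


def fleet_availability(outage_hours_per_array: list[float]) -> float:
    """Average availability (per cent) over one year across a fleet of arrays."""
    total_hours = HOURS_PER_YEAR * len(outage_hours_per_array)
    total_downtime = sum(outage_hours_per_array)
    return 100.0 * (1.0 - total_downtime / total_hours)


# Hypothetical fleets: one array suffers an 8-hour outage, the rest have none.
small_fleet = [8.0] + [0.0] * 9      # 10 arrays in the sample
large_fleet = [8.0] + [0.0] * 9999   # 10,000 arrays in the sample

print(f"10 arrays:     {fleet_availability(small_fleet):.5f}%")
print(f"10,000 arrays: {fleet_availability(large_fleet):.5f}%")
```

With ten arrays the average falls short of five nines; with ten thousand it comfortably beats six nines. One customer lost eight hours in both cases, which is why the sample size matters as much as the headline figure.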
What is the longest outage that you have suffered on one of your arrays? What was the root cause? How much data loss was sustained? Did the customer have to invoke disaster recovery or any recovery procedures? What is the average length of outage on an array that has gone down?
Do not believe a vendor when they tell you that they don’t have these figures and information close to hand. They do, and if they truly do not, it would mean a fair amount of negligence on their part when it comes to quality control and analytics. Surely they don’t just use all their Big Data capability to crunch marketing stats? Scrub that, they probably do.
Another nasty thing that vendors are in the habit of doing is forcing customers to not disclose to other customers that they have had issues and what they were. And of course we all comply and never discuss such things.
So, five minutes - that’s about long enough to ask some awkward questions. ®