LSI driver bug is breaking VSANs, endangering data
More VSAN hardware trouble for VMware
VMware says its VSAN virtual storage array is selling well, earning hardware-makers' attention and making plain the wisdom of the software-defined data centre.
It may well be, but VSAN is also having some teething problems. Back in July, VMware was forced to change its recommended VSAN system configurations because VSANs were choking on suggested setups.
Now comes news, thanks to VMware partner SynchroNet, that an LSI component used by several server-makers is causing some VSANs to fail.
The component in question is the LSI 2208, a RAID controller SynchroNet's John Nicholson says is used by the likes of Dell, HP and Cisco in their VSAN boxen. The 2208 is useful because it enables “pass-through” mode whereby the RAID controller is able to optimise use of spinning rust. Without a pass-through-enabled controller, VMware says “VSAN will not function efficiently” and “Performance on the VSAN datastore will not be maximized in this configuration.”
Nicholson says the bug isn't persistent, and instead “affects every host in the cluster approximately 40 days after its last reboot”. Once the hosts go down, they stay that way for about half an hour, during which time disks drop out and data appears to be lost.
Nicholson's post suggests the problem's been known for a while, and says he's been told by VMware that, for now, the best workaround is dropping to RAID 0.
Another SynchroNet staffer has posted to Reddit as “rkap”, suggesting “I know within VMware, that this has the highest level of visibility possible”.
As “rkap” points out, this isn't really VMware's problem. LSI's driver appears to be at fault and it's up to that vendor to set this to rights. LSI's site and social media emissions are silent on the topic. ®