If Carlsberg did cloud outages, they'd probably look like ConnectWise's
Platform admits failover cluster fell over but they're really sorry and, here, have a credit note
Biz automation platform ConnectWise has issued a credit note to disgruntled customers caught up in last week's day-long outage that the firm blamed on a wobble in its "highly resilient" cloud infrastructure.
Systems went down at around 07:30 BST on Friday 3 May, preventing users from even logging in, and were not restored until 17:00 BST.
The company was forced to restore systems from backups, which meant many clients lost data. A lack of communication left users in the dark as to what the problem was or when it was likely to be rectified, reducing their ability to minimise the impact on their customers' systems.
Most readers who contacted us said ConnectWise staff did their best, but they were frustrated at how difficult it was to squeeze information out of the company. If they had been told to resort to backups, customers could have got on with (redoing) work as required.
ConnectWise fingered a "cloud outage that affected all EU Cloud Manage Partners", adding: "Our cloud infrastructure is highly resilient and this is an extremely unusual occurrence."
The company status page normally shows no problems or just one or two issues per month impacting only some customers. But March and April were dotted with challenges. ConnectWise was sold in February to a private equity investor.
In a more detailed statement about last week's outage, the company told us:
ConnectWise apologises for any disruption that the EU zone cloud outage on Friday, May 3 may have caused our partners, but we are happy to report that the major issues were remediated within a few hours. Here is a brief recap of what occurred and the steps we took to mitigate it:
ConnectWise experienced a failure of one of our critical systems at 7:30am BST Friday, and we were unable to failover to our secondary SQL cluster. We restored service from the daily backup that had been taken Thursday evening between 8:30 and 9:30pm BST. By 4:21pm BST Friday, all EU Cloud Services had been restored. However, based on this time gap, any tickets, time entries, invoices, etc entered between 8:30pm BST Thursday and 7:30am BST Friday would need to be recreated.
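For readers who want the sums behind that statement, here is a rough sketch of the two gaps it implies: how much work fell between the start of the backup and the failure, and how long service was down. The times are the ones ConnectWise quoted; the calendar year in the code is a placeholder assumption (the statement gives none) and does not affect the result, and the helper name `fmt` is ours.

```python
from datetime import datetime

# Times as quoted in ConnectWise's statement, all BST.
# The year is a placeholder assumption; only the day and clock times matter here.
backup_started = datetime(2019, 5, 2, 20, 30)  # Thursday, 8:30pm: start of the daily backup
failure = datetime(2019, 5, 3, 7, 30)          # Friday, 7:30am: critical system fails
restored = datetime(2019, 5, 3, 16, 21)        # Friday, 4:21pm: EU services restored

data_loss_window = failure - backup_started    # entries in this window must be recreated
downtime = restored - failure                  # time users were locked out


def fmt(delta):
    """Format a timedelta as hours and minutes."""
    minutes = int(delta.total_seconds() // 60)
    return f"{minutes // 60}h {minutes % 60:02d}m"


print("Data to recreate:", fmt(data_loss_window))  # 11h 00m
print("Downtime:", fmt(downtime))                  # 8h 51m
```

On the company's own figures, that is up to 11 hours of tickets, time entries and invoices to re-enter, and just under nine hours without service.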
The PR handler said work is still being done to "identify and eliminate all contributing factors to the failure and time to recover". She added: "We are exercising all due caution in what we announce for purposes of accuracy and to maintain the confidentiality of the internal architecture of our environment that would need to be included."
ConnectWise's infrastructure is hosted by AWS – but ConnectWise is responsible for running its environment including backups and maintenance.
Addressing customer complaints about radio silence during the outage, the automation biz said it is "taking steps to improve our communications, particularly with respect to ensuring timely notification to our EU partners when this type of business disruption occurs".
The PR rep at ConnectWise also wished to clarify that the outage was not linked to any planned maintenance mentioned in our original story. That was carried out on the ConnectWise Control product only and had no connection to the subsequent problems with ConnectWise Management in the EU. Likewise, the problems were not related to any issues the firm experienced in both March and April.
A ConnectWise customer contacted us this morning to say they had been offered a credit note due to the "inconvenience" caused by the unscheduled downtime. The letter from ConnectWise CEO Jason Magee stated the credit is a "gesture of thanks for your business". The monetary value was not specified. ®