Dimension Data cloud goes TITSUP down under... after EMC storage fail
Replacement hardware needed as Australian cloud flops for 48-plus hours
UPDATE Dimension Data's Australian cloud has been down for over 24 hours after EMC kit failed.
The outage is detailed on this status page, which records problems with the company's “AU 1” region dating back to the morning of July 2nd. The company says the service has since come back in “a degraded state” and that “Services are recovering”.
Between 1:34PM and 6:14PM Sydney time, the status page offered four updates reading “During the recovery process we had another incident with the Storage Processor. The Vendor is working to bring them back online. All recovery processes are halted and client servers are offline.”
Dimension Data sent The Reg the following statement:
"Dimension Data has confirmed that one of its global data centres has suffered a partial outage.
As a result of a failure in the company’s Sydney-based cloud data centre on 2 July 2014, a small number of Dimension Data clients have been adversely affected. Dimension Data is in regular contact with each of the affected clients to keep them updated on the situation.
The incident occurred in a section of the cloud data centre’s core storage infrastructure, and requires significant remediation action, including the replacement of hardware and restoration of services for all affected Dimension Data clients.
Dimension Data has rectified the root cause of the failure, and the company’s major incident team is continuing to restore services during the course of today, 4 July 2014. Complete service restoration is expected in the early hours of Saturday morning AEST, 5 July 2014."
Dimension Data has in the past named EMC as the exclusive supplier of tiered storage for its cloud, in this document. Other documents describing its public compute-as-a-service offering, such as this PDF, name EMC too.
EMC has since confirmed it is the source of the problem and sent us the following statement:
“EMC is providing the highest levels of support to Dimension Data to restore services to our customers. The EMC and Dimension Data teams are working tirelessly to ensure the affected customers receive full service restoration.”
Those efforts seem to be delivering the goods. As of 9:58PM there was just one server down, but restoration of virtual machines appears to be ongoing.
As of 2:39AM on July 5th, Dimension Data advises the service is again fully operational. ®
Dimension Data has sent another statement, as follows:
Dimension Data has announced that the partial outage which occurred in its Sydney Managed Cloud Platform (MCP) has been restored, and all servers that were impacted by the storage array failure are now recovered. All client services are fully operational.
Dimension Data said a full post-incident review is already underway.