IBM's global load balancer and reverse DNS degraded by domain transfer mess
Registrar put crucial domains in limbo. Is that good enough for a big cloud?
IBM's cloudy global load balancer and reverse DNS services have been impacted by a DNS mess inflicted on Big Blue by a domain name registrar.
In an email to customers, IBM says that on September 6th “During a bulk transfer of domain names between two domain registrar services, two domains (global-datacenter.com and global-datacenter.net) were inadvertently put in client hold state by the sending registrar, but were not transferred over to the receiving registrar.”
WHOIS searches suggest both domains use the name servers ns1.softlayer.net and ns2.softlayer.net.
The mess “caused those domains to become inaccessible. This in turn impacted the global load balancer (GSLB) service reliant on those domain names, as well as the Reverse DNS service.”
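For readers who want to check whether a domain of their own is in the same state, the status is visible in WHOIS output: a domain in client hold carries the EPP status code clientHold, which removes it from the DNS. The following is a minimal sketch, not anything IBM or the registrars have published. It assumes the common "Domain Status:" line format used by many registries, and the names `domain_statuses`, `is_on_hold` and `SAMPLE` are hypothetical, as is the sample record itself.

```python
import re

# Matches lines such as:
#   Domain Status: clientHold https://icann.org/epp#clientHold
# Assumption: the registry prints statuses in this "Domain Status:" format.
STATUS_LINE = re.compile(r"^\s*Domain Status:\s*(\S+)", re.IGNORECASE | re.MULTILINE)


def domain_statuses(whois_text):
    """Return the set of EPP status codes found in raw WHOIS text, lowercased."""
    return {match.group(1).lower() for match in STATUS_LINE.finditer(whois_text)}


def is_on_hold(whois_text):
    """True if the record carries clientHold or serverHold.

    Either status means the domain is not published in the DNS.
    """
    return bool(domain_statuses(whois_text) & {"clienthold", "serverhold"})


# Hypothetical sample record for illustration only. It is NOT real WHOIS
# output for the domains named in IBM's notice.
SAMPLE = """\
Domain Name: EXAMPLE.COM
Domain Status: clientHold https://icann.org/epp#clientHold
Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
Name Server: NS1.EXAMPLE.NET
"""

if __name__ == "__main__":
    print(sorted(domain_statuses(SAMPLE)))
    print(is_on_hold(SAMPLE))
```

The sketch only parses text it is handed, so the WHOIS lookup itself is left to whatever client is to hand. Registries that format their output differently would need a different pattern.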
WHOIS suggests the domains are managed by an outfit called MarkMonitor, which we have contacted for comment and to verify whether the company had any role in the situation. At the time of writing, the company has not replied to our email and its switchboard routed to voicemail.
IBM's notice to customers says “Cloud Infrastructure is working with the involved domain registrars to resolve this issue and remove the affected domains from client hold state as quickly as possible, which will in turn restore the GSLB and Reverse DNS services completely. Additionally, intermediary corrective actions have been taken to partially restore the Reverse DNS service functionality prior to the domain registrars correcting the situation.”
Suppliers can be a weak link in any chain. However, an important element of any cloud provider's value proposition is extraordinary levels of paranoia regarding resiliency, expressed as multiple redundant layers of infrastructure to keep services up during extreme events. That IBM's cloud can be degraded by a supplier's error is therefore unusual. On top of incidents like switching off TLS 1.0 support without giving customers enough notice to cope with the change, which The Register has learned broke customers so swiftly that the protocol was restored in half an hour, it paints an interesting picture of Big Blue's cloudy capabilities.
IBM is not alone in having trouble with its cloud. The Register has found at least five occasions on which Google broke its own cloud with an imperfect update. Azure suffered a similar problem earlier this week, with cloudy Active Directory offering degraded performance. And of course AWS broke its own S3 cloud storage service and pulled down a good chunk of the web for hours.
But those three clouds mostly shrugged off the outages. The stakes may be higher for IBM, as analyst firm Gartner recently rated IBM's cloud “missing many cloud IaaS capabilities required by midmarket and enterprise customers” and at risk of missing deadlines to add features to bring it to parity with competitors' offerings. ®