How a chunk of the web disappeared this week: GlobalSign's global HTTPS snafu explained
Buggy code pulled plug on wrong certificates, basically
GlobalSign has performed a postmortem examination on how, as one of the world's root certificate authorities, it managed to break a chunk of the web.
The New Hampshire, US-based biz has to date sold 2.5 million SSL/TLS certificates to websites around the world. This week, it inadvertently smashed its own chain of trust: it effectively made its customers' certs appear untrustworthy in the eyes of web browsers and apps globally.
This prevented many people from being able to access secure websites and online services large and small, from Wikipedia and the Financial Times to GlobalSign's own servers.
The accidental cockup hasn't affected everyone: if your computer, phone or some other gadget was among the unlucky ones to fetch a dodgy revocation list from GlobalSign's network on Thursday, October 13, your browser will stop you from accessing legit HTTPS websites. That's because your browser has been told GlobalSign-issued encryption certificates are no longer valid.
Tech-savvy netizens hit by the blunder can attempt to clear their revocation list caches and fetch a correction from GlobalSign to fix the problem. Less savvy folks affected are left with baffling browser error messages. If you're not seeing any complaints on your screen when browsing the web, your computer or phone may have picked up the correction by now, you may be visiting websites that do not use GlobalSign certificates, or your browser managed to dodge the small crisis entirely.
Initially, GlobalSign blamed a programming flaw in web browsers for the cockup – then it realized the problem was within its own systems.
We're told it all kicked off after GlobalSign published a Certificate Revocation List, signed by the organization's Root CA R2 cert, that revoked a cross-certificate and an old subordinate CA that was being discontinued – the subordinate was used to issue outdated SHA-1 Extended Validation SSL certificates.
A cross-certificate can, generally speaking, be used to improve trust and reliability: in GlobalSign's case, the cross-cert allows browsers and apps to verify the integrity of a GlobalSign-issued HTTPS certificate with either GlobalSign's Root CA R1 or Root CA R2 certificate.
The cross-cert therefore creates two possible paths of trust for software to walk along when checking the validity of a website's HTTPS cert. As long as the browser trusts either GlobalSign's Root CA R1 or R2, it will trust the site's HTTPS cert at the end of the chain.
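To make the two paths concrete, here is a minimal Python sketch of chain building. It is a toy model, not how any real browser validates certificates: each cert is reduced to a (subject, issuer) pair, there are no signatures or expiry checks, and the names `site.example`, `Intermediate CA` and `find_paths` are made up for illustration.

```python
# Toy model: each certificate is just (subject, issuer).
# All names here are illustrative, not GlobalSign's real certificate names.
CERTS = [
    ("site.example", "Intermediate CA"),  # the website's HTTPS cert
    ("Intermediate CA", "Root CA R1"),    # intermediate issued under R1
    ("Root CA R1", "Root CA R1"),         # self-signed R1 root
    ("Root CA R1", "Root CA R2"),         # cross-cert: R1 as subject, issued by R2
    ("Root CA R2", "Root CA R2"),         # self-signed R2 root
]


def find_paths(subject, trusted_roots, certs=CERTS, seen=()):
    """Return every chain of names leading from subject to a trusted root."""
    if subject in trusted_roots:
        return [[subject]]
    paths = []
    for subj, issuer in certs:
        # Skip self-signed certs and anything that would loop back on itself.
        if subj == subject and issuer != subj and issuer not in seen:
            for tail in find_paths(issuer, trusted_roots, certs, seen + (subject,)):
                paths.append([subject] + tail)
    return paths


# A client that only trusts R1 stops at R1; one that only trusts R2
# reaches R2 by walking through the cross-cert.
via_r1 = find_paths("site.example", {"Root CA R1"})
via_r2 = find_paths("site.example", {"Root CA R2"})
```

In this model `via_r1` ends at Root CA R1 and `via_r2` continues one hop further to Root CA R2, while a client that trusts neither root gets no path at all.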
What GlobalSign was trying to revoke was just the cross-certificate and an old subordinate CA.
The cross-cert was issued by GlobalSign's Root CA R2 with the Root CA R1 as the subject. On October 13, six days after the revocation list was published, the org updated a database called the delegated Online Certificate Status Protocol (OCSP) responder database. This feeds information to a collection of systems called the delegated revocation responders.
Unexpectedly, these responders were confused by the revocation of the cross-certificate, and thought GlobalSign was trying to revoke the intermediate certificates linked to the Root CA R1 certificate. These intermediates are used to issue SSL/TLS certificates sold to websites and businesses: by revoking the intermediates, all those customer certs become untrusted – they are effectively null and void. Browsers can no longer verify the identity of secure sites using the GlobalSign certificates, because the chain of trust has been broken, and thus refuse to access the websites.
"Delegated revocation responders incorrectly determined that all Root CA R1 intermediates were 'bad' due to the cross-certificate being revoked by Root CA R2 as the cross had the same Public Key and Subject Name details with a more recent date," explained GlobalSign in an incident report drawn up in the aftermath.
The biz rolled back the changes to the revocation list at its end, but it was too late: the changes were by then propagating through its content-distribution network (CDN) and into applications that were contacting GlobalSign for the latest revocation lists. Some browsers and programs received the correct data while unfortunate apps were given the dodgy list and duly canned the GlobalSign intermediates.
"This situation was rolled back and load balancers and the CDNs were purged but some users retained the ‘bad’ response, while many more still had the previous ‘good’ responses from previous interactions with GMO GlobalSign SSL enabled web sites," the biz explained.
So, it turns out software running on the revocation responders saw that the cross-cert's public key and subject name matched those of the Root CA R1 certificate, and that the cross-cert had a fresher valid-from date. It figured that, because the cross-cert was being killed off, all of R1's subordinate certificates should be axed too. In GlobalSign's words:
"GMO GlobalSign uses a third party security accredited load balanced OCSP responder system for provision of delegated responses across the product range and outwards to our community of customers and their relying parties. However, and unfortunately for our ecosystem and our stakeholders and their customers, the logic within the responder code base determined that the revocation of the Cross Certificate, identified by its Public Key and Subject Name in a lookup table, was effectively an instruction to also identify all other subordinate certificate authorities including DomainSSL and AlphaSSL as ‘bad’.
"The logic took the more recent Not-Before Date (Valid From) of the cross certificate as a later assertion than the original Root CA R1 and therefore determined this as an authoritative instruction to mark all Root CA R1 issued subordinate certificates as ‘bad’."
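The vendor's code has not been published, so the following Python is only a guess at the shape of the bug based on GlobalSign's description: a lookup table keyed on subject name and public key, where the entry with the most recent Not-Before date wins. The class, the function names, the key string and the dates are all placeholders invented for this sketch, and `careful_status` is one illustrative alternative rule, not GlobalSign's actual fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaCert:
    label: str
    subject: str
    public_key: str
    issuer: str
    not_before: str  # ISO date; the values below are placeholders
    revoked: bool = False


# The R1 root and the cross-cert share a subject name and public key.
R1_ROOT = CaCert("R1 root", "Root CA R1", "key-r1", "Root CA R1", "2000-01-01")
CROSS_CERT = CaCert("cross-cert", "Root CA R1", "key-r1", "Root CA R2",
                    "2010-01-01", revoked=True)
CA_CERTS = [R1_ROOT, CROSS_CERT]

# Intermediates named in the report, mapped to their issuer's (subject, key).
INTERMEDIATES = {
    "DomainSSL": ("Root CA R1", "key-r1"),
    "AlphaSSL": ("Root CA R1", "key-r1"),
}


def buggy_status(name, ca_certs=CA_CERTS):
    """Hypothesised flaw: the newest cert per (subject, key) is authoritative."""
    table = {}
    for cert in ca_certs:
        key = (cert.subject, cert.public_key)
        if key not in table or cert.not_before > table[key].not_before:
            table[key] = cert  # the fresher cross-cert overwrites the root
    issuer = table[INTERMEDIATES[name]]
    return "bad" if issuer.revoked else "good"


def careful_status(name, ca_certs=CA_CERTS):
    """Illustrative alternative: bad only if no unrevoked issuer cert remains."""
    issuer_key = INTERMEDIATES[name]
    candidates = [c for c in ca_certs if (c.subject, c.public_key) == issuer_key]
    return "good" if any(not c.revoked for c in candidates) else "bad"
```

With these placeholder values, `buggy_status("DomainSSL")` returns `"bad"` because the revoked cross-cert has displaced the root in the table, whereas `careful_status("DomainSSL")` returns `"good"` because the self-signed R1 root is still standing.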
The vendor of the faulty code was not named – please let us know if you know any better. The bad decision was pushed through a layer of load balancers and distribution systems, which browsers and other client-side software connect to. The changes were not transmitted in a uniform fashion, which spared some devices and computers from picking up the wrong revocation list. Browsers and other applications tend to check for revocations every few days, so if they weren't due to phone in for updates, they will have missed this week's drama.
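The timing side of that luck can be sketched in a few lines of Python. This is a deliberately crude model: it assumes a fixed four-day refresh interval, treats the whole of October 13 as the bad window, assumes the year is 2016, and ignores the uneven CDN propagation described above. The function names are invented for the example.

```python
from datetime import date, timedelta

REFRESH_INTERVAL = timedelta(days=4)  # assumed fixed interval between checks
BAD_DAY = date(2016, 10, 13)          # the Thursday in question; year assumed


def next_fetch(last_fetch, interval=REFRESH_INTERVAL):
    """Date on which a client is next due to refresh its revocation data."""
    return last_fetch + interval


def picked_up_bad_list(last_fetch, bad_day=BAD_DAY, interval=REFRESH_INTERVAL):
    """True if the client's next scheduled refresh lands on the bad day."""
    return next_fetch(last_fetch, interval) == bad_day


# A client that last checked on October 9 was due on the 13th and got stung;
# one that last checked on October 11 was not due until the 15th.
unlucky = picked_up_bad_list(date(2016, 10, 9))
lucky = picked_up_bad_list(date(2016, 10, 11))
```

Under those assumptions `unlucky` is true and `lucky` is false, and the stung client's following refresh falls on `next_fetch(BAD_DAY)`, which is October 17.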
The full technical details are described here by GlobalSign's team, along with additional support information and an FAQ for anyone still stiffed by the issue.
The biz warns it may take until early next week for the problem to be cleared up completely due to the amount of caching involved and the fact that applications tend to check for updates to revocation lists once every four days. If your software was stung on Thursday, it may not get the antidote until Monday. ®
PS: There are alternatives to GlobalSign for HTTPS certificates – such as Let's Encrypt, which hands out certs for free.