The Register® — Biting the hand that feeds IT

Google says it's sorry for Monday's hours-long Gmail delays

Dual networking outage won't happen again, honest

Google apologized on Tuesday for a networking glitch that kept emails from reaching many Gmail users' accounts for two hours or more.

"The message delivery delays were triggered by a dual network failure," Gmail site reliability engineer Sabrina Farmer wrote in a blog post. "This is a very rare event in which two separate, redundant network paths both stop working at the same time."

Email delivery broke down at around 5:54am Pacific time on Monday, Farmer said, and the online ad giant didn't get on top of the problem until around 1pm the same day. The full message backlog wasn't cleared until around 4pm.

The service interruption affected only around 29 per cent of messages passing through Gmail, Farmer said, and of those, the typical message was delayed by just 2.6 seconds. But some messages were left hanging much longer, and in the worst cases – about 1.5 per cent of the total – they were delayed for more than two hours. In addition, some users who tried to download large attachments from their Gmail accounts experienced errors.

A contrite Farmer expressed Google's regrets. "We realize that our users rely on Gmail to be always available and always fast, and for several hours we didn't deliver," she wrote.

But to be fair, she said, even several hours' worth of spotty networking had negligible impact on Gmail's overall uptime stats. "Gmail remains well above 99.9% available," she wrote, "and we intend to keep it that way!"
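For a rough sense of how a bad day fits inside a 99.9 per cent figure, the arithmetic can be sketched as below. This is a back-of-the-envelope illustration only: Google has not said how it measures Gmail availability, so the one-year window, the function names, and the idea of weighting the incident by the share of messages affected are all assumptions made here.

```python
# Back-of-the-envelope availability arithmetic. The measurement model here is
# an assumption for illustration; Google has not said how it computes uptime.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_budget_hours(availability, period_hours=HOURS_PER_YEAR):
    """Hours of total outage an availability target tolerates over a period."""
    return (1.0 - availability) * period_hours


def availability_after_incident(incident_hours, affected_fraction,
                                period_hours=HOURS_PER_YEAR):
    """Availability over the period if one incident is the only degradation.

    Assumes (hypothetically) that the incident counts as downtime only in
    proportion to the fraction of traffic it affected.
    """
    lost_hours = incident_hours * affected_fraction
    return 1.0 - lost_hours / period_hours


if __name__ == "__main__":
    budget = downtime_budget_hours(0.999)
    print(f"99.9% over a year allows about {budget:.2f} hours of outage")

    # Figures from the article: trouble began around 5:54am and the backlog
    # cleared around 4pm (roughly 10.1 hours); about 29% of messages affected.
    incident_hours = 10.1
    result = availability_after_incident(incident_hours, 0.29)
    print(f"Yearly availability under this model: {result:.4%}")
```

Under these assumptions the incident costs roughly three "full outage" hours against a yearly allowance of about 8.76. Counting all 10.1 hours as a complete outage instead would put the yearly figure at about 99.88 per cent, so the result depends heavily on how the weighting is done.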

Users weren't locked out of their Gmail accounts during the incident, and they were able to read email that had already been delivered and even send new messages of their own.

Gmail has certainly dealt with worse. In 2012, a misconfigured sync server triggered a system-wide Gmail outage and caused the Chrome browser to spontaneously crash at the same time. And just last month, all of Google's services mysteriously vanished from the net at once, causing not just Gmail but 40 per cent of internet traffic to go dark for a few minutes.

Still, Farmer says Gmail will be updating its network capacity and adjusting its infrastructure so that mail delivery will be "more resilient," even in the event of a dual network failure. What's more, the Chocolate Factory will rejigger its internal practices so that the next time something like this happens, its engineering teams will be quicker to respond.

Shame on you, Google. Shame on you. ®
