Google's Postini Fail pinned on bad filter, hardware glitch

Oh, and 'malformed types of messages'

The extreme email delays that plagued users of Google's Postini message management service earlier this week were caused by a shoddy email-filter update and a power-related hardware failure involving the company's database storage servers.

Today, the Mountain View Chocolate Factory released an "incident report" to Postini users, saying the "severe mail flow issues" began at 11:30pm Pacific time on Monday and extended through at least 12:30am Pacific on Wednesday. That puts the email snafu past the 24-hour mark.

The report does not say how many users were affected. Google tells us the problem was limited to customers on Postini's "System 7," one of several systems running the hosted email security and spam-filtering service, but at least one customer says the problem extended to System 5 as well.

"My company is on System 5 and our email was pretty much non-existent until we switched to a backup system. Once we pulled Postini out of the loop, all of that deferred mail hit our system (along with quite a lot of spam)," said Russ Meyer of the US-based Midland Paper.

At one point, Google rerouted traffic to another data center, which could explain the delays seen by Meyer.

Unlike so many on System 7, however, Meyer and Midland never had problems visiting the service's web-based admin console, which Google switched off for some customers in an effort to boost mail flow.

On Monday evening, after Google's monitoring systems detected the problem, engineers rerouted mail traffic to what the company calls a secondary data center. But this didn't help. So they returned some of the traffic to the primary facility "to maximize processing resources." Then, at least for some users, they shut off the admin console and some other web interfaces in an effort to reduce the strain on those resources.

Eventually, Google engineers decided the problem was down to three things:

  • A new filter update appears to have inadvertently impacted the mail processing systems.
  • Unusual malformed types of messages triggered protracted scanning behavior, and its interaction with filter update affected mail delivery.
  • A power-related hardware failure with database storage servers reduced input/output rates. The latency in database access reduced our overall processing capacity.

Which sounds like two things to us. Surely, it's the service's duty to deal with "malformed types of messages" - whatever those are.

"The combination of these conditions resulted in high failure rates for mail processing and the deferral of new connections from sending mail servers," Google's report says.

On Tuesday evening, a day after the delays first hit, engineers replaced the faulty hardware - with help from the vendor - and at 11pm Pacific, Google says, database disk throughput returned to normal. Then, an hour later, Google removed the offending filter update, and according to the company, mail processing was back on track.

Google continued to process traffic across both data centers for another hour. The company does say, however, that users may still experience delays. "Although mail processing was at normal speed and capacity, some users may have seen delayed messages continue to arrive in their inboxes. These potential delays occur when the initial or subsequent delivery attempt is deferred and the sending server waits up to 24 hours before resending the same message." This explains complaints we received on Wednesday afternoon.
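For the curious, the lag Google describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the retry intervals, the function name `delivery_time`, and the dates are all hypothetical, and real sending servers follow their own retry schedules. It shows why a message deferred during an outage can turn up hours after the service itself has recovered.

```python
from datetime import datetime, timedelta

# Hypothetical retry schedule for a sending mail server (an assumption,
# not taken from Google's report): wait longer after each deferral.
RETRY_INTERVALS = [
    timedelta(minutes=15),
    timedelta(hours=1),
    timedelta(hours=4),
    timedelta(hours=24),
]


def delivery_time(first_attempt, service_recovered_at, intervals=RETRY_INTERVALS):
    """Return when a message gets through, or None if the retries run out.

    Any attempt made before `service_recovered_at` is treated as deferred,
    so the sender waits out the next interval before trying again.
    """
    attempt = first_attempt
    for wait in intervals:
        if attempt >= service_recovered_at:
            return attempt
        attempt += wait
    # Final attempt, made after the last wait in the schedule.
    return attempt if attempt >= service_recovered_at else None


# Illustrative dates only: a message first sent 15 minutes into an outage
# that is fixed roughly a day later.
sent = datetime(2000, 1, 3, 23, 45)
recovered = datetime(2000, 1, 5, 0, 0)
arrival = delivery_time(sent, recovered)
print("arrives", arrival - recovered, "after recovery")
```

With this made-up schedule, the message misses four attempts during the outage and only arrives on the attempt that follows the 24-hour wait, well after processing is back to normal.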

The report says no messages were bounced or deleted.

Originally, Google indicated the problem was limited to US users, but yesterday, the company acknowledged that at least some European users were affected as well. ®
