The Register® — Biting the hand that feeds IT

Facebook blames outage on internal config flaw

Cascading failure feedback loop calamity

Facebook has published a detailed explanation of an internal configuration flaw that left the site unavailable for around two and a half hours overnight - the social network's worst downtime in four years.

The outage stemmed from a cascading series of problems in an error-correction system, which fed into a feedback loop that could only be broken by cutting traffic to a database cluster and rebooting the site.
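The article does not spell out the mechanics of the loop, so the following is only a toy model under stated assumptions: clients that see an invalid cached configuration value all query the database, and an overloaded database returns errors that are themselves treated as invalid values. The function name `simulate` and all of its parameters are hypothetical, chosen purely for illustration.

```python
def simulate(clients, db_capacity, stored_value_valid, steps=5):
    """Return the number of database queries issued in each round.

    Toy model (an assumption, not Facebook's actual system):
    - the cache starts out holding a value that clients reject;
    - every client that rejects the cached value queries the database;
    - if the database is overloaded it returns errors, which are treated
      like invalid values, so the cache is never repaired.
    """
    cache_valid = False
    history = []
    for _ in range(steps):
        # Each client that sees an invalid cached value asks the database.
        queries = 0 if cache_valid else clients
        history.append(queries)
        overloaded = queries > db_capacity
        # The cache is only repaired when the database can answer and the
        # stored value is actually valid.
        if not overloaded and stored_value_valid:
            cache_valid = True
    return history


# Full traffic: the database never recovers, so the load never drops.
print(simulate(clients=1000, db_capacity=100, stored_value_valid=True, steps=3))
# Traffic cut below capacity: one round of queries, then the loop ends.
print(simulate(clients=50, db_capacity=100, stored_value_valid=True, steps=3))
```

In this sketch the loop sustains itself for as long as the query load exceeds what the database can serve, which is consistent with the article's point that only cutting traffic could break it.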

The social network apologised for the downtime, which affected servers worldwide, and promised to redesign the faulty system it used to correct configuration values to prevent future problems in the area. In the meantime, this system has been taken out of commission.

Facebook's statement can be found here. Arbor Networks' chart of traffic flowing to Facebook can be found here.

Thursday night's outage follows similar but less severe problems the day before. ®

It makes me cross

Each time Facebook publishes any kind of technical information, their post is plagued by hundreds of people commenting and claiming to know how to do it better or fix it. And generally they're talking absolute crap.

Facebook may not be perfect but they know what they're doing.

I'm not sure why it makes me so cross - but it does. It really does!

7
1
Anonymous Coward

get a life ?

I laugh at all the people telling facebook users to "get a life" etc.

Ha. Ha. Ha.

I got exactly the same sort of glib condescending shyte from people when I used Cix for the first time, fidonet or indeed email/www.

5
1

Standard solution

I am not particularly gifted in the ways of multi-datacentre server management, but the ultimate solution appeared to be to turn it off and on again.

Outsourced to Renholm Industries?

2
0

Mr.

Facebook uses Akamai for the static files, such as photos, images, etc.

They don't use it for the main www site normally.

However, yesterday, during the outage, they changed the DNS entry for www.facebook.com to point to:

root@northway# host www.facebook.com.

www.facebook.com is an alias for sorry.ak.facebook.com.edgesuite.net.

sorry.ak.facebook.com.edgesuite.net is an alias for a1030.g.akamai.net.

a1030.g.akamai.net has address 92.122.127.27

a1030.g.akamai.net has address 92.122.127.33

As they said, they needed to stop all traffic to fix the problem, so temporarily diverting to their network of Akamai servers seems to have been the way they chose to do it.
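For anyone who wants to trace the alias chain programmatically, here is a minimal sketch that parses the `host` output quoted above and follows the aliases to the final addresses. It works only on the pasted text and makes no live DNS queries; the helper names `parse_host_output` and `follow_chain` are made up for this example.

```python
import re

# The `host` output exactly as quoted in the comment above.
HOST_OUTPUT = """\
www.facebook.com is an alias for sorry.ak.facebook.com.edgesuite.net.
sorry.ak.facebook.com.edgesuite.net is an alias for a1030.g.akamai.net.
a1030.g.akamai.net has address 92.122.127.27
a1030.g.akamai.net has address 92.122.127.33
"""


def parse_host_output(text):
    """Split `host` output into an alias map and an address map."""
    aliases = {}
    addresses = {}
    for line in text.splitlines():
        line = line.strip()
        # Alias targets are printed with a trailing dot; drop it.
        m = re.match(r"(\S+) is an alias for (\S+?)\.?$", line)
        if m:
            aliases[m.group(1)] = m.group(2)
            continue
        m = re.match(r"(\S+) has address (\S+)$", line)
        if m:
            addresses.setdefault(m.group(1), []).append(m.group(2))
    return aliases, addresses


def follow_chain(name, aliases, addresses):
    """Follow aliases from `name`; return the chain and final addresses."""
    chain = [name]
    # Stop at the first name with no alias, or if an alias would loop back.
    while chain[-1] in aliases and aliases[chain[-1]] not in chain:
        chain.append(aliases[chain[-1]])
    return chain, addresses.get(chain[-1], [])


aliases, addresses = parse_host_output(HOST_OUTPUT)
chain, ips = follow_chain("www.facebook.com", aliases, addresses)
print(" -> ".join(chain))
print(ips)
```

The chain ends at an akamai.net name, and the `sorry` label in the middle hop suggests a holding page rather than the normal site.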

2
0
Anonymous Coward

clearly

Superman was spinning it during the downtime.

2
0
