Feeds

Facebook knuckle-raps Intel, AMD

'The server vendors have failed us'

The Essential Guide to IT Transformation

While social networking site Facebook doesn't seem inclined to pull a Google and build all of its own servers, the company's top techies are frustrated enough with the current crop of x64 boxes that they may be giving the idea some thought.

Speaking at the Structure 09 conference in San Francisco yesterday, Jonathan Heiliger, vice president of technical operations at Facebook, first expressed his disappointment with chip makers Intel and AMD and then server vendors in addressing the infrastructure challenges the company faces. (You can see a video of Heiliger's chat here).

"The biggest thing that surprised us - or is about to surprise us - is the less-than-anticipated performance gains from new microarchitectures," Heiliger explained in a Q&A session with GigaOM's Om Malik, referencing the new chips from Intel and AMD in particular. "The performance gains they are touting in the press, we are not seeing in our applications. We are literally in real-time trying to figure out why that is and if there are optimizations that we can do. Otherwise, we are kind of left with current-generation technology and current-generation scale."

Of course, it's unlikely that Facebook will return to using systems from one or two generations ago. But the situation may point out that the performance gains that Intel and AMD are showing on tests may not translate well into the heavily customized Facebook stack, which is written in PHP and uses memcached as a temporary cache store for data serving up its pages and MySQL as a permanent store for that data behind the caching servers. Or - and this seems more likely - it may simply point out the limits of the open source software Facebook has chosen to run its site.

Both MySQL and memcached are not particularly good at scaling on many cores and threads, which is why companies like Facebook scale horizontally in the first place. The scalability issue is why a number of vendors have come out recently with multithreaded versions of memcached (the appliance made by Schooner, which launched in April comes to mind), and it is also why Sun Microsystems has been touting that its future MySQL 8.4 (or rather, Oracle's future MySQL 5.4) will scale across 16 processor threads, whereas the current MySQL 5.1 only spans four threads.

With the new two-socket x64 boxes having 12 threads for an Istanbul Opteron and 16 threads for a Nehalem EP Xeon, it should not be all that much of a surprise to Heiliger that new iron is not making Faceook's caching and database tiers run faster. The software can't use the threads. And the only reason that MySQL 5.4 will be able to span up to 16 threads is because Google is contributing code to the MySQL project. (I have no idea how PHP is or is not making use of all those extra threads in two-socket Nehalem EP and Istanbul servers).

With regard to the other pet peeve that Facebook has about its server infrastructure - power consumption - Heiliger heaped a certain amount of scorn on the world's server makers, who are no doubt constantly knocking on his door to try to win the next 10,000 server deal that Facebook does.

"I am not sure whether to be embarrassed or pleased for the OEM and system vendors in the audience," Heiliger said, "but you guys just don't get it." Heiliger said that it was not enough to just have servers with low-voltage chips and more efficient power supplies, but that server makers had to do what Google has done, which is to trim out every watt it can and minimize the wall power that a server consumes as it does its work.

"I am not sure why the server vendors have failed us," Heiliger added when asked why Facebook wasn't getting the machines it really wanted to buy. But he conceded that there is a bit of a chicken-and-egg problem in that server makers have to design for the belly of the market and that they have come around to doing custom designs for hyperscale customers like Facebook for orders of tens of thousands of nodes at a time.

Heiliger's own advice for providing scalability - the kind that a hyperscale company like Facebook needs - was simple, and it is fair to say that maybe Facebook needs to take its own advice a little more literally to be happy. "Pick an area where you are going to own - whether it is the application or the software infrastructure or the hardware infrastructure - and start with that area," Heiliger advised. "As you grow and as you scale, you can add the other two. You can start anywhere, but I would probably recommend starting with the application first."

Following this advice, it would seem that Facebook needs to buy lots more new x64 servers and to get down into the guts of its open source code and make it more scalable. Or design its own machines that give the best performance per watt on its existing code using older or at least different iron. Might I suggest some racks of Nano or Atom servers to start? ®

The Essential Guide to IT Transformation

More from The Register

next story
Sysadmin Day 2014: Quick, there's still time to get the beers in
He walked over the broken glass, killed the thugs... and er... reconnected the cables*
Auntie remains MYSTIFIED by that weekend BBC iPlayer and website outage
Still doing 'forensics' on the caching layer – Beeb digi wonk
Multipath TCP speeds up the internet so much that security breaks
Black Hat research says proposed protocol will bork network probes, flummox firewalls
Microsoft says 'weird things' can happen during Windows Server 2003 migrations
Fix coming for bug that makes Kerberos croak when you run two domain controllers
Cisco says network virtualisation won't pay off everywhere
Another sign of strain in the Borg/VMware relationship?
Forrester says Australia, not China, is next boom market for cloud
It's cloudy but fine down under, analyst says
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Boost IT visibility and business value
How building a great service catalog relieves pressure points and demonstrates the value of IT service management.
Why and how to choose the right cloud vendor
The benefits of cloud-based storage in your processes. Eliminate onsite, disk-based backup and archiving in favor of cloud-based data protection.
The Essential Guide to IT Transformation
ServiceNow discusses three IT transformations that can help CIO's automate IT services to transform IT and the enterprise.
Maximize storage efficiency across the enterprise
The HP StoreOnce backup solution offers highly flexible, centrally managed, and highly efficient data protection for any enterprise.