Facebook Data Center: If it won't run ARM, what will it run?

Zuckerberg to unmask shiny new backend

The case for Intel

When you consider this memory limitation – and the fact that Facebook has a close relationship with Dell – it's unlikely the social networking giant will embrace SeaMicro anytime soon. Tilera machines could happen, but there are obstacles here as well.

With most hyperscale data-center operators, such as Google, Facebook, Yahoo!, and Microsoft, the idea is to standardize on a very limited number of servers with precise configurations. By standardizing hardware, workloads can move more fluidly across machines. Plus, support is easier. For similar reasons, telecommunications companies have for decades rigidly standardized their switching equipment, often ignoring new technology improvements in favor of simplicity.

Given this, Facebook may be hesitant to introduce a new instruction set from someone like Tilera. The economics of the platform would have to be dramatically better than the x86 architecture. It would require corralling a portion of Facebook's workloads on Tilera machines and making them less portable and usable on the rest of its servers.

That said, if Facebook has a new workload in mind, a workload that isn't going to run across all data centers anyway, a Tilera architecture might work just fine. Let's say, for instance, that Facebook wants to launch a, well, Google-battling search engine. A data center of Tilera boxes might be better at this particular task than its existing x86 machinery in terms of performance per watt or dollars per performance per watt.

But we're guessing that Facebook is moving to Intel micro servers. With a micro server, you put four sticks of memory and a single socket onto a small motherboard and then cram as many as you can into a rack chassis that has shared power and dedicated disks for each server node. This is not to be confused with a blade server. It doesn't have a shared backplane for linking to networking and a central management coprocessor in the blade chassis. And if we're lucky, micro servers will adhere to the SSI Forum standard, so that you can mix and match models from disparate vendors. Blade servers never developed a standard.

At Intel's press event, Coglitore said that Facebook would not move to a 32-bit memory footprint, which limits you to 4GB of memory per server node. This, he said, does not make sense for existing Facebook workloads. That precludes the use of SeaMicro's Atom-based machines or any ARM processor. But the second generation of Tile-Gx processors from Tilera has 64-bit addressing, so this is at least a possibility. And the single-socket Xeon E3 chip offers 64-bit memory addressing, so it can support 64GB of memory in four slots using 16GB memory sticks.
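The arithmetic behind those two figures is easy to check. The sketch below is ours, not Intel's or Facebook's: the function names are made up for illustration, and it treats the 4GB ceiling as the plain consequence of a 32-bit byte address. Real chips often wire up fewer physical address bits than their architecture allows, so the 64-bit figure is a theoretical ceiling, not a shipping configuration.

```python
GIB = 2**30  # bytes in one binary gigabyte


def max_addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with a flat byte address of the given width."""
    return 2**address_bits


def node_capacity_gb(slots: int, stick_gb: int) -> int:
    """Installed memory for a node with identical sticks in every slot."""
    return slots * stick_gb


# A 32-bit address reaches 2**32 bytes, which is the 4GB limit cited above.
print(max_addressable_bytes(32) // GIB)  # 4

# Four slots filled with 16GB sticks, as in the Xeon E3 example.
print(node_capacity_gb(slots=4, stick_gb=16))  # 64
```

The point of the comparison: the 32-bit node tops out at one-sixteenth of what a four-slot Xeon E3 board can hold, before any workload question comes into it.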

Brawny v Wimpy

Facebook has long leased data center space from third parties, but in January 2010 it announced that it would build its own facility in Prineville, and the primary aim of the facility is to improve efficiency.

"We are designing a facility that will be highly efficient and cost-effective for our operations today and into the future," Jonathan Heiliger wrote in a blog post at the time. "As our user base continued to grow and we developed Facebook into a much richer service, we reached the point where it was more efficient to lease entire buildings on our own. We are now ready to build our own."

Facebook has said it will use outside air and evaporated water to cool the facility, forgoing chillers. But it only stands to reason that the company will seek to improve efficiency at the system level as well.

The company is following in the footsteps of Google. Mountain View now runs around 36 custom-built data centers across the globe, including at least one that runs without chillers, and it has long built its own servers and racks in an effort to improve efficiency.

But Google uses nothing but commodity hardware, including standard server chips from Intel and AMD. And in a paper published last year, Google senior vice president of operations Urs Hölzle warned against taking parallelization too far, downplaying the benefits of extreme multi-core processors. He said that chips that spread workloads across more energy-efficient but slower cores may not be preferable to chips with faster but power-hungry cores.

"Slower but energy efficient 'wimpy' cores only win for general workloads if their single-core speed is reasonably close to that of midrange 'brawny' cores," he said, warning that wimpy cores run into Amdahl's law, which says that when you parallelize only part of a system, there is a limit to performance improvement.

"So why doesn’t everyone want wimpy-core systems?" Hölzle wrote. "Because in many corners of the real world, they’re prohibited by law – Amdahl’s law. Even though many Internet services benefit from seemingly unbounded request- and data-level parallelism, such systems aren’t above the law. As the number of parallel threads increases, reducing serialization and communication overheads can become increasingly difficult. In a limit case, the amount of inherently serial work performed on behalf of a user request by slow single-threaded cores will dominate overall execution time."
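Hölzle's argument can be put into numbers with a back-of-the-envelope sketch. What follows is our illustration, not anything from his paper: the core counts, relative core speeds, and the 20 per cent serial fraction are invented, and the model assumes the parallel part of a request divides perfectly across cores with no communication overhead, which flatters the wimpy side.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when only the
    non-serial part of the work is spread across `cores`."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def request_time(serial_fraction: float, cores: int, core_speed: float) -> float:
    """Time to serve one request, in units of 'one brawny core doing
    all the work alone'. core_speed is relative to a brawny core (1.0)."""
    return (serial_fraction + (1.0 - serial_fraction) / cores) / core_speed


# Hypothetical nodes with the same aggregate throughput (cores * speed = 4).
SERIAL = 0.2  # assumed share of each request that cannot be parallelized
brawny = request_time(SERIAL, cores=4, core_speed=1.0)
wimpy = request_time(SERIAL, cores=16, core_speed=0.25)

print(f"brawny node: {brawny:.2f}")
print(f"wimpy node:  {wimpy:.2f}")
print(f"speedup ceiling with unlimited cores: {1.0 / SERIAL:.1f}x")
```

With these made-up inputs the brawny node finishes a request in about 0.4 time units and the wimpy node in about 1.0, even though both have the same total compute on paper. The serial slice runs at single-core speed, and that is where the slow cores lose.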

In publicly backing Tilera and SeaMicro, Facebook's Heiliger seems to take a different view. We find it hard to believe the company will go that far, but stranger things have happened. And the company has said it's introducing a software change as well. One thing's for sure: it wouldn't even be semi-accurate to say the company is moving to ARM. ®
