Tilera routs Intel, AMD in Facebook bakeoff

Social Memcached marketing


Facebook may not think of itself as a social marketing company, but for upstart server-chip maker Tilera, the social media giant's internal Memcached bakeoff pitting Xeon and Opteron machines against Tilera boxes is a marketing windfall, indeed.

Facebook's Memcached performance paper, being presented at the International Green Computing Conference in Orlando, Florida, details how Facebook tested the mettle (surely metal?) of the current generation of TilePro64 many-cored processors against off-the-shelf servers using Intel Xeon and AMD Opteron processors.

Tilera, SeaMicro, and Calxeda have been waving the microserver banner for Hadoop data munching, Memcached Web caching, and other hyperscale Internet workloads. For these distributed jobs, having a big, fat, powerful processor core is not always as important as having smart interconnects and core designs.

SeaMicro, which builds a dense box that crams 768 cores of dual-core, 64-bit Atom processors into a 10U chassis, recently showed off some Hadoop unstructured data crunching it had done on its machines, and compared it to the work that could be performed on plain-vanilla Xeon boxes. The SM10000 server it tested, running a real-world Hadoop workload at a customer site, could do the job for about 25 per cent less money than a cluster of Xeon servers, in one quarter of the rack space, and burning one quarter of the juice.

Memcached was created in 2003 by Danga Interactive as a distributed Web cache that stores data in main memory and makes it accessible to Web servers and applications. Memcached is what is called a key-value store, and it is now used by Facebook, Twitter, Zynga, YouTube, Reddit, Flickr, and a slew of hyperscale internet companies that need to serve up data to millions of users, and can't wait for disk drives to do the job.
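For readers who have not met the term, the key-value idea can be sketched in a few lines of Python. This is an illustration only, not Memcached's code or its client API: the `TinyCache` class, its method names, and the `max_items` limit are all made up for this example, and the least-recently-used eviction is a simplification of what a memory-bound cache does when it fills up.

```python
from collections import OrderedDict


class TinyCache:
    """Hypothetical in-memory key-value cache with LRU eviction (illustration only)."""

    def __init__(self, max_items=1024):
        self._items = OrderedDict()   # keys kept in least- to most-recently-used order
        self._max_items = max_items   # assumed cap; a real cache is bounded by memory

    def set(self, key, value):
        # Mark an existing key as most recently used before overwriting it.
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        # When over capacity, drop the least recently used entry.
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)

    def get(self, key):
        # A miss returns None; the caller would then fall back to the slow store.
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]


cache = TinyCache(max_items=2)
cache.set("user:1", "alice")
cache.set("user:2", "bob")
cache.get("user:1")            # touch user:1 so user:2 becomes the oldest entry
cache.set("user:3", "carol")   # over capacity: user:2 is evicted
```

The point is that every lookup is a memory access keyed by a string, with no disk in the path, which is why the workload leans on memory and network handling more than on raw core speed.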

"Facebook is known as the king of Memcached, and they run the most Memcached servers in the world, as far as we know," Ihab Bishara, director of cloud computing applications at Tilera, tells El Reg. "This is a tier-one customer validating the claims that we have been making for the past year and a half."

Bishara is not authorized to talk about Facebook's server plans or if the company has already installed the Tilera servers made by Quanta Computer inside its production infrastructure. Quanta is, of course, the Taiwanese PC and server manufacturer that has just teamed up with Facebook to help it manufacture its homegrown, open source Open Compute servers, which debuted in April with its Prineville, Oregon data center and which will be updated this summer when new Xeon E5 and Opteron 6200 processors are announced by Intel and AMD.

Facebook did its Memcached tests on the Quanta QS2 rack server – also known as the QSSC-X5-2Q – which crams 512 cores across eight processors in a 2U rack-mounted chassis.

Each processor is implemented as a single node, so the Quanta server is really an eight-node microserver. Four of the cores on the 32-bit TilePro64 processor were allocated to run Linux, leaving the other 60 cores to run the Memcached workload. The cores, which are widely believed to be a derivative of the MIPS architecture, ran at 866MHz and have several mesh interconnects to glue together memory and I/O across the cores. (See this story for details on the Tile family of chips.) The TilePro64 server node had 32GB of main memory.

Facebook lined up the Tilera-based Quanta servers against a number of different server configurations making use of Intel's four-core Xeon L5520 processors running at 2.27GHz and AMD's eight-core Opteron 6128 HE processors running at 2GHz. Both of these x64 chips are low-voltage, low-power variants. Facebook ran the tests on single-socket 1U rack servers with 32GB and on dual-socket 1U rack servers with 64GB.

All three machines ran CentOS Linux with the 2.6.33 kernel and Memcached 1.2.3h.

There's a lot of very detailed Memcached performance information in the Facebook paper that describes how the TCP and UDP protocols affect performance on these various machines, but this graph is a good snapshot of how the machines stack up:

[Figure: Memcached performance on Opteron, Xeon, and TilePro64 servers]

As you can see, the x64 servers do not scale well on Memcached when measured in transactions per second. On the Opteron machines, for example, going beyond four cores actually hurts performance, and adding a second CPU gets you precisely nowhere.

The Xeon chips do a little bit better, but adding the second processor also gets you nothing. It would be better to scale up multiple single-socket Opteron or Xeon nodes – as Quanta is doing with the Tilera chips.

But what is immediately obvious is that – at least compared to low-power, low-core Opterons and Xeons – the TilePro64 with 30 cores can meet or beat what these x64 chips can do. And with 60 cores dedicated to Memcached, the TilePro64 crushes the x64 chips.

Obviously both Intel and AMD have more modern processors than these, and soon will have even newer ones. Tilera has just started sampling its Tile-Gx 3000 series of 64-bit, 36-core chips, which will eventually scale up to 100 cores, too.

Performance is only one of the system issues a company like Facebook has to deal with; the others include electricity use and thermals (two sides of the same coin), physical size, and cost. Facebook shed some light on the power use in its paper, too. Based on the performance estimates for the machines tested, here is how the machines stack up in terms of electricity consumed:

[Figure: Performance and power consumption of Tilera and x64 servers]

Based on these measurements, Facebook then extrapolated how many nodes it would take to build a 256GB Memcached cluster, and then looked at its performance and power efficiency – and Tilera's chips stomped Intel's and AMD's.

An eight-node Quanta server using the TilePro64 chips could handle 2.68 million TPS and burned 462 watts, delivering 5,801 TPS/watt. A four-node Opteron cluster could deliver 660,000 TPS on the Memcached workload while burning 484 watts, delivering a mere 1,363 TPS per watt. And a four-node Xeon (with 256GB in aggregate as well) delivered more oomph than the Opteron machines at 752,000 TPS, and also consumed less power at 400 watts. But those four Xeon machines could only deliver 1,880 TPS/watt – less than a third of the bang per watt of the TilePro64-based machine.
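The efficiency arithmetic is easy to check. Here is a minimal Python sketch that uses only the TPS and wattage figures quoted above; the dictionary labels and variable names are ours, not Facebook's.

```python
# (transactions per second, watts) for each 256GB cluster, as quoted in the article.
clusters = {
    "TilePro64, 8 nodes": (2_680_000, 462),
    "Opteron, 4 nodes": (660_000, 484),
    "Xeon, 4 nodes": (752_000, 400),
}

# Efficiency is simply throughput divided by power draw.
efficiency = {name: tps / watts for name, (tps, watts) in clusters.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:,.0f} TPS/watt")

# How the best x64 result compares with the Tilera box.
xeon_vs_tilera = efficiency["Xeon, 4 nodes"] / efficiency["TilePro64, 8 nodes"]
print(f"Xeon delivers {xeon_vs_tilera:.0%} of the TilePro64's TPS/watt")
```

The Opteron quotient is about 1,363.6, so it rounds to 1,364 rather than the 1,363 quoted above, which appears to be truncated. The Xeon-to-Tilera ratio works out to roughly 0.32, which is indeed just under a third.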

And to top it all off, the TilePro64 machine only took up 2U of space, compared to 4U for the x64 boxes. ®
