Tilera routs Intel, AMD in Facebook bakeoff

Social Memcached marketing


Facebook may not think of itself as a social marketing company, but for upstart server-chip maker Tilera, the social media giant's internal Memcached bakeoff pitting Xeon and Opteron machines against Tilera boxes is a marketing windfall, indeed.

Facebook's Memcached performance paper, being presented at the International Green Computing Conference in Orlando, Florida, details how Facebook tested the mettle (surely metal?) of the current generation of TilePro64 many-cored processors against off-the-shelf servers using Intel Xeon and AMD Opteron processors.

Tilera, SeaMicro, and Calxeda have been waving the microserver banner for Hadoop data munching, Memcached Web caching, and other hyperscale Internet workloads. For these distributed jobs, having a big, fat, powerful processor core is not always as important as having smart interconnects and core designs.

SeaMicro, which builds a dense box based on dual-core, 64-bit Atom servers that crams 768 cores into a 10U chassis, recently showed off some Hadoop unstructured data crunching it had done on its machines, and compared it to the work that could be performed on plain-vanilla Xeon boxes. The SM10000 server it tested, running a real-world Hadoop workload at a customer site, could do the job for about 25 per cent less money than a cluster of Xeon servers, in one quarter of the rack space, and burning one quarter of the juice.

Memcached was created in 2003 by Danga Interactive as a distributed Web cache that stores data in main memory and makes it accessible to Web servers and applications. Memcached is what is called a key-value store, and it is now used by Facebook, Twitter, Zynga, YouTube, Reddit, Flickr, and a slew of hyperscale internet companies that need to serve up data to millions of users, and can't wait for disk drives to do the job.
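For readers who have not met a key-value store, a toy in-process sketch in Python gives the flavour. This is not Memcached's code or its network protocol: the `TinyCache` class, its method names, and the item limit are all assumptions made for illustration. Least-recently-used eviction stands in here for what a memory-bound cache has to do when it fills up.

```python
from collections import OrderedDict


class TinyCache:
    """Toy in-memory key-value cache with LRU eviction (illustrative only)."""

    def __init__(self, max_items=3):
        self.max_items = max_items      # hypothetical capacity limit
        self._data = OrderedDict()      # insertion order doubles as recency order

    def set(self, key, value):
        # Store the value and mark the key as most recently used.
        self._data[key] = value
        self._data.move_to_end(key)
        # Evict the least recently used entry once the cache is over capacity.
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)

    def get(self, key):
        # A miss returns None; a real caller would then go to the database.
        if key not in self._data:
            return None
        self._data.move_to_end(key)
        return self._data[key]


cache = TinyCache(max_items=2)
cache.set("user:1", "alice")
cache.set("user:2", "bob")
cache.get("user:1")            # touch user:1, so user:2 is now the oldest entry
cache.set("user:3", "carol")   # cache is full, so user:2 is evicted
```

The point of the real thing is that the lookups come out of main memory on a pool of cache servers rather than off a disk-backed database.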

"Facebook is known as the king of Memcached, and they run the most Memcached servers in the world, as far as we know," Ihab Bishara, director of cloud computing applications at Tilera, tells El Reg. "This is a tier-one customer validating the claims that we have been making for the past year and a half."

Bishara is not authorized to talk about Facebook's server plans or if the company has already installed the Tilera servers made by Quanta Computer inside its production infrastructure. Quanta is, of course, the Taiwanese PC and server manufacturer that has just teamed up with Facebook to help it manufacture its homegrown, open source Open Compute servers, which debuted in April with its Prineville, Oregon data center and which will be updated this summer when new Xeon E5 and Opteron 6200 processors are announced by Intel and AMD.

Facebook did its Memcached tests on the Quanta QS2 rack server – also known as the QSSC-X5-2Q – which crams 512 cores across eight processors in a 2U rack-mounted chassis.

Each processor is implemented as a single node, so the Quanta server is really an eight-node microserver. Four of the cores on the 32-bit TilePro64 processor were allocated to run Linux, leaving the other 60 cores to run the Memcached workload. The cores, which are widely believed to be a derivative of the MIPS architecture, ran at 866MHz and have several mesh interconnects to glue together memory and I/O across the cores. (See this story for details on the Tile family of chips.) The TilePro64 server node had 32GB of main memory.

Facebook lined up the Tilera-based Quanta servers against a number of different server configurations making use of Intel's four-core Xeon L5520 processors running at 2.27GHz and AMD's eight-core Opteron 6128 HE processors running at 2GHz. Both of these x64 chips are low-voltage, low-power variants. Facebook ran the tests on single-socket 1U rack servers with 32GB and on dual-socket 1U rack servers with 64GB.

All three machines ran CentOS Linux with the 2.6.33 kernel and Memcached 1.2.3h.

There's a lot of very detailed Memcached performance information in the Facebook paper that describes how the TCP and UDP protocols affect performance on these various machines, but this graph is a good snapshot of how the machines stack up:

[Figure: Memcached performance on Opteron, Xeon, and TilePro64 servers]

As you can see, the capacity in transactions per second is not very good for the x64 servers when it comes to Memcached scalability. On the Opteron machines, for example, going beyond four cores actually hurts performance and adding a second CPU gets you precisely nowhere.

The Xeon chips do a little bit better, but adding the second processor also gets you nothing. It would be better to scale out across multiple single-socket Opteron or Xeon nodes – as Quanta is doing with the Tilera chips.

But what is immediately obvious is that – at least compared to low-power, low-core Opterons and Xeons – the TilePro64 with 30 cores can meet or beat what these x64 chips can do. And with 60 cores dedicated to Memcached, the TilePro64 crushes the x64 chips.

Obviously both Intel and AMD have more modern processors than these, and soon will have even newer ones. Tilera has just started sampling its Tile-Gx 3000 series of 64-bit, 36-core chips, which will eventually scale up to 100 cores, too.

Performance is only one of the system issues with which a company like Facebook has to deal. The others include electricity use and thermals (two sides of the same coin), physical size, and cost. Facebook shed some light on the power use in its paper, too. Based on the performance estimates for the machines tested, here is how the machines stack up in terms of electricity consumed:

[Figure: Performance and power consumption of Tilera and x64 servers]

Based on these measurements, Facebook then extrapolated how many nodes it would take to build a 256GB Memcached cluster, and then looked at its performance and power efficiency – and Tilera's chips stomped Intel's and AMD's.
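The node counts behind that extrapolation fall out of simple division. The sketch below is a reconstruction, not Facebook's method: it assumes the x64 cluster is built from the 64GB dual-socket servers described earlier, and that each TilePro64 node contributes its 32GB. The variable names are made up for illustration.

```python
import math

TARGET_GB = 256  # cluster size quoted in the article

# Memory per node, taken from the test configurations described in the article.
# Using the dual-socket 64GB box for the x64 side is an assumption.
memory_per_node_gb = {
    "TilePro64 node": 32,
    "dual-socket x64 server": 64,
}

# Round up, since a partial node cannot be deployed.
nodes_needed = {
    name: math.ceil(TARGET_GB / gb) for name, gb in memory_per_node_gb.items()
}

for name, count in nodes_needed.items():
    print(f"{name}: {count} nodes for {TARGET_GB}GB")
```

That works out to eight Tilera nodes, which is one Quanta chassis, against four x64 servers.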

An eight-node Quanta server using the TilePro64 chips could handle 2.68 million TPS and burned 462 watts, delivering 5,801 TPS/watt. A four-node Opteron cluster could deliver 660,000 TPS on the Memcached workload while burning 484 watts, delivering a mere 1,363 TPS per watt. And a four-node Xeon (with 256GB in aggregate as well) delivered more oomph than the Opteron machines at 752,000 TPS, and also consumed less power at 400 watts. But those four Xeon machines could only deliver 1,880 TPS/watt – less than a third of the bang per watt of the TilePro64-based machine.
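The efficiency figures are easy to check from the quoted throughput and power numbers. The few lines of Python below simply redo the division; the dictionary layout and names are illustrative, and the inputs are the article's figures rather than anything measured here.

```python
# Figures as quoted in the article: (transactions per second, watts).
clusters = {
    "TilePro64 (8 nodes)": (2_680_000, 462),
    "Opteron (4 nodes)": (660_000, 484),
    "Xeon (4 nodes)": (752_000, 400),
}

# Efficiency is throughput divided by power draw.
tps_per_watt = {name: tps / watts for name, (tps, watts) in clusters.items()}

for name, value in tps_per_watt.items():
    print(f"{name}: {value:,.1f} TPS/watt")

# How the Xeon cluster compares with the Tilera box, as a fraction.
xeon_vs_tile = tps_per_watt["Xeon (4 nodes)"] / tps_per_watt["TilePro64 (8 nodes)"]
print(f"Xeon delivers {xeon_vs_tile:.1%} of the TilePro64's TPS/watt")
```

The Opteron result comes out at about 1,363.6, so the 1,363 quoted above looks to be truncated rather than rounded. The Xeon ratio is a little over 32 per cent, which squares with "less than a third".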

And to top it all off, the TilePro64 machine only took up 2U of space, compared to 4U for the x64 boxes. ®
