Applied Micro leaps ahead in ARM server race
ARMed and extremely dangerous – to Intel, AMD
Applied Micro Circuits, a company known for networking chips and for dabbling a bit in embedded PowerPC processors, has aimed a haymaker of an ARM server chip right at the cloudy jaws of Intel and AMD.
What's more, the specs divulged by Applied Micro – if all works according to plan – suggest that the x86 chip makers might end up missing more than a few teeth. They could end up with broken jaws, as well.
Applied Micro's forthcoming X-Gene processor is based on the 64-bit ARMv8 architecture announced by ARM Holdings this week. Applied Micro has been working with ARM Holdings for the past three years to not only come up with an ARM chip suitable for modern, cloudy servers, but to make sure that Applied Micro is the first out the door with such a 64-bit chip.
That would put them ahead of Marvell, which bought Intel's Xscale ARM RISC processor business in 2006 and has server aspirations, and upstart Calxeda, which is expected to launch its own server-tuned ARM chip next week – and, if rumors are correct, with HP being the first to put together a server based on those chips.
Intel may not only eventually rue the day it sold off its Xscale biz to Marvell, but also the past few years when the chip giant insisted that there was nothing that ARM could do that Intel could not match with its x86 instruction set and its obvious edge in wafer baking tech, which is no doubt the envy of the world.
Alternatively, if Chipzilla can redesign the Atom – much as the Xeons were redesigned to give us relatively power efficient laptop chips scaled up for servers – Intel still has a chance to repel the ARM attack. It's tough to call those odds right now: Intel and its AMD sidekick versus the ARM legion.
What can be said, however, is that the server racket is spoiling for a chip fight like we haven't seen in many a year.
Paramesh Gopi, president and CEO of Applied Micro, is undaunted about attacking Intel in the server racket. He believes that the hyperscale server market is wide open because of the operating costs and inefficiencies of x86 servers.
Applied Micro CEO Paramesh Gopi banging on about x86
"We looked at this, and we wanted to figure out what would make a huge dent in this infrastructure," explained Gopi at the X-Gene chip launch event in Santa Clara on Thursday, where ARM Holdings is hosting its ARM TechCon 2011.
Cheeky, ain't that, holding the event in Intel's back yard?
"We talked to customers starting three years ago," he said, "and they were whispering, and the whispers turned into a shout, and the shout turned into a banging on the desk, as we are on the cusp of something beautiful and phenomenal. We have a way to fundamentally change the TCO, the sustainability coefficient, in scaling this infrastructure."
And that, says Gopi, means rethinking what a server is. And it also means turning ARM chips into real server processors. "No wimpy fabric. Zero compromise. Do not start with the baggage. Clean slate. And at 3GHz from the get-go. No wimpy cores. This is not a wimpy computer. This is an ARM on steroids."
Looks like Calxeda and Marvell have a little competition.
The Applied Micro Circuits X-Gene ARM server chip
The Applied Micro X-Gene is truly a system-on-a-chip; it has everything a server will need except main memory and external storage. The ARMv8 core at the heart of the chip supports quad-issue out-of-order execution, something all "non-wimpy" cores can do – Oracle finally added OOO to the Sparc T4 chip announced last month.
Applied Micro is not talking about how many cores will be on each chip, but it looks like it will first appear with two, especially considering all the other stuff that Applied Micro is cramming onto the SoC. The cores will have L1 and L2 caches per core, a shared L3 cache that spans the cores, and have a target clock speed of 3GHz.
The X-Gene chip will also include DDR3 main memory controllers, two 10 Gigabit Ethernet ports, SATA storage and PCI-Express peripheral controllers, and a power/management module – all on the same die as the cores.
Sips juice like an iPad
The X-Gene design aims to make a server behave more like an iPad, with idle power of under 500 milliwatts per core and sleep mode of 300 milliwatts; active power is expected to be around 2 watts per core.
The chip will also have dynamic frequency scaling – what is generally called a turbo-boost mode – that allows the clock speed of the chip to be goosed when a job needs faster single-thread performance and there is enough spare thermal capacity to allow a core to run hot for a bit. Moreover, server vendors who adopt the X-Gene chip will be able to set the thermal design point and clock speeds in the chips to meet performance requirements and make the heat tradeoffs – and lock those TDP settings in.
But wait, there's more. The X-Gene has a fully non-blocking, 1Tb/sec interconnect, which can be used to feed data between multiple X-Gene sockets at 100Gb/sec speeds, and that provides nearly 80Gb/sec of aggregate bandwidth.
Here's the interesting bit: this non-blocking fabric allows an X-Gene server to scale from 2 to 128 cores in a fully cache coherent server image. So not only can the X-Gene be set up to be a baby symmetric multiprocessing server inside of a single chip, that SMP can be extended across multiple X-Gene chips – and in a glueless fashion that does not require extra chips.
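For a rough sense of scale, here is a back-of-the-envelope sketch that multiplies the per-core power targets quoted above across the 2-to-128-core range. It is only an illustration: the two-cores-per-chip figure is a guess rather than anything Applied Micro has confirmed, the function and constant names are made up for this example, and the numbers cover the cores alone, not memory, I/O, or the fabric.

```python
# Back-of-the-envelope core power for an X-Gene coherent server image.
# Per-core figures are the targets quoted in the article; they cover the
# cores only, so real system power would be higher.

ACTIVE_W_PER_CORE = 2.0   # "around 2 watts per core" (target)
IDLE_W_PER_CORE = 0.5     # "under 500 milliwatts per core" (upper bound)
CORES_PER_CHIP = 2        # ASSUMPTION: a guess, not a confirmed spec


def core_power(cores: int) -> dict:
    """Return chip count and core-only power estimates for `cores` cores."""
    chips = -(-cores // CORES_PER_CHIP)  # ceiling division
    return {
        "chips": chips,
        "active_watts": cores * ACTIVE_W_PER_CORE,
        "idle_watts_max": cores * IDLE_W_PER_CORE,
    }


if __name__ == "__main__":
    for cores in (2, 128):
        print(cores, core_power(cores))
```

Under those assumptions, a fully populated 128-core image would need 64 chips and draw roughly 256 watts across its cores when busy, and no more than 64 watts when idle.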
The X-Gene chip also has on-chip CPU and I/O virtualization, just like x86, Sparc, Power, and Itanium chips do. The architecture also allows for various kinds of offload engines to be plugged in and perhaps integrated on the chip package.
The chip is not yet ready, but Applied Micro has cooked up a board that simulates 128 cores and all the I/O features, and that can run Fedora or Ubuntu Linux; it's powered by a bunch of Virtex FPGAs, and is shown below:
The X-Gene simulation boards will be ready for partners in January 2012, and Applied Micro expects to have early silicon available for the real X-Gene chip in the second half of 2012 – that's anywhere from a year and a half to two years ahead of when ARM Holdings expects prototypes of the ARMv8 to appear in systems based on its own reference designs. Applied Micro is smart enough to know that the server industry can't wait that long for a 64-bit chip, even if smartphones and tablets can.
Applied Micro has tapped Taiwan Semiconductor Manufacturing Co as its foundry for the X-Gene chips, and will first etch the chips in TSMC's 40 nanometer processes, then follow up with kickers using its 28 nanometer processes. It seems likely that at the 28 nanometer node Applied Micro will significantly boost the X-Gene core count, maybe to four cores and maybe as many as six or eight. ®