Original URL: http://www.theregister.co.uk/2013/04/09/hp_moonshot_server_analysis/

'Til heftier engines come aboard, HP Moonshot only about clouds

And those engines will come – as will FPGAs, DSPs, GPUs ...

By Timothy Prickett Morgan

Posted in Cloud, 9th April 2013 06:04 GMT

Analysis The HP Moonshot hyperscale servers are not even fully launched, and Intel and Calxeda are already bickering about whose server node is going to be bigger and better when they both ship improved processors for the Moonshot chassis later this year. Other engines will be coming for the Moonshot machines, too, HP execs tell El Reg, and they will be sorely needed if the Moonshot boxes are to do real work across a wider range of software.

With the fairly limited performance of the dual-core "Centerton" Atom S1200 processors that were used in the initial "Gemini" server nodes announced on Monday, the machines are at this point relegated to dedicated hosting for very small server workloads and for modest web application serving.

HP may be running a portion of its hp.com website, which gets 3 million hits a day, on the Moonshot Atom S1200 iron, and it may be burning only 720 watts doing so, but that is a fairly tiny portion of the entire HP website.

It is going to take more powerful processors to do the heavy lifting of an ecommerce site or to run back-end applications for HP's own business. Or those of any other business, which is really the point. This is all about HP trying to get companies to buy its hyperscale servers, rather than build their own or go to Open Compute Project designs.

El Reg has no doubt that a single rack of Moonshot machines, which comes in at 47U because the Moonshot 1500 chassis is a non-standard 4.3U high, can do the webby or hosting work of eight racks of 1U rack servers with two Xeon or Opteron processors.

And when you do the math – assuming what we presume is pretty poor but nonetheless typical utilization on those two-socket x86 boxes – a rack of the Moonshot servers based on the Atom S1200 processors uses 89 per cent less energy and 80 per cent less space, at a cost per node that is 77 per cent lower.
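HP has not published the baseline configuration behind those percentages, but a back-of-the-envelope sketch shows the raw node counts in play. The chassis-per-rack and usable-U figures below are our assumptions, not HP's:

```python
# Moonshot side: 45 single-socket cartridges per 4.3U chassis in a 47U rack
chassis_per_rack = int(47 // 4.3)        # ten chassis, with a few U left over for switching
moonshot_nodes = chassis_per_rack * 45
print(moonshot_nodes)                    # 450 single-socket server nodes in one rack

# Conventional side: eight racks of 1U two-socket boxes, assuming ~40 usable U per rack
conventional_servers = 8 * 40
print(conventional_servers)              # 320 two-socket servers across eight racks
```

The energy, space, and cost deltas HP quotes hang off the utilization assumed for those two-socket boxes, which is why the node counts alone don't reproduce them.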

But again, that is for a pretty precise and not particularly heavy workload. No one is going to build a Hadoop cluster on the current Moonshot designs – at least not one that is more than a science project.

The day will come, however, when HP has the right engines to run heavier workloads. As Jim Ganthier, general manager of Industry Standard Servers and Software at HP, explained to El Reg, HP thinks it can add server cartridges, switching modules, or storage cartridges to the Moonshot boxes at an accelerated pace compared with the 18 to 24 month cadence of its ProLiant rack, BladeSystem blade, and SL6500 scalable systems machines. We're talking about new cartridges on a 4 to 8 month cadence, or about three times faster than what we are used to these days in x86 Server Land.

"You can come out with something at the speed of need," as Ganthier put it.

Part of the speedup Ganthier is talking about is an illusion that comes from having more than the one or two processor suppliers HP's servers rely on today. When you broaden the compute engines to include various ARM processors as well as digital signal processors, field programmable gate arrays, GPU coprocessors, and hybrid CPU-GPU chips, it is no wonder that the pace of innovation has to pick up.

Let's take a look under the Moonshot hood

It is not clear that HP needed the extra bit of space that pushed it into an oddball server chassis size and therefore a non-standard rack size; far more likely, HP figured out it could get away with 47U racks and worked backwards to come up with a server cartridge and chassis spec that delivered the maximum density of wimpy compute nodes.

Top view of the Moonshot 1500 chassis

The original, first-generation "Redstone" Moonshot machines from November 2011 were based on the 4U SL6500 chassis, which had four server trays. Using the 32-bit Calxeda quad-core ECX-1000 processors, each of the 18 server cards in a single tray could host four processors, each with four SATA ports and one memory stick. That gave you 288 server nodes in a 4U space, and it included the distributed Layer 2 switch to link the nodes together.

However, that did not include any storage, and if you wanted local storage on the nodes, you had to buy disk cards that slotted into the PCI-Express slots that made up the Redstone backplane, displacing server cards and roughly halving the node count. So call it 144 nodes in 4U, or 36 servers per rack unit.
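For the record, here is the arithmetic behind those Redstone density figures, sketched out rather than pulled from an HP spec sheet:

```python
# Redstone density, per the figures above
trays_per_chassis = 4      # the SL6500 chassis holds four server trays
cards_per_tray = 18        # server cards per tray
nodes_per_card = 4         # quad-core ECX-1000 SoCs per card

compute_only = trays_per_chassis * cards_per_tray * nodes_per_card
print(compute_only)                      # 288 nodes in 4U with no local storage

# Giving half the card slots over to disk cards halves the node count
with_storage = compute_only // 2
print(with_storage, with_storage / 4)    # 144 nodes in 4U, 36.0 per rack unit
```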

With the Moonshot 1500 chassis, the backplane slides into the bottom of the chassis from the front and snaps into the dual 1,200 watt power supplies at the rear, where the five dual-rotor, hot-plug fans that cool the server nodes also live. The chassis includes a chassis management module, which has a subset of the Integrated Lights-Out (iLO) server management controller used in ProLiant and BladeSystem machines.

A whole lotta options

The Moonshot 1500 chassis has two switch modules that run the length of the backplane, back to front. These are redundant Ethernet switches based on Broadcom Trident+ ASICs, so if you think HP cut a special deal with Intel for its Fulcrum Ethernet ASICs as well as for its Centerton Atoms (as I was expecting it to), you are wrong.

But that day will probably come. Gerald Kleyn, director of hyperscale server hardware R&D at HP, tells El Reg that the company is prepared to put all different kinds of switch fabrics at the heart of the Moonshot chassis – because it expects that as the computing in the chassis evolves, the networking needs will, too.

That could mean other Ethernet switch modules, or perhaps InfiniBand switches, if there is a demand for it. Kleyn is making no promises, except he says that HP is keeping an open mind and will add the networking features that customers require.

HP Moonshot switch module

The two Moonshot 45G Ethernet switch modules – known as A and B – are meant to be redundant for high availability and for load balancing, and together they provide an aggregate of 3.6Tb/sec of bandwidth. These Ethernet switch modules run at Gigabit Ethernet speeds, and each server cartridge can do four links to each of the two switches.
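To put a number on the server-facing side of that, here is what the edge bandwidth works out to if every one of the 45 cartridges lit up all four Gigabit links to both switch modules; the 3.6Tb/sec figure HP quotes is, as far as we can tell, the aggregate switching bandwidth of the modules themselves, a different and much larger measure:

```python
cartridges = 45
links_per_switch = 4       # Gigabit Ethernet lanes from each cartridge to each switch module
switch_modules = 2         # the redundant A and B modules
link_speed_gbps = 1

edge_bandwidth = cartridges * links_per_switch * switch_modules * link_speed_gbps
print(edge_bandwidth)      # 360 Gb/sec of server-facing link capacity
```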

The chassis actually implements a bunch of different networks across the backplane, the switch modules, and the PCI-Express slots that link the server cartridges to the chassis and to each other. The backplane also implements a 2D torus interconnect that can lash server nodes together in groups of three over so-called "north-south" direct links, the sort of thing an n-tier application setup calls for.

You can also use the 2D torus mesh implemented in the backplane to glue up to fifteen server nodes in an "east-west" configuration when the servers are sitting side-by-side and communicating (like in an infrastructure cloud, for instance, with virtual machines flitting around using live migration). This node-to-node 2D torus mesh offers 7.2 Tb/sec of bandwidth, which means you can link machines together with lots of oomph without having to rely on those Ethernet switches.
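To make the topology a little more concrete, here is a minimal sketch of how neighbour links work in a 2D torus. The grid dimensions and node numbering are illustrative assumptions on our part, not HP's actual backplane wiring, which it has not published in that kind of detail:

```python
def torus_neighbours(node, rows, cols):
    """Return the north/south/east/west neighbours of a node in a 2D torus.

    Nodes are numbered row-major and links wrap around at the edges, which
    is what makes it a torus rather than a plain mesh."""
    r, c = divmod(node, cols)
    return {
        "north": ((r - 1) % rows) * cols + c,
        "south": ((r + 1) % rows) * cols + c,
        "east": r * cols + (c + 1) % cols,
        "west": r * cols + (c - 1) % cols,
    }

# Purely illustrative: 45 cartridges laid out as a 3 x 15 grid, echoing the
# groups-of-three north-south and up-to-fifteen east-west figures above
print(torus_neighbours(0, rows=3, cols=15))
# {'north': 30, 'south': 15, 'east': 1, 'west': 14}
```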

The backplane also implements a storage interconnect for linking to storage cartridges. These storage nodes, which are not yet available, will house two 2.5-inch disk drives in either 500GB or 1TB capacities; later this year, HP will make available a 200GB SATA flash drive option. (Why all of this is not available now is a mystery, and you have to figure that there are still some issues in the storage interconnect, which is derived from HP's SmartStorage disk controllers.)

Some of the pins on the PCI-Express slots in the Moonshot chassis are used to provide power to the server nodes, just as is done with the SeaMicro servers from Advanced Micro Devices and the Redstone servers based on the Calxeda ECX-1000 chips. There is also a separate management network to control the servers and storage, so you are not trying to ship application data and management data over the same network.

The Gemini Atom S1200 server node that is the first compute element to ship in the Moonshot servers is based on a dual-core, four-thread Atom S1260 processor running at 2GHz. It has 24KB of L1 data cache and 32KB of L1 instruction cache per core, plus 512KB of L2 cache per core, for 1MB of L2 across the chip.

The chip – technically a system on a chip – has a PCI-Express 2.0 controller and a DDR3 main memory controller. The memory, which is on a SO-DIMM on the side opposite the disk drive (which HP has not shown in any of its pictures), runs at 1.33GHz, has 8GB of capacity, and has ECC scrubbing, which is required by server operating systems.

The S1200 has Intel's VT virtualization circuitry on the chip as well, so it can run modern server virtualization hypervisors. The Gemini Atom server cartridge has a dual-port Broadcom 5720 Gigabit Ethernet controller and a Marvell 9125 storage controller.

The Gemini Atom S1200 server node leaning on a soccer ball

At the moment Canonical's Ubuntu Server 12.04, Red Hat's Enterprise Linux 6.4, and SUSE Linux Enterprise Server 11 SP2 are supported on the Atom S1200 nodes. Support for Windows Server 2012 is expected in a few months, says Kleyn, but technically speaking, Windows Server 2012 will boot up on the node today. Support just means you can get tech support from Microsoft and HP when something goes wrong.

What's next for Moonshot?

There's nothing wrong with the Centerton Atom server cartridges announced with the Moonshot Gemini design on Monday, but they have only one processor per cartridge, which is not all that impressive considering how small an Atom chip is. The server nodes snap in from the top, just like the Ethernet switch modules do, and they are hot pluggable. But again, with 45 servers in a 4.3U space, you are only getting 10.5 servers per rack unit.

During the HP Moonshot webcast, Calxeda showed off a Gemini server cartridge that sported four Calxeda processors, which will get up to 180 server nodes in a 4.3U chassis, or just under 42 nodes per rack unit.

The forthcoming quad-node Gemini server from Calxeda

That is a little bit better than was possible with a mix of Calxeda processing and storage nodes in the original Redstone Moonshot machines. (It's about 17 per cent more computing per rack unit, if you do the math and assume a healthy mix of compute and storage nodes in the Redstone setup.)
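The same back-of-the-envelope math for the Gemini cartridges, using the density figures above:

```python
chassis_height_u = 4.3     # the Moonshot 1500 chassis, in rack units
cartridges = 45

atom_per_u = cartridges * 1 / chassis_height_u        # one Atom S1260 per cartridge
calxeda_per_u = cartridges * 4 / chassis_height_u     # four Calxeda SoCs per cartridge
print(round(atom_per_u, 1), round(calxeda_per_u, 1))  # 10.5 and 41.9 nodes per rack unit

# Against the 36 nodes per rack unit of a half-compute, half-storage Redstone rack
print(round((calxeda_per_u / 36 - 1) * 100))          # roughly 16 per cent more per rack unit
```

Push the assumed Redstone mix further toward storage nodes and the gap widens toward that 17 per cent figure.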

Karl Freund, vice president of marketing at Calxeda, tells El Reg in an email exchange that the server card shown above is running in its labs at 1.4GHz and will offer higher performance per node than the Atom S1260 node that was part of the launch today.

"While Intel is first with Atom, rest assured the ARMy is right behind them in the queue," writes Freund. "And really not far enough behind to matter – one to two quarters at most."

It was not clear from the presentation or the exchange with Freund whether this card is using the current 32-bit ECX-1000 with four Cortex-A9 cores, or the "Midway" quad-core chip that Calxeda is expected to deliver later this year with 40-bit memory addressing and support for hardware-based virtualization using Cortex-A15 cores.

The Midway chip is expected to deliver about 50 per cent more integer performance, about twice as much floating point performance, and four times the main memory capacity of the ECX-1000 chip. Calxeda will not get its 64-bit "Lago" ARM SoC into the field until 2014, which is when other vendors are expected to get theirs into the field, too.

Intel doesn't want to leave anyone with the impression that it is resting on its Atom laurels. Raejeanne Skillern, director of cloud marketing at Intel, put out a blog post in the wake of the Moonshot Gemini server launch reminding everyone that the "Avoton" Atom S Series processor is coming later this year.

That design will be based on the new "Silvermont" microarchitecture for Atom and will also use the 22 nanometer TriGate wafer-baking processes from Intel's fabs to boost both performance and performance-per-watt significantly. Skillern also confirmed in her post that Intel and HP will be able to cram four of the Avoton processors onto a single Gemini server cartridge, yielding the same socket density as the future Calxeda node.

The Avoton chip is sampling now and is expected to be available in systems in the second half of this year. And interestingly, it will have an "integrated Ethernet fabric controller" – what you and I would call a network interface port – on the SoC.

While these are interesting and better options in both cases, what Moonshot will probably need are brawnier x86 and ARM processors for heavier workloads, perhaps paired with FPGAs, DSPs, and GPUs. Kleyn made no promises, but said that the Moonshot architecture would certainly allow for double-wide and triple-wide server nodes and could, in theory, support two-socket x86 server nodes for larger workloads.

HP has made no promises to do this, of course. But a "Haswell" Xeon E5 seems to be a no-brainer for a single-socket node with more CPU oomph and memory, and a low-powered two-socket Gemini server node with a larger memory footprint using future "Ivy Bridge" Xeon E5 chips would probably come in handy, too. You can global replace Opteron 4400 and 6400 in there, as well. ®