Original URL: http://www.theregister.co.uk/2012/06/11/hot_chips_24_preview/

Intel rivals crash Hot Chips party with full-fat server silicon

Hey baby, wanna come upstairs and see my pipeline?

By Timothy Prickett Morgan

Posted in Hardware, 11th June 2012 10:18 GMT

After hogging most of the oxygen in the server market with its new Xeon E3 and E5 processors this spring, Intel is going to get a little competition this summer as its rivals in the server racket trot out their upcoming server processors at the Hot Chips 24 symposium.

The Hot Chips event, which is hosted jointly by the IEEE and the ACM, runs from August 27 to 29 in Cupertino, and is one of two big events held each year where CPUs for PCs, servers, and mobile devices, FPGAs, and network processors, as well as the latest chip manufacturing techniques such as die stacking and etching processes, are trotted out for bragging rights. (The other event, of course, is the International Solid-State Circuits Conference hosted by the IEEE early in the year.)

This particular Hot Chips looks to be an interesting one, as a slew of new server and PC processors and possible adjuncts will be talked about in detail for the first time.

Wednesday is the big day for server chips, with IBM divulging the details of its Power7+ processor for its Power Systems line of servers. IBM has not said much about what the Power7+ chip will include, but as El Reg has previously reported, we know that IBM is planning to implement the Power7+ in a 32 nanometer process, giving it a fairly large shrink compared to the current 45 nanometer Power7 chips, which have eight cores and 32MB of on-chip embedded DRAM L3 cache memory.

El Reg has been able to get its hands on a few roadmaps, but all they show is that Power7+ will have higher clock speeds, a "very large cache", and unnamed accelerators on the chips. The Power7 chips first started coming to market in midrange boxes in early 2010, and are due for a refresh to make them more competitive with the latest Xeon and Opteron processors from Intel and Advanced Micro Devices.

IBM is also showing off what it calls its third-generation zNext processor for mainframes, which presumably means it is talking about the z12 kicker to the current z11 processors used in the System zEnterprise 114 and 196 mainframes.

These came out in July 2010, and they are also getting a little long in the tooth and are due for an upgrade. IBM is not saying much about what its plans are for zNext v3, but the current z11 chip has four cores on a die, is implemented in 45 nanometers, and runs at a top speed of 5.2GHz.

It seems likely that IBM will goose the clock speed a bit as part of a move to 32 nanometers with its chip etching processes on the z12 chips and will also add more L2 and L3 cache on the die. If IBM can boost the L3 cache enough, it may be able to do away with the off-chip L4 cache and controller and thereby simplify the components that go into its mainframes.

Oracle and Fujitsu are also rolling out their latest Sparc processors for servers at the Hot Chips 24 event. Oracle will be showing off its Sparc T5 processor, which will sport 16 cores and will have the necessary circuitry on the chip to glue up to eight of them together in a single system image in a NUMA configuration with one hop between processors. To get a one-hop NUMA connect with four sockets, you need three NUMA ports per chip, and this is something that you can do with the current Opteron 6200 and Xeon E5-4600 processors.

To do a one-hop link between eight sockets, you need seven ports coming out of each chip and a total of 56 ports across the system, or you need some kind of funky crossbar switch that sits in the middle of all of the sockets – allowing each socket to hit the switch and link directly to another socket in the cluster. Oracle says that the Sparc T5 is glueless, meaning it doesn't need any external chipsets to link the chips together into a single system image.
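The port counts above follow from wiring every socket directly to every other socket. Here is a minimal sketch of that arithmetic, assuming a plain point-to-point full mesh with no crossbar; the function name `one_hop_ports` is made up for illustration and says nothing about how Oracle actually wires the Sparc T5.

```python
def one_hop_ports(sockets: int) -> tuple[int, int, int]:
    """Return (ports per chip, total ports, links) for a one-hop full mesh.

    Assumes every socket has a dedicated port to every other socket,
    and that each physical link joins exactly two ports.
    """
    ports_per_chip = sockets - 1             # one port for each peer socket
    total_ports = sockets * ports_per_chip   # ports summed over all chips
    links = total_ports // 2                 # each link uses a port at both ends
    return ports_per_chip, total_ports, links


for n in (4, 8):
    per_chip, total, links = one_hop_ports(n)
    print(f"{n} sockets: {per_chip} ports per chip, {total} ports total, {links} links")
```

The total grows roughly with the square of the socket count, which is why a crossbar in the middle starts to look attractive at eight sockets.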

Sometime partner and sometime rival Fujitsu appears to have at last committed itself to putting its future 16-core Sparc64-X processors into Unix machines, and will be showing this chip off at Hot Chips. Fujitsu is already shipping a 16-core Sparc64-IXfx processor in its PrimeHPC supercomputers, which are based on the same design as the K supercomputer, which uses eight-core Sparc64-VIIIfx processors.

Neither Oracle nor Fujitsu used a variant of the eight-core Sparc64-VIIIs in the Sparc Enterprise M machines, which seems a bit odd at first; they merely continued to sell the quad-core Sparc64-VII+ processors (called the M3 chips by Oracle). It seems likely that this choice was made because to get to 8 and 16 cores, Fujitsu had to radically gear down the clock speeds – from 3GHz with the Sparc64-VII+ down to 2GHz with the Sparc64-VIIIfx and down to 1.85GHz with the Sparc64-IXfx.

Drilling into Oracle's performance boasts

In any event, Oracle's Sparc processor roadmap from September 2011 said that the Sparc M4 chips were in test and would be used in machines spanning up to 64 sockets, just like the current Enterprise M9000 machines. Oracle is promising that the Sparc M4 processors will have 1.5 times the single-thread performance and six times the throughput performance of the M3s they replace, which has a lot of people scratching their heads.

Oracle's Solaris software gurus told El Reg last November that the M4s were based on a T series core, not a Fujitsu Sparc64 core, and would therefore be able to support Oracle's Logical Domain (LDom) hypervisor, which the Fujitsu Sparc64 chips cannot. This piece of data, if it turns out to be true, complicates the situation a little. It may simply not be true.

Just on raw oomph alone, if the Sparc64-X chip from Fujitsu were ramped up to 4.5GHz, which would certainly be possible with a 32 nanometer process if that is what Fujitsu is using, and kept the two threads per core used in the Sparc64-VII+ chips, it would fit the comparison between the M3 and M4 chips perfectly.

For the M3, you get a total of eight threads running at 3GHz, or a combined 24GHz of aggregate clocks, and for the M4 you get a total of 32 threads running at 4.5GHz, for a combined 144GHz of aggregate clocks. It could be, of course, that Oracle plans to take an 8-core or 16-core Sparc T5 chip, wrap some big SMP electronics and L3 cache around it, and crank the clocks up to the same range to get the same effect. This would effectively cut Fujitsu out of the high-end Oracle server business and make the two competitors, not partners.
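The back-of-the-envelope sums above can be checked with a few lines of code. This is only a sketch of the aggregate-clock arithmetic: the `Chip` class and its field names are invented for illustration, the 4.5GHz clock and two threads per core for the Sparc64-X are the speculation made above and not confirmed specs, and threads times clock speed is a crude stand-in for real throughput.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    cores: int
    threads_per_core: int
    clock_ghz: float

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core

    @property
    def aggregate_ghz(self) -> float:
        # Crude throughput proxy: every thread counted at full clock speed.
        return self.threads * self.clock_ghz


# Quad-core Sparc64-VII+ (Oracle's M3) with two threads per core at 3GHz.
m3 = Chip("Sparc64-VII+ (M3)", cores=4, threads_per_core=2, clock_ghz=3.0)
# Hypothetical: 16-core Sparc64-X, two threads per core, guessed 4.5GHz clock.
m4_guess = Chip("Sparc64-X (guess)", cores=16, threads_per_core=2, clock_ghz=4.5)

single_thread_ratio = m4_guess.clock_ghz / m3.clock_ghz
throughput_ratio = m4_guess.aggregate_ghz / m3.aggregate_ghz

print(f"{m3.name}: {m3.threads} threads, {m3.aggregate_ghz:.0f}GHz aggregate")
print(f"{m4_guess.name}: {m4_guess.threads} threads, {m4_guess.aggregate_ghz:.0f}GHz aggregate")
print(f"single-thread: {single_thread_ratio:.1f}x, throughput: {throughput_ratio:.1f}x")
```

Under those assumptions the ratios come out at 1.5 times and six times, matching the figures on Oracle's roadmap, though clock speed alone ignores cache, memory bandwidth, and pipeline changes.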

You'll notice that Oracle is not showing off a Sparc M4 processor at Hot Chips. El Reg would guess that the Solaris guys got it wrong on what kind of core is in the M4 chip but have it right in that the Sparc M4, now called the Sparc64-X, does have the features on chip to support LDom partitioning, and that adding this partitioning capability is one of the things that has taken so long to develop and made the Sparc64-VII+ look so long in the tooth.

Also on the server front, Intel will be showing off its "Sandy Bridge-EP" Xeon E5-2600 processor for two-socket boxes, which was announced back in March, and Applied Micro Circuits will be showing off its 64-bit ARMv8-based X-Gene server processor, which the company divulged last fall was in the works and which could become a weapon in the upcoming x86-ARM server wars.

AMD will be talking about its "Jaguar" microprocessor, and Intel will be talking about its "Ivy Bridge" Core v3 and "Medfield" Atom Z2460 processors, which are used in PCs or mobile devices. It is interesting that AMD is not talking about the future server chips it has cooking based on its "Piledriver" cores and which will be compatible with the C32 and G34 sockets that its Opteron 4200 and 6200 chips plug into.

Techies from the University of Michigan will be showing off the Swizzle Switch, "a self-arbitrating high radix crossbar" for network-on-chip devices (PDF), which they think is better than mesh or flattened butterfly (FBFly) interconnects. They will also be trotting out the experimental Centip3De 3D stacked ARM chip complex that the university also presented at ISSCC earlier this year.

AMD will also be trotting out its "Trinity" Fusion APUs and HD7970 graphics processors, and Intel will be providing some more specifics about its x64-based "Knights Corner" Many Integrated Core (MIC) coprocessors for supercomputing applications. Intel will also be gussying up its "Claremont" near-threshold-voltage 32-bit concept chip, which was also making the rounds at the International Solid-State Circuits Conference earlier this year. ®