Intel sneak peeks Westmere EP server silicon
Four and six cores, Turbo Boost, AES
With IBM and Intel gearing up the high-end Power7 and "Tukwila" Itanium launches for next Monday, Intel's preview of its "Westmere-EP" processors for servers and workstations and a slew of research projects was always going to get lost in the shuffle.
The preview is set for Monday as the International Solid State Circuits Conference kicks off in San Francisco. But in an apparent effort to prevent the preview from being lost amidst all the talk of the Power7, Intel gave the press a preview pre-brief this morning, showing off the papers it plans to present at the conference.
Chip makers will be chip makers.
Separate from the conference, Intel is also launching the long-overdue quad-core Tukwila Itanium. But that is a separate story, which you can read all about here.
The Westmere-EP chips are kickers to last year's quad-core Nehalem-EP Xeon 5500 processors, which were launched at the end of March 2009 and which have very much helped keep the server business staggering along, somewhat bewildered but not falling completely down, throughout last year. With a shrink from the 45 nanometer tech used to make the Xeon 5500s to the second generation of Intel's high-k metal gate wafer baking processes at 32 nanometers, it wasn't hard to guess that Intel would be adding some more cores to the Westmere-EPs or cranking their clocks.
As it turns out, and as you no doubt figured out because there is a speed limit on clock speeds enforced by the Thermal Police these days, Intel is going to be adding more cores and more on-chip cache to the Westmere-EP chips.
Specifically, the Westmere-EP is using the extra transistor budget that the slide from 45 to 32 nanometer processes allows to add two more cores to the processor and to boost the on-chip L3 cache by 50 per cent to 12 MB per chip. Nasser Kurd, senior principal engineer at Intel's Architecture Group, confirmed to El Reg that Intel will also deliver four-core variants of these chips. Kurd also confirmed that the Westmere-EP chips will support Turbo Boost, which allows for the clock speed of the processor cores to be jacked up a bit as other elements of the chip are quiesced.
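A quick back-of-envelope check of those numbers, using only the figures quoted above (a 50 per cent boost to 12 MB, and a move from four cores to six), suggests the cache is growing in step with the core count rather than ahead of it. The variable names below are made up for the sketch:

```python
# Figures taken from the article; names are illustrative only.
NEW_L3_MB = 12          # L3 cache on the six-core Westmere-EP
L3_INCREASE = 0.50      # "boost the on-chip L3 cache by 50 per cent"
OLD_CORES, NEW_CORES = 4, 6

# Work backwards to the L3 size implied for the quad-core predecessor.
old_l3_mb = NEW_L3_MB / (1 + L3_INCREASE)

old_per_core = old_l3_mb / OLD_CORES
new_per_core = NEW_L3_MB / NEW_CORES

print(f"Implied previous L3: {old_l3_mb:.0f} MB")
print(f"L3 per core: {old_per_core:.1f} MB before, {new_per_core:.1f} MB after")
```

On these figures, each core ends up with the same 2 MB slice of L3 it had before.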
Generally speaking, the Westmere-EPs will have the same clock speed range and the same thermal envelopes as the existing Xeon 5500s, but Intel has not yet announced specific SKUs and won't until the middle of March or so when these new chips are formally launched. The Westmere-EPs will plug into the same sockets and use the same chipsets and DDR3 main memory as the Xeon 5500s and, like them, will have three memory channels per socket.
The 32 nanometer, six core Westmere-EP chip
The six-core Westmere-EP chip has 1.17 billion transistors and is 240 square millimeters in size. As you can see from the pretty picture above, it is implemented in two halves of three cores each. The core regions have their own clock speed and power supply, and with the tweaks to the Westmere design the L3 cache and memory controller regions - what Intel calls the "uncore" areas - get their own, separate power gating.
Uncore power gating
With the Nehalem family of chips, Intel was able to power gate the transistors in each core to shut that core down when it wasn't being used. The core state was saved in the on-chip cache and the uncore region kept running at full power. With the Westmere family, there is still power gating for each core, but now the uncore is also gated.
The two-core Westmere mobile chips also have a dedicated and power-sipping static RAM on the chip that saves the state of the cores so the on-chip caches can be powered down when not in use. (Why the server variants of Westmere do not also have this SRAM state cache is unclear, but apparently they do not.)
The Westmere-EP chips implement Intel's HyperThreading variant of simultaneous multithreading, which gives each core two virtual threads to present to the operating system or hypervisor running atop the chip. The Westmere chips also have new cryptographic instructions that implement the Advanced Encryption Standard (AES) algorithm for encrypting and decrypting data.
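As a rough illustration of how these two features surface to software: on Linux, processors with the AES instructions report an `aes` flag in `/proc/cpuinfo`, and HyperThreading doubles the number of logical processors the operating system sees. The sketch below is not Intel's tooling; the function names and the abbreviated sample text are hypothetical, and the two-socket, six-core configuration is just an example.

```python
def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def logical_threads(sockets: int, cores_per_socket: int,
                    threads_per_core: int = 2) -> int:
    """Logical processors seen by the OS with two threads per core."""
    return sockets * cores_per_socket * threads_per_core


# Hypothetical, heavily abbreviated cpuinfo excerpt for illustration.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu vme sse4_2 popcnt aes lahf_lm\n"

print(has_aes_flag(SAMPLE_CPUINFO))   # True for the sample text above
print(logical_threads(2, 6))          # 24 for a two-socket, six-core box
```

On a real Linux machine, the same check could be run against the contents of `/proc/cpuinfo` instead of the sample string.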
Another new twist with the Westmere-EPs is that the memory controllers embedded on the chips can support low-voltage DDR3 main memory, which runs at 1.35 volts, as well as standard DDR3 memory, which runs at 1.5 volts. The net effect of this change is that memory DIMMs run about 20 per cent cooler when using the low-voltage parts, without sacrificing performance.
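That figure is in the same ballpark as a common rule of thumb, which says dynamic power in CMOS scales roughly with the square of the supply voltage at a fixed frequency. The sketch below applies that rule to the two voltages quoted; it is a sanity check under that assumption, not Intel's own calculation, and real DIMM power includes components that do not scale this way.

```python
def relative_dynamic_power(v_new: float, v_old: float) -> float:
    """Approximate power ratio assuming power scales with voltage squared."""
    return (v_new / v_old) ** 2


ratio = relative_dynamic_power(1.35, 1.5)   # low-voltage vs standard DDR3
saving_pct = (1 - ratio) * 100

print(f"Relative power: {ratio:.2f}")       # 0.81
print(f"Estimated saving: {saving_pct:.0f} per cent")  # 19 per cent
```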
The Westmere-EP chips used in servers will very likely be called the Xeon 5600s when they start shipping.
Another system-related paper that Intel will be presenting next week at ISSCC, and one that might have immediate and practical benefits for high-throughput systems, describes a new kind of chip-to-chip interconnect that looks like it beats the pants off of QuickPath Interconnect, the processor and memory linkage scheme that Intel debuted with the Nehalem chips last year. This experimental interconnect, which was not given a name, is about ten times as power-efficient at moving data from chip to chip as the current scheme.
According to Randy Mooney, an Intel Fellow and director of I/O research at Intel Labs, with a traditional interconnect (like QPI) a signal has to go from a chip, down through the package, out over the motherboard and back up through the socket and package to reach the cores on the other side of the mobo.
Using QPI, moving a terabyte of data between chips in different sockets might take 150 watts of juice, but the direct link - which is bolted on top of the chip package and links the chips more or less directly to each other - was able to move a terabyte of data between the chips while burning only 11 watts.
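Taking the two wattage figures quoted above at face value, the implied gain works out a little better than a round ten times. This is simple arithmetic on the article's numbers, with made-up variable names, and says nothing about how Intel measured them:

```python
# Power figures as quoted in the article for the same data-movement job.
QPI_WATTS = 150.0
DIRECT_LINK_WATTS = 11.0

efficiency_gain = QPI_WATTS / DIRECT_LINK_WATTS
reduction_pct = (QPI_WATTS - DIRECT_LINK_WATTS) / QPI_WATTS * 100

print(f"Efficiency gain: {efficiency_gain:.1f}x")      # 13.6x
print(f"Power reduction: {reduction_pct:.0f} per cent")  # 93 per cent
```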
Perhaps more significantly, when this interconnect drops into sleep mode, it only burns 7 per cent of the juice it needs when it is running, and it can wake up from the sleep state 1,000 times faster than QPI does. ®