Better late than never: Monster 15-core Xeon chips let loose by Intel
New mission-critical CPUs are mission-critical to Chipzilla's critical money-making mission
Socket two me. Or four. Or eight. Or more.
In addition to stuffing the new Xeon E7 v2 into eight-socket, four-socket and two-socket servers, OEMs can take the E7 v2 series beyond eight sockets by adding their own node controllers – Intel's QPI interconnects can take OEMs to eight sockets gluelessly, but beyond that, OEMs are on their own.
The Xeon E7 v2 series also offers a dual-mode memory scheme enabled by the new JordanCreek memory extension buffer chips. These nifties can support two next-generation scalable memory interconnect channels (SMI Gen2) that handle three DIMMs apiece – up from two in the previous generation – thus supporting up to six DIMMs per buffer.
Four JordanCreeks can be tasked to each socket, so we're talking eight DDR3 channels and a hefty 24 DIMMs per socket. To handle all that memory, those channels are a broad eight bytes wide and run at up to 2.667 giga-transfers per second (GT/s), Esmer said.
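The topology arithmetic checks out, as a quick sketch shows. The per-channel and per-socket aggregate figures below are our own back-of-envelope extrapolation from the numbers quoted above – assuming every channel runs flat-out – not figures Intel provided:

```python
# Back-of-envelope check of the Xeon E7 v2 memory topology quoted above.
# Counts are from the article; the aggregates are our extrapolation,
# assuming all channels run at the peak rate simultaneously.

buffers_per_socket = 4    # JordanCreek memory extension buffers
channels_per_buffer = 2   # SMI Gen2 channels per buffer
dimms_per_channel = 3     # up from two in the previous generation

channels_per_socket = buffers_per_socket * channels_per_buffer
dimms_per_socket = channels_per_socket * dimms_per_channel
print(channels_per_socket, dimms_per_socket)  # 8 channels, 24 DIMMs

channel_width_bytes = 8   # "a broad eight bytes wide"
peak_gt_per_s = 2.667     # giga-transfers per second, at full tilt
per_channel_gb_s = channel_width_bytes * peak_gt_per_s
print(round(per_channel_gb_s, 1))                        # ~21.3 GB/s per channel
print(round(per_channel_gb_s * channels_per_socket, 1))  # ~170.7 GB/s per socket
```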
But that's not the nifty part. What's très cool is the dual-mode nature of JordanCreek's SMI Gen2. It can operate in what Intel calls "Lockstep Mode", running in a 1:1 ratio with the memory bus, which can crank along at up to 1600MHz. In the new "Performance Mode", however – also known as "Independent Channel Mode" or "Two-to-One Mode" – it'll double up 1333MHz DDR3 to the full 2667MHz we mentioned earlier.
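The two modes boil down to a gear ratio between the DDR3 bus and the SMI link. A minimal sketch of the rates described above (the mode names are Intel's; the helper function is ours):

```python
# Sketch of the SMI Gen2 gearing described above: Lockstep Mode runs the
# link 1:1 with the DDR3 bus, Performance Mode runs it 2:1.

def smi_rate_mhz(ddr3_mhz: int, mode: str) -> int:
    """Effective SMI Gen2 rate for a given DDR3 speed and mode."""
    ratio = {"lockstep": 1, "performance": 2}[mode]
    return ddr3_mhz * ratio

print(smi_rate_mhz(1600, "lockstep"))     # 1600 - matches the memory bus
print(smi_rate_mhz(1333, "performance"))  # 2666 - the ~2667MHz quoted above
```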
As you might assume, Lockstep Mode is there for when each and every bit, nibble, and byte is precious – think financial transactions – while Performance Mode is there when balls-out performance is the order of the day, and it's the default setting in the BIOS that Intel is providing.
One more advantage of Performance Mode: the 1333MHz DIMMs used in this mode are often cheaper than the 1600MHz DIMMs used in Lockstep Mode – although, as Esmer pointed out, "People who are buying this probably are not shy about populating them" with pricey DIMMs.
So how effective are these SMI Gen2 memory improvements? Esmer – an engineering type not prone to marketing pap – summarized the enhancements thusly: "The bandwidth increase is massive."
When is three more than four?
As we mentioned above, the Xeon E7 v2 processors connect to other sockets over QPI – Intel QuickPath Interconnect. Its Westmere predecessor had four such links, but the new Ivy Bridge parts have but three. "But three is sufficient," Esmer said, "because we have integrated IO."
Socket-to-socket transfers over these three QPI links max out at 8GT/s – a 25 per cent bump from Westmere's 6.4GT/s – but that speed can be dialed down for compatibility, Esmer said.
That integrated IO to which Esmer referred comprises 32 PCIe 3.0 (Gen3) lanes and four DMI Gen2 connections per socket. Those 32 PCIe lanes are a step down from the 40 lanes in Intel's E5 v2 line, and she was candid when asked about that drop. "I don't know the reason why we chose 32," she said, noting that some customers, when informed of the drop, had asked for more.
"I don't think that we would have changed if we knew what we know now. There's no technical reason for not having it, basically. We could have – it's just we didn't."
That niggle aside – after all, 32 lanes per socket is plenty for most applications – there's a lot to like about the integrated IO. Latency improvements, for example: in the Westmere part, idle latency was 395 nanoseconds; that's been shrunk to 290ns on the new chips, an improvement of roughly 27 per cent.
Overall, Esmer said, integrating the IO on the Xeon E7 v2's die improves aggregate four-socket system IO by a whopping 3.7 to 3.9 times over a Westmere-based four-socket system with dual IO hubs. The Westmere system in her example had a total of 64 PCIe Gen2 lanes and the Ivy Bridge Xeon E7 v2 had 128 Gen3, so any bandwidth increase, while impressive and welcome, is not surprising – although the increase of nearly 4X is.
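That near-4X figure is about what the lane math predicts. As a sketch – using the standard PCIe per-lane rates and encoding overheads, which are ours, not figures Esmer gave – Gen2 runs at 5GT/s with 8b/10b encoding while Gen3 runs at 8GT/s with leaner 128b/130b encoding, so doubling the lane count while jumping a generation compounds to just under 4X:

```python
# Rough check that 128 PCIe Gen3 lanes vs 64 Gen2 lanes lands near 4X.
# Per-lane rates and encodings are the standard PCIe spec figures.

gen2_gb_s_per_lane = 5.0 * (8 / 10)     # 5GT/s, 8b/10b encoding -> 4 Gb/s
gen3_gb_s_per_lane = 8.0 * (128 / 130)  # 8GT/s, 128b/130b -> ~7.88 Gb/s

westmere = 64 * gen2_gb_s_per_lane      # dual IO hubs, 64 Gen2 lanes
ivy_e7v2 = 128 * gen3_gb_s_per_lane     # integrated IO, 128 Gen3 lanes

print(round(ivy_e7v2 / westmere, 2))    # ~3.94 - in line with 3.7 to 3.9X
```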