AMD draws x64 battle lines with 'Magny-Cours'
Opteron 6100s lock and load
While all of that is interesting, the bigger and perhaps more important change in the move from the Opteron 2000 and 8000 series to the Opteron 4000 and 6000 series is the jump from three to four HyperTransport 3.0 (HT3) links in the point-to-point architecture that defines the Opteron. With the Direct Connect 1.0 architecture that defined prior Opteron machines, processors had integrated memory controllers and implemented a NUMA access method to reach into each other's memory when needed.
The processors had two memory channels per socket and could have eight DDR1 or DDR2 memory DIMMs per socket. The NUMA architecture implemented in the point-to-point interconnect had the processors linked to each other in a square, so a processor could talk directly to its immediate neighbors in a four-socket machine, but to reach the fourth processor, on the opposite corner of the square, it had to route through one of those neighbors. This extra hop added latency and hurt performance.
With the Direct Connect Architecture 2.0 implemented for the Opteron 6100 machines, the processors now have a crossbar switch and all four sockets in the box are directly linked to each other, eliminating that extra hop. Each socket now has four memory channels (double that of earlier machines) and can support a dozen DDR3 DIMMs (up 50 per cent).
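To make the extra hop concrete, here is a minimal sketch that counts worst-case hops between sockets in the two layouts described above. It treats each socket as a single node and each link as a plain edge, which is a simplification for illustration and not a model of the real HT3 wiring; the names `max_hops`, `SQUARE`, and `FULL_MESH` are made up for this example.

```python
from collections import deque

def max_hops(links, sockets=4):
    """Worst-case hop count between any two sockets, found by breadth-first search."""
    adjacency = {s: set() for s in range(sockets)}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    worst = 0
    for start in adjacency:
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        worst = max(worst, max(distance.values()))
    return worst

# Older layout as described: four sockets linked in a square.
SQUARE = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Newer layout as described: every socket linked to every other socket.
FULL_MESH = SQUARE + [(0, 2), (1, 3)]

print(max_hops(SQUARE))     # 2
print(max_hops(FULL_MESH))  # 1
```

The two diagonal links are all it takes to bring the worst case down from two hops to one.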
This architecture, says Fruehe, is designed to scale up to 16 cores per processor, which is what is necessary to support the future 16-core "Bulldozer" chips in 2011. (See here for more on Bulldozer, which will plug into the G34 sockets used for the Opteron 6100s and the C32 sockets used for the Opteron 4100s.)
According to Fruehe, the integrated DDR3 main memory controller on the Opteron 6100s can support up to 1.33 GHz DDR3 DIMMs, delivering up to 42.7 GB/sec of memory bandwidth per G34 socket. That's 2.5 times the memory bandwidth of the Istanbul Opterons. If you want to use low-voltage DDR3 memory modules, then they top out at 1.07 GHz, which means you get 20 per cent less memory bandwidth, but save somewhere around 10 per cent on the power the DIMMs draw.
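The 42.7 GB/sec figure lines up with simple peak-bandwidth arithmetic: channels times transfer rate times bytes per transfer. The sketch below assumes the standard 64-bit (8-byte) DDR3 channel width and reads the quoted 1.33 GHz and 1.07 GHz as 1,333 and 1,066 million transfers per second; the function name is made up for this example, and the results are theoretical peaks rather than measured throughput.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path per DDR3 channel

def peak_bandwidth_gbs(channels, mega_transfers_per_sec):
    """Theoretical peak memory bandwidth in GB/sec (decimal gigabytes)."""
    return channels * mega_transfers_per_sec * BYTES_PER_TRANSFER / 1000

standard = peak_bandwidth_gbs(4, 1333)     # four channels of 1.33 GHz DDR3
low_voltage = peak_bandwidth_gbs(4, 1066)  # four channels of 1.07 GHz DDR3

print(f"{standard:.1f} GB/sec")     # 42.7 GB/sec
print(f"{low_voltage:.1f} GB/sec")  # 34.1 GB/sec
print(f"{1 - low_voltage / standard:.0%} less")  # 20% less
```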
AMD is supporting 8 GB DIMMs now with the Opteron 6100 systems, which means a two-socket box can support 24 memory slots (192 GB) and a four-socket box can go to 48 slots (384 GB). Don't get too excited about 16 GB DIMMs, though. Even when prices for these come down out of the stratosphere, perhaps late this year or early next, the memory controller on the Opteron 6100 is limited such that four twelve-core processors can only address up to 512 GB. Addressing more memory than this per system will require a move to the Bulldozer cores.
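The capacity numbers follow from a dozen DIMM slots per socket, and the same arithmetic shows why 16 GB DIMMs would overshoot in a four-socket box. This is a rough sketch: the 512 GB ceiling is taken straight from the figure quoted above for a four-processor system, and the function and constant names are made up for this example.

```python
DIMM_SLOTS_PER_SOCKET = 12
FOUR_SOCKET_LIMIT_GB = 512  # addressing ceiling quoted for four processors

def capacity_gb(sockets, dimm_gb, limit_gb=None):
    """Installed memory in GB, capped at limit_gb when a ceiling applies."""
    installed = sockets * DIMM_SLOTS_PER_SOCKET * dimm_gb
    return installed if limit_gb is None else min(installed, limit_gb)

print(capacity_gb(2, 8))   # 192
print(capacity_gb(4, 8))   # 384
print(capacity_gb(4, 16))  # 768 installed if all 48 slots are filled
print(capacity_gb(4, 16, FOUR_SOCKET_LIMIT_GB))  # 512 actually addressable
```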
Generally speaking, bin for bin, the twelve-core Magny-Cours chips provide about 88 per cent more integer performance and 119 per cent more floating point performance than the six-core "Istanbul" Opteron 2400 and 8400 chips they replace. These numbers are based on the SPECint_rate2006 and SPECfp_rate2006 benchmark tests, which pitted a six-core Opteron 2435 running at 2.6 GHz against a twelve-core Opteron 6174 running at 2.2 GHz.
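One crude way to read those gains is to divide them by the increase in raw core-clock capacity (cores times GHz). The sketch below does just that with the figures quoted above. It is a back-of-the-envelope normalization, not a SPEC methodology, and it ignores everything else that changed between the two chips, such as memory bandwidth and cache; the names are made up for this example.

```python
def core_ghz(cores, clock_ghz):
    """Aggregate core-clock capacity: cores multiplied by clock speed."""
    return cores * clock_ghz

istanbul = core_ghz(6, 2.6)       # Opteron 2435
magny_cours = core_ghz(12, 2.2)   # Opteron 6174
capacity_ratio = magny_cours / istanbul

int_gain = 1.88  # 88 per cent more integer throughput
fp_gain = 2.19   # 119 per cent more floating point throughput

print(f"{capacity_ratio:.2f}x core-GHz")                 # 1.69x core-GHz
print(f"{int_gain / capacity_ratio:.2f}x per core-GHz")  # 1.11x per core-GHz
print(f"{fp_gain / capacity_ratio:.2f}x per core-GHz")   # 1.29x per core-GHz
```

On this rough measure, both results come out ahead of what doubling the cores at a lower clock would deliver on its own, with floating point further ahead than integer.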
Bootnote: The new Opteron 6100s do not have specific instructions for accelerating AES encryption and decryption, as AMD originally told El Reg. The chips, like other Opterons, can run AES software, but that is not the same thing as having specific instructions to accelerate it. The Opterons will get a feature similar to the AES-NI instructions that came out with the "Westmere-EP" Xeon 5600s two weeks ago when the "Bulldozer" cores ship next year. ®