Appro: HPC's all about the GPUs
Magny-Cours as AMD 'comeback kid'
Big Blue competition
According to roadmaps that El Reg has seen, IBM is supposed to be cooking up a QS2Z blade, which is supposed to have two Cell chips that in turn have two Power cores and a whopping 32 vector processors each, using next-generation memory and interconnection technology. This QS2Z blade would sport 2 teraflops per blade at single precision and 1 teraflops per blade at double precision.
This blade could, in theory, compete with the Fermi GPUs. But probably not at anything close to the same price. Which is probably why Oak Ridge went with Fermi GPUs. (It is not clear when or if this QS2Z blade from IBM will come to market, but it was supposed to arrive in the first half of 2010.)
Even with its substantial lead in GPU co-processing, the physical form factor of Nvidia's GPUs is still going to present HPC vendors with one challenge: integrating the GPUs into their servers. It's not like server motherboards have multiple GPU sockets that allow them to be snapped right into the system board. They are still linking in through PCI-Express ports, and they are still hot and cannot be densely packed into clusters.
The other thing that Appro will be previewing at SC09 in Portland this week is its future Opteron server lineup, and Lee is calling the "Magny-Cours" twelve-core Opterons, due in the first quarter of 2010, AMD's "comeback play." Appro will do a complete product refresh of its Opteron super line with the Magny-Cours chips, now called the Opteron 6100s, and their G34 socket. "For many HPC codes, customers will still need a general purpose CPU," explains Lee. "It has been a tough year for AMD, but with the G34 processors, we think it will start to come back."
AMD started touting the impressive memory bandwidth of the Opteron 6100-G34 systems last week, showing that a four-socket box will be able to deliver around 100 GB/sec of memory bandwidth on the Stream benchmark test.
At the moment, Appro doesn't have much use for the Opteron 4100s and their C32 sockets, which are a variant of the current Rev F 1,207-pin sockets, as AMD divulged last week. This low-end, low-power Opteron 4100 chip, which comes with four-core and six-core processors, could be used as a host for multiple Fermi GPUs, Lee concedes. "If you are buying a GPU system, a low-cost host makes sense." But for real HPC work, Lee says the Opteron 4100s don't have enough cores or memory bandwidth to be practical. ®