IBM preps Power7+ server chip rev
To see where IBM might be taking the Power7+ and Power8 chips, it makes sense to look at how the chips and memory components have evolved over time. Here's how the last several generations of Power chips have stacked up:
Here's how the memory systems have stacked up:
I think that IBM wanted to do a 45 nanometer shrink with the Power6+ chips and double up the cores, and either it had trouble doing that or decided against it for economic reasons. And I think that IBM definitely wants to do a 32 nanometer shrink with the Power7+ chips. (The Power4+ and Power5+ chips both had process shrinks and are more representative of the goals IBM and Intel both share with their tick-tocking.) I anticipate that IBM will stay with the same basic Power7 chip design with the Power7+ shrink. That means chips with four, six, and eight cores activated and with the same L1 and L2 caches. The process shrink should allow Big Blue to crank the CPU clocks somewhere between 25 and 30 per cent.
At the same time, perhaps IBM will beef up the on-chip memory controller to support faster and denser DDR3 main memory. The Power7-based Power Systems machines top out at 8GB DDR3 memory sticks running at 1.07GHz, but the controller, in theory, supports 1.33GHz and 1.67GHz speeds and fatter 16GB sticks.
I also think that, given the very substantial performance improvement that the segmented embedded DRAM L3 cache on the Power7 chips delivers compared to the off-chip L3 caches of prior Power chips, Big Blue will boost the eDRAM capacity on the chip.
It seems unlikely that IBM would boost the core count with the Power7+ chips, but with the shrink, the company could do what it did in the Power5+ generation (and what Advanced Micro Devices is doing with its Opteron chips): take two shrunken processors, gear them down, and cram them into one processor socket. This would allow IBM to put a lot more cores and threads into the same Power Systems machines.
The Power7+ I/O enhancements could be tweaks to the memory controllers as well as to the integrated GX bus on the Power processors, which implements a double data rate (DDR) InfiniBand link from the chips out to remote I/O drawers. DDR InfiniBand runs at 20Gb/sec and is a bit long in the tooth compared to 40Gb/sec QDR InfiniBand, which has been out for years, and 56Gb/sec FDR InfiniBand, which will be coming to market later this year.
If I were IBM, I would push up to at least QDR, doubling up the I/O bandwidth coming in off peripherals into the chips. QDR will be necessary for sure if IBM wants to support PCI-Express 3.0 peripheral slots, which will have a total bandwidth of 32GB/sec bi-directionally for an x16 slot. InfiniBand is moving from 8b/10b encoding (where you send 10 bits for every 8 bits of data) to 64b/66b encoding (that's 66 bits for every 64 bits of data). The PCI-Express 3.0 bus is also ditching 8b/10b encoding, and is shifting to an even more efficient 128b/130b encoding scheme. The first PCI-Express 3.0 controller chips are expected by the end of this year, so if I had to guess, I would say IBM is moving the GX bus and related 12X remote I/O links to QDR or FDR InfiniBand and the peripheral bus to PCI-Express 3.0. (Presumably it will be called 24X or 33.6X I/O.)
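To put rough numbers on those encoding schemes, here is a back-of-the-envelope sketch in Python. It is not anything from IBM or the standards bodies, just arithmetic. The function names are made up for illustration, the InfiniBand figures are assumed to be raw 4X link signalling rates in gigabits per second, and the PCI-Express 3.0 figure assumes 8 gigatransfers per second per lane.

```python
from fractions import Fraction

# Line-code efficiency: payload bits divided by bits actually put on the wire.
ENCODINGS = {
    "8b/10b": Fraction(8, 10),
    "64b/66b": Fraction(64, 66),
    "128b/130b": Fraction(128, 130),
}


def payload_rate(raw_rate, encoding):
    """Usable data rate for a raw signalling rate, in the same units."""
    return float(raw_rate * ENCODINGS[encoding])


def pcie3_x16_gbytes_per_sec():
    """Per-direction payload of an x16 slot, in gigabytes per second.

    Assumption: 8 GT/s per lane, 16 lanes, 128b/130b encoding.
    """
    per_direction_gbits = payload_rate(8 * 16, "128b/130b")
    return per_direction_gbits / 8  # bits to bytes


# Assumed raw 4X link rates (Gb/sec) and the encoding each generation uses.
INFINIBAND = {
    "DDR": (20, "8b/10b"),
    "QDR": (40, "8b/10b"),
    "FDR": (56, "64b/66b"),
}

if __name__ == "__main__":
    for name, (raw, enc) in INFINIBAND.items():
        data = payload_rate(raw, enc)
        print(f"{name}: {raw} Gb/sec raw, {data:.1f} Gb/sec of data ({enc})")
    both_ways = 2 * pcie3_x16_gbytes_per_sec()
    print(f"PCI-Express 3.0 x16: {both_ways:.1f} GB/sec both directions")
```

Under those assumptions, 8b/10b gives up a fifth of the wire to overhead, while 64b/66b and 128b/130b give up about 3 per cent and 1.5 per cent respectively, which is why the x16 slot lands just shy of the 32GB/sec headline figure.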
There are rumors that IBM is working on some deep-sleep, low-power state modes for the Power7+ processors. I also think IBM should bring the MaxCore/TurboCore functionality across the entire Power7+ range, allowing customers to run all the cores at a rated speed (MaxCore) or run them somewhat faster if they turn half the cores off (TurboCore). It would be useful if the TurboCore mode could be initiated on the fly, not requiring a reboot of the system. Right now, this feature is only available on the Power 780 and 795 high-end machines.
Some of these things that I speculate about Power7+ could, of course, end up in Power8 chips. About the only thing that it is safe to guess on with Power8 is that it will use a 32 nanometer or 28 nanometer process and will come out sometime in 2013. ®