Power7+ chips debut in fat IBM midrange systems

Near the top at first, trickling down to smaller boxes next year

Die shot of the IBM Power7+ processor

IBM has taken the wraps off the first of its Power Systems machinery to make use of its cache-heavy Power7+ processors, and as El Reg anticipated from the hints in the announcement invitation put out two weeks ago, Big Blue is starting near the top of the line as it upgrades systems that run AIX, IBM i (formerly known as OS/400), and Linux.

As has been the case for the past several generations, the rollout for the Power7+ chips will be a gradual one. "The rest of the products will get the Power7+ next year, with the exception of the Power 795," Steve Sibley, director of worldwide product management for IBM's Power Systems division, tells El Reg. "Just like with the Power 595, we already built the fastest processor and I/O into that machine."

It's tough to argue with the guy in charge of the product line – but it's not impossible. Even if IBM can't crank up the clock speed of the 3.7GHz and 4GHz processors used in the high-end, 32-socket Power 795 machine, the ability to have processors with 2.5 times the L3 cache per core (at 10MB), plus better sleep states and Turbo Core modes, would no doubt be of use to more than a few Power 795 shops.

If enough customers ask for such a thing, you can bet IBM will sell 'em. Just because Big Blue didn't do it before doesn't mean it can't do it now.

The eight-core Power7+ processor was previewed at the end of August at the Hot Chips 24 chippery fest, and we gave you a peek into its expected performance in the wake of the tech presentation, along with some thoughts on the overclocking potential for the Power7+ chip.

As already divulged, the Power7+ chip will come in two variants: one that puts a single chip running at a higher clock speed into a socket, and another that crams two Power7+ chips into a single socket to double up the cores, threads, and L3 cache in that socket, with what we assume will be a pretty substantial hit to clock speed. IBM calls a regular chip a single-chip module, or SCM, and the double-stuffer a dual-chip module, or DCM.

Die shot of the Power7+ chip from IBM

The Power 770+ and Power 780+ machines announced on Wednesday are based on SCMs, just like the current Power 770 and 780 machines, known as Power7' (Power7 prime) in the internal IBM lingo because these chips were announced in conjunction with a doubling of memory capacity and a shift to PCI-Express 2.0 peripheral slots in these machines in October 2011.

The Power7+ chip has a lot of new features to help accelerate specific functions inside of Power System boxes, including on-chip memory compression, encryption, and hashing algorithms, as well as a random-number generator that cannot be predicted because it is based on random electronic effects on the chip.

The Power7+ chip is implemented in a 32-nanometer process. Specifically, IBM's wafer bakery in East Fishkill, New York, uses a copper/silicon-on-insulator process with high-k metal gates to etch the Power7+ chips, which have 2.1 billion transistors on the die.

Like the Power7 chips before it and the System z mainframe processors, the z11 and z12, the Power7+ implements shared L3 cache using embedded DRAM (eDRAM) instead of the faster static RAM (SRAM). It takes fewer transistors to make a memory cell for eDRAM, so even if it is slower than SRAM, you can jam a lot more cache right next to the processors and thereby speed up the performance of the overall processor by more than you might expect.

The shrink from 45 to 32 nanometers allows Big Blue to put 80MB of L3 cache on the die, plus a slew of accelerators. IBM says that making the Power7+ chip with SRAM for the L3 would have pushed the transistor count up to 5.4 billion, and the resulting chip would also have been larger and therefore very likely to get lower yields on a new process.
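
That 5.4 billion figure can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is ours, not IBM's: it assumes a conventional six-transistor SRAM cell and a one-transistor (plus capacitor) eDRAM cell, counts only the data bits in the 80MB of L3, and ignores tags, ECC, and peripheral circuitry. The function and constant names are made up for illustration.

```python
# Rough estimate of a hypothetical SRAM-based Power7+ transistor count.
# Assumptions (ours, not IBM's): 6 transistors per SRAM bit, 1 transistor
# per eDRAM bit, and only the L3 data array changes between the two designs.

L3_MEGABYTES = 80                        # L3 cache on the die, per the article
POWER7_PLUS_TRANSISTORS = 2_100_000_000  # actual eDRAM-based chip, per the article
SRAM_TRANSISTORS_PER_BIT = 6             # assumed classic 6T SRAM cell
EDRAM_TRANSISTORS_PER_BIT = 1            # assumed 1T1C eDRAM cell


def l3_bits(megabytes: int) -> int:
    """Number of data bits in a cache of the given size (1MB = 2**20 bytes)."""
    return megabytes * 2**20 * 8


def sram_variant_transistors() -> int:
    """Transistor count if the L3 data array used SRAM instead of eDRAM."""
    bits = l3_bits(L3_MEGABYTES)
    extra_per_bit = SRAM_TRANSISTORS_PER_BIT - EDRAM_TRANSISTORS_PER_BIT
    return POWER7_PLUS_TRANSISTORS + bits * extra_per_bit


if __name__ == "__main__":
    print(f"{sram_variant_transistors() / 1e9:.2f} billion transistors")
```

Under those assumptions the SRAM variant comes out at roughly 5.46 billion transistors, which is in the same ballpark as the number IBM quotes.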

In general, Sibley says that the Power7+ processors will deliver about 20 to 30 per cent more performance in the machines in which they will soon be shipping. But considering all of the accelerators, the expanded cache, and the memory compression for AIX (but not for Linux or IBM i) on the chips, customers would be wise to get some capacity planning help from IBM to figure out how their own applications might benefit as they jump from Power5, Power6, or Power7 chips to the new Power7+ chips in Power 770+ or Power 780+ systems. This is particularly true if you are using software-based encryption on any operating system or memory compression for AIX workloads.

For example, if you're using the AIX memory compression that debuted with AIX 7.1 running on Power7 chips, you could get as much as 2X the usable main memory. But by using the two on-chip accelerators that IBM put on the Power7+ chip to run the proprietary compression algorithm for AIX memory compression, you can get up to 2.25X the usable main memory without the overhead of running the compression algorithms on the Power7+ cores. You get the double benefit of more addressable main memory (4TB can look like 9TB) as well as lower CPU core overhead, allowing the central processors to do more work.
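
The memory math is simple enough to sketch. The snippet below is an illustration only: the expansion factors are the "up to" figures quoted above, real ratios will depend on how compressible a given workload is, and the function name is hypothetical rather than anything from AIX.

```python
# Effective memory under AIX memory compression, using the best-case
# expansion factors quoted in the article. Names here are illustrative.

POWER7_EXPANSION = 2.0        # software compression on Power7 cores (up to)
POWER7_PLUS_EXPANSION = 2.25  # on-chip accelerators on Power7+ (up to)


def effective_memory_tb(physical_tb: float, expansion_factor: float) -> float:
    """Memory the operating system can appear to use, in terabytes."""
    return physical_tb * expansion_factor


if __name__ == "__main__":
    physical = 4.0  # terabytes of real memory
    print(effective_memory_tb(physical, POWER7_EXPANSION))
    print(effective_memory_tb(physical, POWER7_PLUS_EXPANSION))
```

With 4TB of physical memory, the best case works out to 8TB on Power7 and 9TB on Power7+, and on the newer chip the compression work no longer eats into core cycles.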

The accelerators, by the way, are in the uncore area of the Power7+ chips and shared by the cores.
