Power7 v Power6 - it's all about the cache

Double the thread count

IBM is launching the first of its Power7-based systems today, and the company thinks that the innovations inside the Power7 processor are going to give it a leg up on the competition in terms of capacity, throughput, and energy efficiency. But how do those Power7 processors stack up against the existing Power6 and Power6+ processors used in the Power Systems lineup?

IBM says that with the Power7 design, it has the right balance of cores, threads, and clock speeds - and the interplay between them - to tackle the kinds of workloads that like multithreading (such as Java applications and middleware) and those that like clock speeds and cache memory and that need to move data through the system quickly (such as transaction processing and analytics).

In a way, the Power7 design is a back-step from the Power6 and Power6+ chips that precede it. IBM has been shipping dual-core, 64-bit Power processors since the Power4 launched in October 2001. It stuck with dual-core chips as it transitioned from the 180 nanometer processes used in the Power4 chips through the 130 nanometer wafer baking for the Power4+ in November 2002 (adding copper and silicon-on-insulator technologies) and the Power5 in May 2004, on down to the 90 nanometer tech used in the Power5+, and through the 65 nanometer processes for the Power6 chip in July 2007.

Those same 65 nanometer processes were used to make the Power6+ chips that were put in the midrange Power Systems machines in October 2008 and in entry boxes and blade servers in April 2009. IBM did not, for whatever reason, do a process shrink with the Power6+ chips, but rather added a few instructions to the chip and decided to ride out 2009, waiting for Power7 chips and their radically different eight-core design to come to market.

With Intel's quad-core "Tukwila" Itanium running late and Sun Microsystems' 16-core "Rock" UltraSparc-RK chip barely on life support (it was canceled in June 2009), IBM could stand pat with the Power6+ chip and just double up the sockets on system boards to add more oomph to the boxes.

The expectation, of course, was that the Power6+ chip itself would offer about twice the oomph of Power6. IBM was pretty vague in its roadmaps, but the chips topped out at 5 GHz and never came close to their 6 GHz design point, and IBM did not add more cores to the Power6+ die, as it was expected to. Chip happens.

Every chip maker misses roadmaps and hurts its business. Lucky for IBM, the delays with Tukwila and Rock hurt worse than the ridiculously small performance gains from the hop from Power5 to Power5+ in October 2005, and doubling up the socket count (rather than the core count) with Power6+ was a good enough stop-loss maneuver. At least it was based on the fact that IBM gained market share against its Unix peers, which is what the top brass at Big Blue get their bonuses from, presumably.

With each successive process shrink in the Power4 through Power7 generations, IBM has crammed more and more transistors onto the chips, pulling more and more features off the motherboard into the chip. The Power4 chip weighed in at 174 million transistors and dissipated 125 watts running at its top-end 1.3 GHz. The chip had a 1.44 MB shared L2 cache for the cores and had L3 memory tags on the chip as well, but the 32 MB L3 cache for the chip was external yet baked into the same package.

With the Power4+ shrink to 130 nanometers, IBM boosted the L2 cache for the dual-core processor to 1.5 MB and moved the DDR main memory controller onto the processor, boosting the transistor count to 184 million. The Power5 stayed on the same 130 nanometer processes, but implemented new cores with simultaneous multithreading (two threads per core), boosted the shared L2 cache to 1.9 MB, and jacked up the off-chip L3 cache to 36 MB.

With the 90 nanometer shrink to the Power5+, the heat dissipation on the chips had fallen to around 70 watts in a chip with 276 million transistors running at a top speed of 2.2 GHz. Because of the relatively low thermals, IBM could put two Power5+ chips in the same package to offer something competitive with then-emerging dual-core, quad-socket x64 servers.

With the Power6 chips, IBM did a major reworking of the Power instruction pipeline so it could buck the industry trend and jack up clock speeds instead of adding cores. The idea was to get more performance per core, which would translate into lower software costs per unit of performance - at least for software that is priced per core rather than per system or per socket. It is debatable as to how successful the Power6 chip was on this front, but the Power6 design included other innovations that made it interesting for existing and new workloads.
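The per-core pricing argument can be illustrated with a little arithmetic. The sketch below is not drawn from IBM or any software vendor's price list: the core counts, relative performance figures, and license fee are all hypothetical, chosen only to show why a faster core lowers software cost per unit of performance when licenses are sold per core.

```python
def cost_per_unit_of_performance(cores, perf_per_core, license_per_core):
    """Per-core license spend divided by total system throughput."""
    total_cost = cores * license_per_core
    total_perf = cores * perf_per_core
    return total_cost / total_perf


# Hypothetical systems delivering the same total throughput (4.0 units):
# four slower cores versus two cores that are each twice as fast.
many_slow_cores = cost_per_unit_of_performance(
    cores=4, perf_per_core=1.0, license_per_core=10_000
)
few_fast_cores = cost_per_unit_of_performance(
    cores=2, perf_per_core=2.0, license_per_core=10_000
)

print(many_slow_cores, few_fast_cores)
```

With these made-up inputs, the two-core box pays half as much in licenses for the same work. Under per-system or per-socket pricing the advantage would shrink or vanish, which is why the benefit depends on how the software is priced.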

With 790 million transistors to work with thanks to the 65 nanometer shrink, IBM could - and did - wrap lots of extra stuff around the two cores in the Power6 design. Each core was given its own private 4 MB L2 cache, but the L3 cache was busted back down to 32 MB and remained off chip. That was to make room for other features on each Power6 core, such as a decimal math unit (for doing money math) and an AltiVec vector processing unit in addition to the two integer and two floating point units in the chip.
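To put the transistor budgets quoted above side by side, here is a minimal sketch that computes the growth from one quoted generation to the next. It uses only the counts given in this article; no figure is given for the Power5, so it is left out rather than guessed, and the variable names are purely illustrative.

```python
# Transistor counts in millions, as quoted in the article.
# The Power5 is omitted because no count is given for it.
transistors_millions = {
    "Power4": 174,
    "Power4+": 184,
    "Power5+": 276,
    "Power6": 790,
}

names = list(transistors_millions)
growth = {}
for prev, cur in zip(names, names[1:]):
    # Ratio of each chip's transistor count to the previous quoted chip.
    growth[cur] = transistors_millions[cur] / transistors_millions[prev]

for chip, ratio in growth.items():
    print(f"{chip}: {ratio:.2f}x the previous quoted chip")
```

The jump to Power6 is by far the largest of the three steps, which fits the article's point that the 65 nanometer shrink gave IBM room to wrap far more around the two cores.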
