Intel sends 'Poulson' Itaniums to the shrink

'Designed with the future in mind'

Would you like to share my socket?

The Poulson chip has a combined 54 MB of on-die memory, including L1 and L2 caches, tags and registers, and directory caches. 50 MB of this is in static RAM caches. There is 256 KB of "mid-level" data cache and 512 KB of "mid-level" instruction cache (what you and I would call L2 but for some reason Intel did not) on each core, plus 32 MB of shared L3 cache. That L3 cache looks like it is broken into two 16 MB segments, and in fact, Poulson looks like two four-core chips that have been interconnected (as you would expect). It is not clear how much L1 cache is on each Poulson core and how much is used for tags, registers, and directories. (We'll try to find out at ISSCC.)
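The cache arithmetic above can be checked with a quick back-of-the-envelope sketch. It assumes eight cores (inferred from the "two four-core chips" description) and uses only the figures quoted in the paragraph; the function and key names are made up for illustration and are not Intel's terminology.

```python
# Rough cache budget for Poulson, using only figures quoted in the article.
# Assumption: eight cores per chip (two four-core halves).
KB = 1024
MB = 1024 * KB

CORES = 8
MID_LEVEL_DATA = 256 * KB    # per core
MID_LEVEL_INSTR = 512 * KB   # per core
SHARED_L3 = 32 * MB
TOTAL_SRAM_CACHE = 50 * MB   # "50 MB of this is in static RAM caches"
TOTAL_ON_DIE = 54 * MB       # "combined 54 MB of on-die memory"


def cache_budget():
    """Return how much of the quoted on-die memory the named caches explain."""
    mid_level_total = CORES * (MID_LEVEL_DATA + MID_LEVEL_INSTR)
    accounted = mid_level_total + SHARED_L3
    return {
        "mid_level_mb": mid_level_total / MB,
        "accounted_mb": accounted / MB,
        "unexplained_sram_mb": (TOTAL_SRAM_CACHE - accounted) / MB,
        "unexplained_total_mb": (TOTAL_ON_DIE - accounted) / MB,
    }


if __name__ == "__main__":
    for name, value in cache_budget().items():
        print(f"{name}: {value:g} MB")
```

Under that eight-core assumption, the mid-level caches and the L3 together explain 38 MB, which leaves 12 MB of the quoted SRAM cache figure, and 16 MB of the 54 MB total, for L1, tags, registers, and directories.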

One reason the modified Tukwila Itanium 9300s were delayed in getting into the field in 2008 and 2009 was that server makers wanted Tukwila, Poulson, and Kittson to share the same socket. And as promised, Poulson chips will plug into the LGA 1248 sockets used by Tukwila, and so will Kittson. So upgrades will be easy. Hopefully, Intel has built some bandwidth headroom into the Itanium platform.

McInerney said that when Tukwila chips came out last year, Intel did, in fact, have some headroom in the "Boxboro" chipsets and memory boards that are shared by Itanium 9300 and Xeon 7500 systems. That is why Intel has been able to crank up the QPI speeds from the 4.8 GT/sec of the Tukwilas to the 6.4 GT/sec of the Poulsons. Assuming that the future Xeons and Itaniums will need more bandwidth, the kicker to the Boxboro chipset will go even higher. Base 2 math (a doubling of the Tukwila rate) would suggest that 9.6 GT/sec is the next stop on the QPI bus. For all we know, this is already cooked into the Boxboro chipsets, but just not activated.
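To put those transfer rates in bandwidth terms, here is a small sketch. It assumes a full-width QPI link carrying 2 bytes of data payload per transfer in each direction; that figure is not stated in this article, so treat it as an assumption, and the 9.6 GT/sec entry is the article's speculation, not an announced speed.

```python
# Convert QPI transfer rates (GT/sec) into per-direction bandwidth (GB/sec).
# Assumption (not from the article): 2 bytes of data payload per transfer.
BYTES_PER_TRANSFER = 2

QPI_RATES_GT = {
    "Tukwila": 4.8,
    "Poulson": 6.4,
    "speculated next step": 9.6,  # the article's guess, not an Intel figure
}


def qpi_bandwidth_gb(gt_per_sec, bytes_per_transfer=BYTES_PER_TRANSFER):
    """Per-direction bandwidth of one link in GB/sec."""
    return gt_per_sec * bytes_per_transfer


def step_up(new_rate, old_rate):
    """Ratio of one transfer rate to another."""
    return new_rate / old_rate


if __name__ == "__main__":
    base = QPI_RATES_GT["Tukwila"]
    for name, rate in QPI_RATES_GT.items():
        bw = qpi_bandwidth_gb(rate)
        print(f"{name}: {rate} GT/sec, {bw:.1f} GB/sec per direction, "
              f"{step_up(rate, base):.2f}x Tukwila")
```

The move from 4.8 to 6.4 GT/sec is a one-third increase; 9.6 GT/sec would be exactly double the Tukwila rate.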

Here's what the new Poulson core looks like:

The layout of the Poulson Itanium core

The big architectural change with the Poulson Itaniums is that the EPIC very long instruction word parallelism packaging mechanism has been made double-wide, moving from six-wide instruction processing to twelve-wide. In theory, and providing the application's mix of instructions works out right, this should come close to doubling the performance of Poulson cores compared to Tukwila cores, clock for clock and core for core. Which is why I don't think Intel is going to boost clock speeds on the Poulson Itaniums compared to the 1.33 GHz to 1.73 GHz of the Tukwilas. The TurboBoost speed could go up, and well beyond the 1.46 GHz to 1.86 GHz range of the Tukwilas.

With twice as many cores, processing twice as many instructions, and possibly with twice as many HyperThreads, the Poulson chips should yield anywhere from three to five times the performance of the Tukwilas at the socket level. It depends on the threads and the efficiency of the twelve-wide EPIC instruction packaging. The eight other Itanium chips to date have all been six-wide chips, and it is unclear how software will take to twelve-wide pipes.
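One way to see where a three-to-five-times range could come from is a toy multiplicative model. This is purely an illustrative sketch, not Intel's performance model: the `issue_efficiency` and `thread_gain` parameters are hypothetical knobs, and the values plugged in below are chosen only to bracket the range quoted above.

```python
# Toy model of socket-level speedup, Poulson versus Tukwila.
# Hypothetical parameters, for illustration only:
#   issue_efficiency: fraction (0..1) of the extra six issue slots that
#                     real code manages to fill
#   thread_gain:      extra throughput from additional threads (>= 1.0)


def socket_speedup(core_ratio=2.0, issue_efficiency=1.0, thread_gain=1.0):
    """Estimate speedup as cores x issue-width gain x threading gain."""
    if not 0.0 <= issue_efficiency <= 1.0:
        raise ValueError("issue_efficiency must be between 0 and 1")
    if thread_gain < 1.0:
        raise ValueError("thread_gain must be at least 1.0")
    # Going from six-wide to twelve-wide at most doubles per-core throughput.
    width_gain = 1.0 + issue_efficiency
    return core_ratio * width_gain * thread_gain


if __name__ == "__main__":
    scenarios = [
        ("half the extra width used", 0.5, 1.0),
        ("all the extra width used", 1.0, 1.0),
        ("all the extra width plus 25% from threads", 1.0, 1.25),
    ]
    for label, eff, threads in scenarios:
        speedup = socket_speedup(issue_efficiency=eff, thread_gain=threads)
        print(f"{label}: {speedup:.2f}x")
```

In this sketch, twice the cores with only half of the extra issue width put to use gives three times the throughput, full use gives four times, and any gain from extra threads pushes it toward five.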

What I can tell you is that customers will not have to recompile their applications when they move to Poulson chips. "We are not anticipating that people will need to do a recompile," explains McInerney. He did add that, just as is the case with any new processor, recompiling is often necessary to squeeze every drop of performance out of a system. But the performance comparisons that Intel will be making when Poulson gets closer to launch will be for code that was compiled on prior generations of Itaniums and plunked onto the Poulson systems unchanged.

The Poulson cores also have new data and instruction pipelines, a new floating point pipeline, and a new instruction buffer. The chip also has a number of dynamic power management features that gate power usage on elements of the Itanium chip and now the memory controllers and memory subsystems. Leakage current, power draw when idle, and power draw under load have all been reduced on the Poulson chip. Take a look:

Tukwila and Poulson power management (lower is better)

In this chart, Intel shows the ratio of Tukwila to Poulson on several power scaling metrics. The blue bars show Tukwila and the red bars show what would happen if the Tukwila chip were unchanged and just implemented in a 32 nanometer process. The green bars show the effect of the design changes inside Poulson on these same metrics. While Poulson reduces power leakage by only 30 per cent more than a 32 nanometer Tukwila would, it cuts idle power usage by 70 per cent more and power used under load (that's the TDP Activity data) by 60 per cent more. In general, the power lost or consumed by the Poulsons on these metrics is about a fifth of what it is on the real 65 nanometer Tukwilas.

Finally, Poulson will include a slew of new error detection, correction, and prevention technologies not in the current Tukwila Itanium chips. Intel has added error detection for floating point instructions, expanded soft error correction, and boosted cache error coverage. The chip also allows for the logging of more information about errors in the chips to improve recovery, sometimes automagically.

Intel and its main Itanium partner, HP, are no doubt hoping that the Poulson specs will put to rest any talk about the impending death of Itanium.

"Intel's commitment, as evidenced by this development effort, is strong and it is unwavering," McInerney said on the call.

Don't expect some in the IT market to believe it. They never will.

Intel is not talking about when Poulson chips will be delivered, but it seems likely that they will show up in early 2012, with Kittson in early 2014. ®
