Original URL: http://www.theregister.co.uk/2010/02/09/intel_tukwila_feeds_speeds/

Intel's 'Tukwila' Itaniums - hot n' pricey

How much for an upgrade?

By Timothy Prickett Morgan

Posted in Servers, 9th February 2010 08:02 GMT

Analysis As El Reg duly reported earlier today, Intel took the wraps off its long-awaited and many-times-tweaked "Tukwila" quad-core Itanium 9300 processors for midrange and high-end servers. But let's take a look at the feeds and speeds of the chip itself and how the lineup compares to the prior Itanium 9100 series.

There are five variants of the Itanium 9300 series, compared to seven in the prior 9100 series. The chips as delivered were basically the same as what people were whispering about for the past year.

What we knew for sure about Tukwila was that it would have four cores, two threads per core with HyperThreading, 6 MB of L3 cache per core (for a total of 24 MB), integrated DDR3 main memory controllers - all implemented in a fairly ancient 65 nanometer process with some 2 billion transistors. Tukwila's clock speeds were expected to be in the range of 1.2 GHz to 2 GHz, with top-end parts burning at 170 watts, and performance was hinted to be about twice that of the dual-core Itanium 9100s.

Here are the basic feeds and speeds of the Itanium 9300s announced today, including the single-unit price when OEMs buy them at list price in 1,000-unit quantities:

The Intel Itanium 9300 family of server processors

As you might expect, Intel is trying to hold its price points even as it uses Moore's Law to double up the cores and add threads to the Itanium processor. In fact, each new Itanium 9300 processor costs a little bit more than the Itanium 9100 processor it replaces in the lineup. Given what the new chips offer over the prior product line, that modest price increase is generous, and with the chips being binary compatible with other Itaniums, moving operating systems and applications to the new chips is not an issue.

(Wringing the full amount of performance out of any new chip does often require recompilation and other tweaking, however, and the Itanium 9300, with its radically different chipset and QuickPath Interconnect, will be no different in this regard).

The Tukwilas will give customers running HP-UX, Windows, Linux, OpenVMS, and NonStop operating systems more threads, more cache, and slightly higher clock speeds in a few cases. But the 800 per cent increase in interconnect bandwidth, 500 per cent more memory bandwidth, and 700 per cent increase in main memory capacity using 16 GB DDR3 DIMMs will presumably make customers not mind a slight price rise on raw chips to get around twice the throughput.

What's the upgrade cost?

The issue is really what server makers will charge for upgrades for system boards using the Tukwila chips - if they even make upgrades possible, considering the changes in processor sockets, memory, and interconnect in making the jump from the prior dual-core "Montecito" Itanium 9000s and "Montvale" Itanium 9100s to the Tukwila Itanium 9300s.

Intel did not, by the way, cut the prices on these older Itanium chips in half, so customers who want better bang for the buck on their Itanium systems have to move forward.

The Itanium 9300 chips also sport better power management and the quiescing of cores inside the chip, which allows other cores in the Itanium 9300 to be slightly overclocked - what Intel calls Turbo Boost, a feature that is available in some of its Xeon family of server chips. The Turbo Boost is fairly modest, with a 7.5 to 9.8 per cent clock bump on the four-core versions of the Itanium 9300. The dual-core version of the chip, the Itanium 9310, doesn't have Turbo Boost at all and has its on-chip L3 cache cut down to 10 MB (5 MB per core).

This chip costs $946, and it will no doubt be used by Hewlett-Packard in blade servers where thermals are an issue. At 130 watts, it is going to be tough to put two of these Itanium 9310s on a single blade, and it would seem to be impossible to cram two of the Itanium 9340 (1.6 GHz) or 9350 (1.73 GHz) chips onto a blade, considering that each chip dissipates 185 watts.

That said, the performance per watt on the Tukwilas should be a little better than with the dual-core Montvales. The top-end Itanium 9150M and 9150N had 24 MB of L3 cache for their two cores and burned 104 watts running at 1.66 GHz and 1.6 GHz, respectively. Depending on how clock speeds translate into performance and how you compare different Montvale and Tukwila chips at the high end of the line, this works out to a 10 to 15 per cent improvement in performance per watt.
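As a rough back-of-the-envelope check on that range, the sketch below takes the "around twice the throughput" figure hinted at for Tukwila as given, and plugs in the 185 watt and 104 watt thermal ratings quoted above. The function name and the 2x throughput ratio are illustrative assumptions, not Intel benchmark results.

```python
def perf_per_watt_gain(throughput_ratio, new_watts, old_watts):
    """Relative change in performance per watt, new chip versus old chip."""
    return throughput_ratio * old_watts / new_watts - 1.0


# Assumption: a top-end Tukwila does about 2x the work of a top-end Montvale.
# 185 W and 104 W are the thermal ratings cited in the article.
gain = perf_per_watt_gain(2.0, new_watts=185, old_watts=104)
print(f"{gain:.1%}")  # prints 12.4%
```

Nudge the throughput ratio up or down to reflect clock speed differences between specific SKUs and the result moves either side of that midpoint.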

Still, it is hard for Itanium to compete in terms of bang for the buck with the Nehalem and Westmere family of Xeon processors, the former announced last March and the latter due this March. The current X5570 has four cores running at 2.93 GHz (with eight threads), 8 MB of shared L3 cache on chip, a 95 watt thermal envelope, and costs $1,386 each when bought in 1,000-unit quantities.

The one interesting thing that Intel has not talked about is what happens to performance on the Tukwilas with applications that are sensitive to L3 cache memory (like database transaction processing, for instance). When you compare roughly like-for-like Tukwila and Montvale chips, you have about the same cache memory in the SKUs, but Tukwila has half the cache memory per core.
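The per-core cache arithmetic is simple enough to spell out. This minimal sketch uses the 24 MB L3 totals and core counts cited in the article for the top-end parts; the helper name is made up for illustration.

```python
def cache_per_core_mb(l3_total_mb, cores):
    """L3 cache available per core, in MB, assuming an even split."""
    return l3_total_mb / cores


tukwila = cache_per_core_mb(24, 4)   # quad-core Itanium 9300, 24 MB L3
montvale = cache_per_core_mb(24, 2)  # dual-core top-end Itanium 9100, 24 MB L3
print(tukwila, montvale, tukwila / montvale)  # prints 6.0 12.0 0.5
```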

This may have been the real reason that Intel pushed out Tukwila to graft on the buffered memory architecture (called the Scalable Memory Buffer) that is coming out later this quarter with the eight-core "Beckton" Nehalem-EX processors and is now part of the Itanium 9300 design. That buffered memory sits between standard DDR3 DIMMs and the memory controller on either the Itanium or Nehalem-EX chips. It actually sits on memory cards, apparently, that plug into the system boards. This memory buffering helps the Itanium 9300s to support 1 TB of main memory on a four-socket server, and presumably, it makes up for the smaller L3 cache memories.

By comparison, IBM's just-announced Power7 chip can support only 512 GB across four sockets. But the Power 750 machine announced today also has 32 cores in those four sockets, with 128 threads, compared to the Tukwila's 16 cores and 32 threads. IBM's cores are also clocking in at roughly twice the speed. (Although you have to be careful about comparing clocks across chip architectures).
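To put those four-socket numbers side by side, here is a small sketch that works out threads per core and maximum memory per core for each box. The class and field names are hypothetical, and 1 TB is taken as 1,024 GB; the core, thread, and memory figures are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass
class FourSocketBox:
    name: str
    cores: int
    threads: int
    max_memory_gb: int

    def threads_per_core(self):
        return self.threads / self.cores

    def memory_per_core_gb(self):
        return self.max_memory_gb / self.cores


# Assumption: 1 TB of main memory is counted as 1,024 GB.
tukwila_box = FourSocketBox("Itanium 9300", cores=16, threads=32, max_memory_gb=1024)
power_750 = FourSocketBox("Power 750", cores=32, threads=128, max_memory_gb=512)

for box in (tukwila_box, power_750):
    print(box.name, box.threads_per_core(), box.memory_per_core_gb())
# prints:
# Itanium 9300 2.0 64.0
# Power 750 4.0 16.0
```

On these figures, the Itanium box has four times the memory ceiling per core, while the Power7 box has twice the threads per core.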

It will be interesting to see how the performance of these two midrange boxes stacks up against each other and against Nehalem-EX and systems based on Advanced Micro Devices' "Magny-Cours" twelve-core processors. The four-socket market is going to get very competitive - and very quickly. ®