Original URL: http://www.theregister.co.uk/2009/06/01/amd_launches_istanbul/

AMD locks and loads 'Istanbul' six-shooter

Gunning for Dunnington, Nehalem

By Timothy Prickett Morgan

Posted in The Channel, 1st June 2009 16:34 GMT

Advanced Micro Devices, after weeks of hinting that its six-core "Istanbul" Opteron processors were right around the corner, is finally firing the kickers to its "Shanghai" quad-core Opterons right at the new and forthcoming Nehalem family of workstation and server chips from archrival Intel.

With the Istanbuls, AMD and Intel have entered the marketing equivalent of the Mutara Nebula, because provided there are no bugs in the current generations of Opteron and Xeon processors, the odds will be even. (Well, more or less.)

Intel has quad-core and six-core "Dunnington" Xeon 7400s that do not have simultaneous multithreading (what Intel calls HyperThreading, which turns each physical core into two virtual cores and boosts performance by around 30 to 40 percent on many workloads) and the quad-core "Nehalem EP" Xeon 3500 and Xeon 5500 processors for single-socket and two-socket servers, respectively. Nehalem EPs, also known as the "Gainestown" processors, use Intel's HyperTransport-like QuickPath Interconnect for linking processors, memory, and I/O, and have four cores, each with HyperThreading.

AMD has decided against using simultaneous multithreading in the Opteron processors, so the Istanbul chips announced today with six cores on a single die do not have their performance goosed with virtualized instruction streams. It is hard to imagine that the Dunningtons will be able to keep pace with the Istanbuls on big x64 iron, given the benefits of the HyperTransport point-to-point interconnect compared to the old frontside bus architecture used in the Dunningtons. (These are the last Intel chips that will use FSB instead of QPI to link processors to memory and I/O.)

With the two-socket server being the workhorse of the IT industry, the main battle between these two enterprises will be fought between the Xeon 5500s and the Opteron 2400s, as the Istanbul chips for two-socket boxes are called. But make no mistake: until Intel gets its "Nehalem EX" Xeon 7500 chips to market in early 2010, AMD is going to try to open fire with the high-end Opteron 8400s in the four-socket and eight-socket space.

There is more than one battle going on in this economic meltdown dust cloud, and this time, the Genesis Effect might be a little something called $5tn in global stimulus spending. (That's probably taking a metaphor too far. It happens.)

The Istanbul Opterons contain 904 million transistors and have six cores, each with 64 KB of L1 data cache, 64 KB of L1 instruction cache, and 512 KB of L2 cache. The chip is implemented in a 45 nanometer silicon-on-insulator process and manufactured by AMD's fab spinoff, GlobalFoundries, at its fab in Dresden, Germany. Each chip also has 6 MB of L3 cache that is shared by all of the cores, as well as the full AMD-V virtualization and AMD-P power management feature sets. (AMD-V consists of rapid virtualization indexing, tagged TLB, and extended migration, while AMD-P consists of smart fetch, power cap, and CoolCore features.)

The Istanbul chips use the same Socket F processor socket as the earlier Rev F Opterons, which is comprised of a 1,207-pin organic land grid array (LGA). In plain English (well, American anyway), that means the Istanbuls plug into all of the same machines that quad-core Barcelona and Shanghai chips do as well as prior dual-core Rev F Opterons. The Istanbuls will also plug into AMD's future "Fiorano" platform, which is based on a homegrown SR5690/SP5100 chipset, according to the company.

The Istanbul Opteron has a die size of 346 square millimeters with those 904 million transistors. The Nehalem EP weighs in at 731 million transistors (also implemented in a 45 nanometer process, but in this case, Intel's own cooking) and has a die size of 263 square millimeters. If there is a direct relationship between the cost of making a chip and its size and an inverse relationship between the size of a chip and its yields, you can see why Intel has decided to deploy HyperThreading on its chips, and the wonder is why AMD hasn't done its own variant of HyperThreading after all of these years.
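To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It uses only the transistor counts and die sizes quoted above; the dictionary layout and the function name are illustrative, and it makes no attempt to model actual cost or yield.

```python
# Figures quoted in the article; everything derived from them is plain arithmetic.
chips = {
    "Istanbul": {"transistors_m": 904, "die_mm2": 346},
    "Nehalem EP": {"transistors_m": 731, "die_mm2": 263},
}


def density(chip):
    """Millions of transistors per square millimeter of die."""
    return chip["transistors_m"] / chip["die_mm2"]


# How much more silicon an Istanbul die occupies than a Nehalem EP die.
area_ratio = chips["Istanbul"]["die_mm2"] / chips["Nehalem EP"]["die_mm2"]

for name, chip in chips.items():
    print(f"{name}: {density(chip):.2f}M transistors per mm^2")
print(f"Istanbul die is {area_ratio:.2f}x the area of Nehalem EP")
```

On these figures the Istanbul die is about 1.32 times the area of the Nehalem EP die, at a slightly lower transistor density (roughly 2.61 million versus 2.78 million per square millimeter).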

The Nehalem EP fits into the 1,366-pin FC-LGA socket from Intel. The future eight-core Nehalem-EX chip from Intel, slated for initial production late this year, will have a hefty 2.3 billion transistors; the die size has not been divulged, but it is going to be a fat chip, no doubt.

Each Istanbul chip, like all prior Opterons, includes on-chip main memory controllers - in this case, supporting DDR2 main memory like the prior quad-core "Barcelona" and Shanghai Opterons. The controllers, which run at 2.2 GHz, support registered ECC DDR2 main memory running at 533 MHz, 667 MHz, and 800 MHz. The memory controllers run at the same speed regardless of the clock speed of the processor - this is one of the things that makes putting memory controllers onto chips tricky - and deliver up to 12.8 GB/sec of memory bandwidth per Rev F socket.
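The 12.8 GB/sec figure can be reproduced with simple arithmetic. The sketch below assumes two 64-bit memory channels per socket and treats the quoted memory speeds as millions of transfers per second; neither assumption is stated in the article, and the constant and function names are illustrative.

```python
BYTES_PER_TRANSFER = 8    # assumption: each channel is 64 bits wide
CHANNELS_PER_SOCKET = 2   # assumption: dual-channel memory controller


def peak_bandwidth_gb_s(mega_transfers_per_sec):
    """Theoretical peak memory bandwidth per socket, in GB/sec."""
    bytes_per_sec = (
        mega_transfers_per_sec * 1e6 * BYTES_PER_TRANSFER * CHANNELS_PER_SOCKET
    )
    return bytes_per_sec / 1e9


# The three DDR2 speeds named in the article.
for speed in (533, 667, 800):
    print(f"DDR2-{speed}: {peak_bandwidth_gb_s(speed):.1f} GB/sec")
```

Under those assumptions, the 800 MHz setting works out to 12.8 GB/sec, matching the per-socket peak quoted above. Sustained bandwidth in real workloads will be lower.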

The Istanbul chips have three HyperTransport 3.0 point-to-point links, with up to 19.2 GB/sec of bandwidth per link, which are used to talk to other processors and I/O in a chip complex. The Istanbuls also include a new feature called HT Assist, which allows a chip in a complex to figure out which one it needs to share data with and only send requests for information to that chip.

The HT Assist feature works like this: 1 MB of L3 cache is reserved as a directory for all of the cache lines used in the system, so the chips don't have to probe the caches. Of course, you have to give up some L3 cache as well, which can affect performance for other things, like Java. Presently, an Opteron socket that needs some data it doesn't have in its cache broadcasts probes to all sockets in the complex, which puts a lot of overhead on the HyperTransport interconnect.
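The difference between broadcasting and consulting a directory can be illustrated with a toy model. This is a simplified sketch, not AMD's implementation: it assumes each cache line is held by at most one socket, represents the directory as a plain dictionary, and uses made-up line addresses and a hypothetical function name.

```python
def probes_for_miss(line, requester, directory, num_sockets, use_directory):
    """Return the list of sockets that get probed when `requester` misses on `line`."""
    if not use_directory:
        # Broadcast: every other socket in the complex is probed.
        return [s for s in range(num_sockets) if s != requester]
    owner = directory.get(line)
    if owner is None or owner == requester:
        # The directory says no other cache holds the line: no probes needed.
        return []
    # Only the socket recorded as holding the line is probed.
    return [owner]


# Toy directory: cache line address -> socket currently caching it (illustrative values).
directory = {0x40: 2, 0x80: 0}
NUM_SOCKETS = 4

for line in (0x40, 0xC0):
    broadcast = probes_for_miss(line, 1, directory, NUM_SOCKETS, use_directory=False)
    directed = probes_for_miss(line, 1, directory, NUM_SOCKETS, use_directory=True)
    print(f"line {line:#x}: broadcast probes {broadcast}, directory probes {directed}")
```

In this four-socket toy, a miss costs three probes when broadcasting but at most one with the directory, which is the kind of interconnect traffic the feature is meant to cut.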

On memory-intensive benchmarks like Stream, the throughput of an Opteron server can increase by as much as 60 per cent thanks to HT Assist, and AMD is anticipating big gains for database workloads, too. The HT Assist function doesn't need to be turned on in two-socket boxes (there's only one pipe between the two chips, so they know who they are talking to already) and the feature is implemented at the BIOS level of the systems, so there is no need to tweak operating systems or hypervisors to take advantage of the HT Assist feature.

Shanghai surprise

There are five Istanbul chips coming out of the chute today, with others expected as the chip ramps. They are all rated at the 75-watt standard thermals for the current crop of Rev F Opterons. Prices shown below are per-chip prices if customers buy in 1,000-unit trays:

Pat Patla, manager of the server and workstation chip business at AMD, said that AMD was by no means dropping its Shanghai Opterons, but would rather position the chips as full-featured processors that offered better value (meaning they cost less) and, at least for now, lower power consumption on some SKUs. Indeed, as we already reported, AMD last week chopped prices on its Shanghai Opterons to align them with the Istanbul chips and to better compete with Intel's Xeon chips. (The price cuts were substantial on Opteron 2300s and less impressive on the Opteron 8300s, where AMD is not feeling the heat of competition quite as intensely.)

Patla added that AMD is fixing to get 40-watt Extremely Efficient (EE), 55-watt Highly Efficient (HE), and 105-watt turbocharged Special Edition (SE) versions of the Istanbul chips into the field in the third quarter of this year. Patla also said that AMD will put faster Istanbul parts out in the third quarter as it sorts through its bins.

The five Istanbul chips announced today will be "widely available" this month, and systems using the chips are expected to roll out from the major server makers throughout the second half of the year. Given that the Istanbuls plug into the same machines as the Barcelona and Shanghai chips, this is more a matter of software qualification than anything else. Machines will also have to get a BIOS flash in most cases so they can see the six cores on the Istanbul chip.

AMD estimates that about ten per cent of its customers could upgrade Rev F boxes to Istanbul processors, but said that most companies and organizations that buy servers get a box, certify it to run a particular stack, and don't touch it until they get rid of it.

AMD is a little vague on the performance that the Istanbul chips bring to bear, but depending on the workload, it is looking like anywhere from 20 to 40 per cent more oomph, with floating point performance showing the least improvement over top-end Shanghai parts. The Shanghai parts have higher clock speeds - 2.9 GHz in the 75-watt thermal envelope - compared to 2.6 GHz for the Istanbuls, which have two more cores.
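As a rough sanity check on that range, the sketch below multiplies core count by clock speed for the two top-end 75-watt parts mentioned above. It is a naive upper-bound proxy that assumes perfect scaling across cores, which real workloads rarely deliver; the names are illustrative.

```python
# Core counts and clock speeds quoted in the article for the 75-watt parts.
shanghai = {"cores": 4, "ghz": 2.9}
istanbul = {"cores": 6, "ghz": 2.6}


def aggregate_ghz(chip):
    """Cores times clock speed: a crude proxy for peak throughput."""
    return chip["cores"] * chip["ghz"]


# Fractional gain of Istanbul over Shanghai under the perfect-scaling assumption.
uplift = aggregate_ghz(istanbul) / aggregate_ghz(shanghai) - 1
print(f"Aggregate clock uplift: {uplift:.0%}")
```

That comes to about 34 per cent more aggregate clock, which sits inside the 20 to 40 per cent range AMD is talking about.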

Perhaps the most interesting thing about the Istanbul chips is that AMD is shipping them in their "first silicon" state, and that the initial chips that came out of the factory were found to be ready for full production. This is not the usual state of affairs in the chip business, which usually runs multiple steppings as bugs are found and fixed. But Patla said that the changes AMD made in its design, fabrication, and testing processes in the wake of the bug found in the Barcelona chips gave it the confidence that Istanbul was ready for market.

It will be interesting to see if the server makers agree with that assessment, or if they take a while to do their qualifications for the chip, even though it "just drops in". Paul Gottsegen, vice president of marketing for Enterprise Storage and Servers at Hewlett-Packard, heaped praise on AMD's fast ramp of Istanbul.

"Coming on the heels of Shanghai, to land your first silicon with Istanbul is like hitting a hole in one," Gottsegen said in a briefing he attended with AMD representatives from Europe. But Gottsegen didn't say that HP was shipping Istanbul servers on day one, either. There's excited, and then there's crazy, apparently. ®