AMD locks and loads 'Istanbul' six-shooter

Gunning for Dunnington, Nehalem


Advanced Micro Devices, after weeks of hinting that its six-core "Istanbul" Opteron processors were right around the corner, is finally firing the kickers to its "Shanghai" quad-core Opterons right at the new and forthcoming Nehalem family of workstation and server chips from archrival Intel.

With the Istanbuls, AMD and Intel have entered the marketing equivalent of the Mutara Nebula, because provided there are no bugs in the current generations of Opteron and Xeon processors, the odds will be even. (Well, more or less.)

Intel has the quad-core and six-core "Dunnington" Xeon 7400s, which do not have simultaneous multithreading (what Intel calls HyperThreading, which turns each physical core into two virtual cores and boosts performance by around 30 to 40 percent on many workloads), and the quad-core "Nehalem EP" Xeon 3500 and Xeon 5500 processors for single-socket and two-socket servers, respectively. Nehalem EPs, also known as the "Gainestown" processors, use Intel's HyperTransport-like QuickPath Interconnect for linking processors, memory, and I/O, and have four cores, each with HyperThreading.

AMD has decided against using simultaneous multithreading in the Opteron processors, so the Istanbul chips announced today with six cores on a single die do not have their performance goosed with virtualized instruction streams. It is hard to imagine that the Dunningtons will be able to keep pace with the Istanbuls on big x64 iron, given the benefits of the HyperTransport point-to-point interconnect compared to the old frontside bus architecture used in the Dunningtons. (These are the last Intel chips that will use FSB instead of QPI to link processors to memory and I/O.)

With the two-socket server being the workhorse of the IT industry, the main battle between these two enterprises will be fought between the Xeon 5500s and the Opteron 2400s, as the Istanbul chips for two-socket boxes are called. But make no mistake: until Intel gets its "Nehalem EX" Xeon 7500 chips to market in early 2010, AMD is going to try to open fire with the high-end Opteron 8400s in the four-socket and eight-socket space.

There is more than one battle going on in this economic meltdown dust cloud, and this time, the Genesis Effect might be a little something called $5tn in global stimulus spending. (That's probably taking a metaphor too far. It happens.)

The Istanbul Opterons contain 904 million transistors and have six cores, each with 64 KB of L1 data cache, 64 KB of L1 instruction cache, and 512 KB of L2 cache. The chip is implemented in a 45 nanometer silicon-on-insulator process and manufactured by AMD's fab spinoff, GlobalFoundries, in its Dresden, Germany fab. Each chip also has 6 MB of L3 cache that is shared by all of the cores, as well as the full AMD-V virtualization and AMD-P power management feature sets. (AMD-V consists of rapid virtualization indexing, tagged TLB, and extended migration, while AMD-P consists of smart fetch, power cap, and CoolCore features.)

The Istanbul chips use the same Socket F processor socket as the earlier Rev F Opterons, which is a 1,207-pin organic land grid array (LGA). In plain English (well, American anyway), that means the Istanbuls plug into all of the same machines that quad-core Barcelona and Shanghai chips do, as well as prior dual-core Rev F Opterons. The Istanbuls will also plug into AMD's future "Fiorano" platform, which is based on a homegrown SR5690/SP5100 chipset, according to the company.

The Istanbul Opteron has a die size of 346 square millimeters with those 904 million transistors. The Nehalem EP weighs in at 731 million transistors (also implemented in a 45 nanometer process, but in this case, Intel's own cooking) and has a die size of 263 square millimeters. If there is a direct relationship between the cost of making a chip and its size, and an inverse relationship between the size of a chip and its yield, you can see why Intel has decided to deploy HyperThreading on its chips rather than add cores, and the wonder is why AMD hasn't done its own variant of HyperThreading after all of these years.
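
To put rough numbers on that die-size argument, here is a minimal sketch using a simple Poisson yield model. The 300 mm wafer diameter and the defect density are illustrative assumptions, not figures from AMD or Intel, and the model ignores edge loss and everything else that makes real fab economics complicated. Only the two die areas come from the text above.

```python
import math

def good_dies_per_wafer(die_area_mm2, defects_per_cm2, wafer_diameter_mm=300):
    """Estimate good dies per wafer with a Poisson yield model.

    wafer_diameter_mm and defects_per_cm2 are hypothetical inputs;
    edge loss and scribe lines are ignored.
    """
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    gross_dies = wafer_area_mm2 / die_area_mm2
    # Poisson model: yield = exp(-die area in cm^2 * defect density)
    yield_fraction = math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)
    return gross_dies * yield_fraction

ISTANBUL_AREA_MM2 = 346    # from the article
NEHALEM_EP_AREA_MM2 = 263  # from the article
ASSUMED_DEFECTS_PER_CM2 = 0.5  # hypothetical, for illustration only

for name, area in (("Istanbul", ISTANBUL_AREA_MM2),
                   ("Nehalem EP", NEHALEM_EP_AREA_MM2)):
    good = good_dies_per_wafer(area, ASSUMED_DEFECTS_PER_CM2)
    print(f"{name}: about {good:.0f} good dies per wafer")
```

Whatever defect density you plug in, the smaller die comes out ahead twice over: more candidates fit on the wafer, and a larger share of them survive.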

The Nehalem EP fits into the 1,366-pin FC-LGA socket from Intel. The future eight-core Nehalem-EX chip from Intel, slated for initial production late this year, will have a hefty 2.3 billion transistors; the die size has not been divulged, but it is going to be a fat chip, no doubt.

Each Istanbul chip, like all prior Opterons, includes on-chip main memory controllers - in this case, supporting DDR2 main memory like the prior quad-core "Barcelona" and Shanghai Opterons. The controllers, which run at 2.2 GHz, support registered ECC DDR2 main memory running at 533 MHz, 667 MHz, and 800 MHz. The memory controllers run at the same speed regardless of the clock speed of the processor - this is one of the things that makes putting memory controllers onto chips tricky - and deliver up to 12.8 GB/sec of memory bandwidth per Rev F socket.
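
The 12.8 GB/sec figure can be reproduced with a little arithmetic. The sketch below assumes two 64-bit DDR2 channels per socket and treats the quoted memory speeds as millions of transfers per second. Both are assumptions about how the number was derived, not something AMD spells out here.

```python
def peak_memory_bandwidth_gb_per_sec(mega_transfers_per_sec,
                                     channels=2, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/sec for one socket.

    channels=2 and bus_width_bits=64 are assumed values for a
    dual-channel DDR2 configuration.
    """
    bytes_per_transfer = bus_width_bits // 8
    bytes_per_sec = mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels
    return bytes_per_sec / 1e9

for speed in (533, 667, 800):
    bandwidth = peak_memory_bandwidth_gb_per_sec(speed)
    print(f"DDR2-{speed}: {bandwidth:.1f} GB/sec peak per socket")
```

These are theoretical peaks; sustained bandwidth on real workloads will be lower.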

The Istanbul chips have three HyperTransport 3.0 point-to-point links, with up to 19.2 GB/sec of bandwidth per link, which are used to talk to other processors and I/O in a chip complex. The Istanbuls also include a new feature called HT Assist, which allows a chip in a complex to figure out which other chip holds the data it needs and to send requests for information only to that chip.

The HT Assist feature works like this: 1 MB of L3 cache is reserved as a directory for all of the cache lines used in the system, so the chips don't have to probe each other's caches. Of course, you have to give up some L3 cache as well, which can affect performance for other things, like Java. Without HT Assist, an Opteron socket that needs some data it doesn't have in its cache broadcasts probes to all sockets in the complex, which puts a lot of overhead on the HyperTransport interconnect.
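
The difference in interconnect traffic can be shown with a toy model. This is a sketch of the general directory idea, not AMD's implementation: the function names, the dictionary standing in for the directory, and the cache line addresses are all hypothetical.

```python
def probes_with_broadcast(num_sockets):
    """Without a directory, a miss probes every other socket."""
    return num_sockets - 1

def probes_with_directory(directory, cache_line, requester):
    """With a directory, a miss probes only the socket that holds the line.

    directory maps a cache line address to the socket caching it
    (a hypothetical stand-in for the reserved slice of L3 cache).
    """
    owner = directory.get(cache_line)
    if owner is None or owner == requester:
        return 0  # no other socket has the line; go straight to memory
    return 1      # one targeted probe to the owning socket

# Hypothetical four-socket box: socket 2 caches line 0x40.
directory = {0x40: 2}
misses = [(0x40, 0), (0x80, 0), (0x40, 3), (0xC0, 1)]  # (line, requester)

broadcast_total = sum(probes_with_broadcast(4) for _ in misses)
directory_total = sum(probes_with_directory(directory, line, req)
                      for line, req in misses)
print(f"broadcast probes: {broadcast_total}, directory probes: {directory_total}")
```

In this model the broadcast cost grows with the number of sockets while the directory cost stays at one probe or none per miss.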

On memory-intensive benchmarks like Stream, the throughput of an Opteron server can increase by as much as 60 percent thanks to HT Assist, and AMD is anticipating big gains for database workloads, too. The HT Assist function doesn't need to be turned on in two-socket boxes (there's only one pipe between the two chips, so they know who they are talking to already) and the feature is implemented at the BIOS level of the systems, so there is no need to tweak operating systems or hypervisors to take advantage of it.


