Intel pushes Nehalem EXs into 2010
Keeping Tukwila company
With Advanced Micro Devices admitting that it's getting ready to launch its "Istanbul" Opteron six-shooter, Intel can't afford to let AMD monopolize all the talk about x64 processors. And that is why Intel hosted a conference call today about its eight-core "Nehalem EX" processor for four-socket and larger servers, even though the real news was that the company's partners would not be able to get Nehalem EX systems into the field until early 2010.
Or maybe even a little later than early 2010 for some of the larger Nehalem EX servers. But before we get into all that, let's go over the feeds and speeds Intel divulged today as it previewed the Nehalem EX processors.
These high-end server chips - which will probably be called the Xeon 7500s when they come to market and were once known by the code-name "Beckton" - have over 2.3 billion transistors and will be implemented in Intel's 45 nanometer high-k metal gate technology, the same process that Intel used to make the "Nehalem EP" chips used for two-socket servers (these were launched back on March 30).
The Nehalem EX chips will pack up to eight processor cores onto a single die, with each core equipped with HyperThreading, Intel's implementation of simultaneous multithreading. (This technology lets each core look like two cores to the systems software on the box using the chip; it's a kind of instruction stream virtualization.)
The Nehalem EX chips will also sport the Turbo Boost technology that made its debut on the Nehalem EPs, which allows unused cores in the chip to be quiesced and the remaining cores to have their clock speeds boosted a little bit. The Nehalem EX will also sport 24 MB of shared L3 cache memory, which is actually distributed with each of the cores, as you can see from the chip layout diagram to the left.
The Quick Path Interconnect (QPI) point-to-point interconnection technology that debuted in the desktop Core i7 and server Nehalem EP chips will be more fully deployed with the Nehalem EX systems. The four QPI ports on the Nehalem EX sockets, together with a pair of I/O hubs in the "Boxboro-EX" chipset that goes with the octo-core chip, allow four-socket or eight-socket systems (with as many as 32 or 64 cores and twice as many threads) to be created "gluelessly," meaning that you don't have to architect another chipset.
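The core and thread counts quoted above follow directly from the per-socket figures Intel gave. As a minimal sketch (the constants come from the article; the function name is just for illustration):

```python
# Per-socket figures for Nehalem EX, as quoted by Intel.
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 2  # HyperThreading presents each core as two logical cores


def system_totals(sockets: int) -> tuple[int, int]:
    """Return (cores, threads) for a glueless Nehalem EX system."""
    cores = sockets * CORES_PER_SOCKET
    return cores, cores * THREADS_PER_CORE


for sockets in (4, 8):
    cores, threads = system_totals(sockets)
    print(f"{sockets}-socket: {cores} cores, {threads} threads")
# 4-socket: 32 cores, 64 threads
# 8-socket: 64 cores, 128 threads
```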
This was the promise that AMD always held out for its own Opteron processors and their HyperTransport interconnect, but very few vendors actually delivered machines that put four two-socket boards together as the Opteron design has allowed from day one.
Boyd Davis, general manager of marketing for Intel's Server Platforms Group, said that the Nehalem EX design will use standard unbuffered DDR3 main memory, rather than fully buffered DIMMs, and will put memory buffers somewhere between the on-chip DDR3 memory controllers on the Nehalem cores and the memory DIMMs using something Intel calls the Scalable Memory Interconnect. With eight cores per chip and up to 16 memory slots per socket, the memory used in conjunction with the Nehalem EX chips needs buffering, said Davis.
But the economics were apparently better to put the buffering somewhere in the system rather than on the memory DIMMs themselves. Davis would not elaborate much more on this technology, except to say that he would not talk about the interfaces used to link these buffers to the DDR3 main memory or how much heat they generate. He did say that it was part of the system and could not be circumvented.
In comparison with the current four-core and six-core "Dunnington" Xeon 7400 processors, announced last September, Davis said that the Nehalem EX systems would support about twice the memory, would have more reliability features at both the chip and system level, would deploy 2.7 times as many threads and 1.5 times the cache memory, and would expand out across twice as many sockets in a single system image. Most significantly, the Nehalem EX systems, based on some internal Intel tests, will have up to nine times the memory bandwidth of those Dunnington machines and their crufty and slow front side bus architecture.
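Those ratios check out against the published per-socket specs. A quick sketch (Dunnington's 16 MB of L3 is not stated above but is consistent with the 1.5x cache figure; the six-core, non-HyperThreaded thread count gives the 2.7x):

```python
# Per-socket comparison, Dunnington (Xeon 7400) vs. Nehalem EX.
# Dunnington: up to 6 cores, no HyperThreading, 16 MB L3 (assumed spec).
# Nehalem EX: 8 cores, 2 threads per core, 24 MB L3.
dunnington = {"threads": 6, "l3_mb": 16}
nehalem_ex = {"threads": 8 * 2, "l3_mb": 24}

thread_ratio = nehalem_ex["threads"] / dunnington["threads"]
cache_ratio = nehalem_ex["l3_mb"] / dunnington["l3_mb"]

print(f"threads: {thread_ratio:.1f}x")   # 2.7x
print(f"L3 cache: {cache_ratio:.1f}x")   # 1.5x
```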