Intel to tell all about roaring 96GB/s QuickPath interconnect
Faster than Opteron sockets on steroids
You horrible cynics out there looked at Intel's mushy Montvale chip and scoffed. "That's the end of the Itanic."
Ah, but there's a fresh monster on the horizon known as Tukwila, and systems based on that puppy should fly if its brand new QuickPath interconnect arrives as expected. Next week Intel will disclose details on QuickPath at the International Solid State Circuits Conference in San Francisco. [It's like the Folsom Street Fair - Google at your own risk - but with more brain and less testicle torture - Ed.]
What will Intel say?
Well, according to the conference program, showgoers will hear about:
An Itanium processor is implemented in 8M 65nm CMOS and measures 21.5×32.5mm². The processor has four dual-threaded cores, a system interface and 30MB of cache. Quickpath high-speed links enable peak processor-to-processor bandwidth of 96GB/s and peak memory bandwidth of 34GB/s.
We'll wait to hear a bit more from Intel before squaring QuickPath - formerly known as CSI - against Hypertransport 3.0, which can aggregate 41.6GB/s in two directions.
CSI should ship with the four-core Tukwila chip in 2008.
QuickPath whiz and analyst David Kanter is more willing to tackle the Hypertransport debate based on information he uncovered last year.
He tells us, "It looks like Tukwila's QPI links are running at 4.8GHz, which is about the same speed as Hypertransport 3 (maximum speed of 5.2GHz). Realistically, Intel will pack quite a bit more bandwidth on - because they are using 4+1 QPI links (4 to talk to other processors and 2 half links for I/O), compared to the 4 HT3 links that AMD will be using in future MPUs (that's right, no HT3 in the MP version of Barcelona). What's most impressive about Tukwila is the memory bandwidth - it has the same bandwidth as a full 4 socket Opteron system, all in one socket.
"Will Intel finally catch up with arch-rival IBM's POWER6? This is probably one of Intel's better chances since IBM took the lead with the Power5. It looks like a single Tukwila will probably have about the same performance on major benchmarks as a single Power6."
Dude? 30MB of cache? Maybe this is like the Folsom Street Fair after all.®
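For readers who want to check the headline number: the 96GB/s figure is consistent with Kanter's link count under a few assumptions that the conference program does not spell out. They are that "4.8GHz" means 4.8 billion transfers per second, that each full link carries 2 bytes of payload per transfer in each direction, and that the two half links add up to one full link. A minimal sketch of that arithmetic in Python follows; the function and constant names are ours and purely illustrative.

```python
# Back-of-envelope check of the 96GB/s peak figure.
# ASSUMPTIONS (not stated in the article): 2 payload bytes per transfer
# per direction on a full link, and a half link counts as 0.5 of a full link.
BYTES_PER_TRANSFER = 2      # assumed payload width of one full link
DIRECTIONS = 2              # links are counted in both directions


def link_bandwidth_gb_s(rate_gt_s: float) -> float:
    """Peak bidirectional bandwidth of one full link, in GB/s."""
    return rate_gt_s * BYTES_PER_TRANSFER * DIRECTIONS


def aggregate_bandwidth_gb_s(rate_gt_s: float, full_links: int, half_links: int) -> float:
    """Peak bandwidth summed over all links, half links weighted at 0.5."""
    return link_bandwidth_gb_s(rate_gt_s) * (full_links + 0.5 * half_links)


# Kanter's numbers: 4.8 (read here as GT/s), 4 full links, 2 half links.
total = aggregate_bandwidth_gb_s(4.8, full_links=4, half_links=2)
print(f"Aggregate peak: {total:.1f} GB/s")
```

Under these assumptions each full link works out to 19.2GB/s in both directions combined, and five full-link equivalents give the quoted peak. Whether Intel counts the I/O half links this way is something the ISSCC talk would have to confirm.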
COMMENTS
CPUs need low latency, not high bandwidth
Fast CPUs need their interconnect to provide the data they need, fast (they're probably stalled waiting for that data). While too-low bandwidth gets in the way, the critical factor is more often the time until the first datum you asked for arrives ("latency"). Prof Roger Needham used to say that "bandwidth can be made by man, but God makes latency".
Nobody ever quotes the minimum latency which is practically achievable with their interconnect, because as the interconnects get more amazing, it generally gets worse... So this bandwidth claim, like almost all other such, is irrelevant.
BTW, optical enthusiasts should probably try to convince us that four electric/optical transitions in their favourite path don't add delay.
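The latency argument above can be put into rough numbers with a first-order model: time to fetch = latency + size / bandwidth. The Python sketch below is illustrative only. The 100ns latency and the 64-byte cache line are assumed values, not figures from Intel or from the article, and `fetch_time_ns` is a hypothetical helper name.

```python
def fetch_time_ns(size_bytes: float, latency_ns: float, bandwidth_gb_s: float) -> float:
    """First-order transfer time: fixed latency plus serialization time.

    1 GB/s equals 1 byte per nanosecond, so size / bandwidth is already in ns.
    """
    return latency_ns + size_bytes / bandwidth_gb_s


CACHE_LINE = 64        # bytes, assumed
LATENCY_NS = 100.0     # assumed round-trip latency, hypothetical
BANDWIDTH = 96.0       # GB/s, the peak figure quoted in the article

baseline = fetch_time_ns(CACHE_LINE, LATENCY_NS, BANDWIDTH)
double_bw = fetch_time_ns(CACHE_LINE, LATENCY_NS, 2 * BANDWIDTH)
half_lat = fetch_time_ns(CACHE_LINE, LATENCY_NS / 2, BANDWIDTH)

print(f"baseline:         {baseline:.2f} ns")
print(f"double bandwidth: {double_bw:.2f} ns")
print(f"half latency:     {half_lat:.2f} ns")
```

With these assumed values, doubling the bandwidth trims well under one per cent off a single cache-line fetch, while halving the latency cuts it roughly in half. The model ignores queuing and overlapping requests, so it says nothing about sustained throughput.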
The 32nm 100GbE chip will be scalable like PCIe channels
A 4-pair, RJ45-like optical plug will have 4 paths, each path doing 100Gbit over optical, but only 40Gbit over copper.
Picture a completely changed computer architecture where you modularize memory, CPU(s), GPGPUs, etc. and separate them from the motherboard. The motherboard has 1x, 4x, 8x, etc. 100GbE optical ports (on a new high-density 8-fiber connector). Computer components start coming in 5.25-inch modules (quarter height, half height, full height or double height; half or full length).
Want to add 6 GPGPU cards? No need for multiple PCIe 2.0 16x slots, you just add them on an optical bus. Run out of optical ports and you get an off-the-shelf 100GbE switch module (made with the same inexpensive 32nm chips). Want to scale your system up? You buy another computer or host system and interconnect them with an 8x 100GbE interconnect.
If a group of vendors do come together and develop a low-cost, mass-produced 32nm 100GbE multipath chip that will someday come close to approx 5x the cost of a 1GbE chipset, all this will be possible.
The problem is, Intel and others don't want this to happen. It means when you invest in your 2010 computer, it doesn't lose its functionality in 2015; an enthusiast just keeps buying optically interconnected CPU / memory / GPU modules and keeps plugging them in to create a vast resource pool that is created by virtual machine host software.
With the advent of mainstream 100GbE, and graphics subsystems interconnected into a single switchable data backbone, you no longer need VGA/DVI/HDMI/DisplayPort - instead you'll have tiny display modules with optical ports that display any AV source in your network on ANY type of display.
People say optical will never make it into the home. All it takes is a low-cost chip and mainstream adoption into medium-range parts. USB 3.0 will have an optical channel in the cable - there is little reason why, in a few years, complete product lines can't interconnect this way too.
No one is going to bring us an optical PC evolution other than IBM +partners as far as I see it. Few others see the potential of a scaling 100GbE based technology brought down to the home market. Everyone else is thinking only enterprise high $ paying customers.
RE: Christopher E. Stith
"...They can't attack AMD in the segment the Itanium actually occupies, because AMD doesn't field any processors in that segment. IBM and Sun do, but where are the comparisons to Power, UltraSparc, and Niagara? Are they omitted because the Itanium is not as impressive against those?...."
Well, actually the whole UNIX segment is under attack from cheaper x86 kit eating upwards into their space. Applications that traditionally needed a hulking server can now be run on 4-way Xeons or Opterons. So the UNIX vendors need to defend against this by showing their enterprise servers are going to be cost-competitive against x86 kit, so Intel is showing how Itanium will be.
The article does mention Tukzilla (thanks Mr Morley, I like that one!) as comparable to Power6. Please explain why you would need to compare it to UltraSPANKed; the SPARC chips are so far behind performance-wise that the old PA-RISC chips caned them. Likewise Niagara, which is only good at multi-threaded apps using small threads such as webserving - it barfs doing serious work like Oracle, and the licensing costs would kill the idea before even the rubbish performance. In the true enterprise space, Tukzilla's only competition is going to be Power6.
