Original URL: https://www.theregister.com/2001/10/17/freds_and_threads_knot_amds/

Freds and Threads knot AMD's Hammer

Waiting for The Beast

By Andrew Orlowski

Posted in Channel, 17th October 2001 06:40 GMT

MPF You know you're getting older when the Freds you meet at chip conferences start looking younger.

The Freds you expect to meet should look something like this: aged late forties, possibly early fifties; taciturn, but always ready with an anecdote about the 8-bit days when processors had no MMUs; grey haired, and possibly with a few silver streaks of soldering iron ingrained in their palms, much like the residuals of mercury that eventually turned old milliners into Mad Hatters.

AMD's Fred, Fred Weber, isn't like that at all. Dammit, he's our age, and at Microprocessor Forum he was everywhere at once.

When we caught up with him he had a wee bone to pick. He wanted to explain in some detail why AMD's NUMA-like x86-64 wouldn't be slow in the way that Intel's NUMA-like IA-64 (some of whose team he's poached) would be.

"You can call it NUMA-like, as long as you emphasize the '-like' more than the 'NUMA'," he told us.

You see, NUMA MPs, or cache-coherent non-uniform memory access multiprocessors, still have a bad name. Sun, which now also has a NUMA-like design in its new Hello-Pino UltraSPARC IIIi, said very much the same thing. They just want to be '-liked', too.

"You've got it all wrong," he didn't say.

No, what he said was:

"Even in a single CPU machine the difference between near and far memory accesses is far smaller than a cache miss."

NUMAs were dogged by the latency of having to fetch memory from a remote location. This is as good as fixed, said AMD (and Sun).

"The latency in a four processor system is 140ns, and in an eight-way is 160ns. These are very respectable latencies compared to ccNUMA systems today," he said.

"We have the memory controller running at the same frequency as the chip. We can pump 2.1 GB/sec through a four-way, but really it's 8 GB/sec. Bus systems can get 400 MB/sec. So this is a new architecture, and it's really the end of the bus." *

The OS now booting on Platform Four...

Given that Hammer will eventually supersede Athlon on the desktop, we enquired about the state of OS and tools support for x86-64.

Weber said that AMD was working with SuSE and others to provide a simulator on Linux, and that an alpha gcc compiler was ready. Sure, but what about The Beast?

"VirtuTech is an x86-64 simulator and on that, you can boot Windows XP," he said.

C'mon. We weren't falling for that kind of kidology. You can boot anything on a simulator - so was XP really running natively on x86-64?

"That's something Microsoft can tell you," he said. He wouldn't.

But um, hasn't Hammer support already been identified in Windows XP source code header files, as we've reported?

"Glad to see you've noticed!" he beamed.

Weber was equally unforthcoming about Solaris on x86-64, beyond saying that it was "very solid" and acknowledging there was no reason why the x86 version couldn't make it onto Hammer, too.

Weber dropped a fairly strong hint that future Hammer versions would support the multiple-cores-on-a-die approach, or CMP, used in POWER4, PA-RISC 8800 and MAJC.

"If you look closely you'll see the label 'CPU1', so it supports CMP already," he said. "But we're not announcing anything."

And threading?

"Threading is very interesting, but we've looked at all of these - CMP, SMT - and none of these is going to be an overnight sensation."

And with that, he was off.®

* Is it the end of the bus?

Or is the bus simply at the end of the line?

Having lived in both San Francisco, Ca. and Manchester, England, we're familiar with the bus stopping halfway to its destination, and the driver turning the lights out and wandering off to light a joint, leaving you to walk the rest of the way home. Maybe there's a metaphor in there, struggling to get out.
