AMD: 'Bobcat' smaller, faster than Intel's Atom
Netbooks. Not servers. For now
The Bulldozer cores will also get that performance boost from a new twist on the turbo functions used by Intel and IBM in their respective Core/Xeon and Power6/7 processors, which allow a core to crank up its clock speed when other cores on the chip are not being used. Fruehe would not say how this turbo function will work, but said it would be more elegant than what has been done to date and that it will work dynamically, boosting performance or cutting it back as conditions on the system dictate.
Fruehe reiterated AMD's disdain for HyperThreading and any kind of simultaneous multithreading, contending that its approach of sharing certain components and yet having two real integer and floating point units (instead of virtual ones) was better for a lot of workloads. "Having 16 threads running on 16 cores is better than trying to cram 16 threads onto eight cores," Fruehe says emphatically.
The logical layout of the Valencia Opteron processor.
When pressed about how far this modular approach with the Bulldozer cores can go, Fruehe was not giving away much information, but did confirm that scaling the modules beyond 16 cores is "doable." It had better be, and on 32 nanometer wafer baking processes, if AMD wants to stay on the Moore's Law curve and do a better job of keeping pace with rival Intel.
The first Bulldozers - presumably the high-end Interlagos parts - will sample at the end of this year to OEM partners in the server and workstation rackets, says Fruehe, and as the year goes on and AMD gets a better sense of how the 32 nanometer processes are working out at GlobalFoundries, it will provide some more precise launch dates for the Interlagos and Valencia Bulldozers. It seems likely that the pricier Interlagos parts that plug into the G34 sockets will come first, followed a quarter or so later by the Valencia parts that plug into the C32 sockets.
The "Zambezi" variant of the Bulldozer chip, aimed at the enthusiast desktop PC space as El Reg reported back in November 2009, is expected in 2011 as well. The Zambezi part is expected to come with four or eight cores and fit into an AM3 socket.
With the Bobcat cores for notebooks and netbooks, AMD is taking a K8 core and tweaking the heck out of it. Greg Hoepper, the corporate vice president of design engineering who has managed the Bobcat design, says that the Bobcat core (which is a true, isolated core that does not share components) is "quite small" and that you could, in theory, put an "enormous number of these on a single die" if you wanted to. The early Bobcat implementations will put two cores on a die, with core counts going up from there.
While Hoepper is not giving out feeds and speeds on the chips, he did say this: "Bobcat is smaller than a single core Atom chip, and it has higher performance."
The Bobcat may be based heavily on the K8 core, but it doesn't cut any corners and has fully out-of-order execution of its instructions. The core sports a new set of logic for branch prediction. It supports the SSE1-3 SIMD instructions as well as the AMD-V virtualization extensions and the full AMD64 64-bit instruction set.
The Bobcat chip has 32 KB of L1 instruction cache that sits in front of the fetch and decode units. Below that are the integer, floating point, and address schedulers. The integer unit has two pipes, a load unit, a store unit, and 32 KB of L1 data cache, while the floating point unit sits off to the side. The integer and floating point units share an on-chip L2 cache.