AMD's 'Revolution' will be televised ... if its CPU-GPU frankenchip Kaveri is a hit

Graphics and x86 cores are all just 'compute units' now for our games, videos and apps

On the die, memory matters

For one thing, Macri told The Reg in a sit-down after the group briefing, AMD has doubled the size of the branch target buffer in Kaveri over that of its predecessor, Richland.

"It also has one new cute feature in that when it misses in the [instruction] cache now, we kick off a prefetch," he said. "And prefetches are neat in that if they're 100 per cent accurate, then it's super-cool. Getting a memory reference out early helps the memory controller slide it into a place that's idle."

Improving the prefetching of instructions from RAM helps alleviate the pain of an instruction cache miss, which happens when a core tries to run program code that is not in the cache and has to stall while the memory controller copies fresh instructions in. A prefetch is less urgent than the miss that triggered it, which gives the memory controller some scheduling slack: "when you shoot a prefetch out, it's like, 'hey, you can take a little longer'," he said.
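To see why a prefetch fired on a miss can pay off, here is a toy model in Python. It is a sketch under loud assumptions, not a description of Kaveri's hardware: Macri did not say what gets prefetched, so the code assumes the simplest scheme (fetch the next sequential cache line), a 64-byte line, a tiny LRU cache, and a prefetch that always arrives before it is needed. The `ICache` class and `count_misses` helper are hypothetical names made up for this illustration.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size, not a figure from the article


class ICache:
    """Tiny fully associative LRU instruction cache, tracked per cache line."""

    def __init__(self, capacity_lines, prefetch_on_miss):
        self.capacity = capacity_lines
        self.prefetch_on_miss = prefetch_on_miss
        self.lines = OrderedDict()  # line number -> None, oldest first
        self.demand_misses = 0

    def _fill(self, line):
        # Insert the line (or refresh it if present) as most recently used.
        self.lines[line] = None
        self.lines.move_to_end(line)
        while len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used

    def fetch(self, address):
        """Fetch one instruction; return True on a hit, False on a miss."""
        line = address // LINE_BYTES
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        self.demand_misses += 1
        self._fill(line)
        if self.prefetch_on_miss:
            # Assumed policy: a miss also kicks off a next-line prefetch.
            self._fill(line + 1)
        return False


def count_misses(addresses, prefetch_on_miss, capacity_lines=8):
    """Run an address trace through the cache and count demand misses."""
    cache = ICache(capacity_lines, prefetch_on_miss)
    for address in addresses:
        cache.fetch(address)
    return cache.demand_misses


# Straight-line code: 16 cache lines of 4-byte instructions, no branches.
straight_line = range(0, LINE_BYTES * 16, 4)

print("misses without prefetch:", count_misses(straight_line, False))
print("misses with prefetch:   ", count_misses(straight_line, True))
```

On straight-line code every other line is already waiting when the core reaches it, so the number of stalls is halved in this model. Real code branches, and a prefetch that guesses wrong wastes bandwidth, which is presumably why Macri hedged with "if they're 100 per cent accurate".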

Such tweaks have allowed Kaveri's designers to boost the CPU's overall instructions-per-clock (IPC) performance by helping out the memory subsystem – which, by the way, has a clock rate of 2400MHz, up from Richland's 2133MHz.

And speaking of memory, Kaveri has two 64-bit, fully independent memory channels. "We do stripe across them," Macri told us, "especially for the memory that's allocated for high-bandwidth needs like graphics."
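Striping across two channels can be pictured as alternating fixed-size chunks of the address space between them. The sketch below shows one common way to do that; the 256-byte stripe size and the `channel_for` and `bytes_per_channel` helpers are assumptions for illustration only, since Macri gave no details of how Kaveri interleaves.

```python
NUM_CHANNELS = 2    # from the article: two independent 64-bit channels
STRIPE_BYTES = 256  # assumed interleave granularity, not an AMD figure


def channel_for(address):
    """Map a physical address to (channel, offset within that channel)."""
    stripe = address // STRIPE_BYTES
    channel = stripe % NUM_CHANNELS
    offset = (stripe // NUM_CHANNELS) * STRIPE_BYTES + address % STRIPE_BYTES
    return channel, offset


def bytes_per_channel(start, length):
    """Count how many bytes of a contiguous buffer land on each channel."""
    totals = [0] * NUM_CHANNELS
    for address in range(start, start + length):
        totals[channel_for(address)[0]] += 1
    return totals


# A 4KB buffer, such as part of a frame buffer, splits evenly in this model.
print(bytes_per_channel(0, 4096))
```

The point of the scheme is that a large sequential read, which is typical of graphics traffic, keeps both channels busy at once instead of queuing on one.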

AMD's Graphics Core Next finally makes it onto a processor die along with CPU cores

A 128-bit-wide memory bus might seem – well, does seem – a bit paltry next to AMD's most powerful discrete GPU, which has a 512-bit bus. But as Macri points out in defense of the narrower path, Kaveri has just eight GPU cores to feed, whereas the hefty discrete-memory GPUs have more.
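For a rough sense of scale, the back-of-the-envelope sum below turns bus width and memory speed into theoretical peak bandwidth. It assumes the 2400MHz and 2133MHz figures quoted earlier are effective transfer rates (one transfer per bus width per cycle, as in DDR3-2400 naming), and it applies the same 128-bit width to both rates purely for comparison. The article gives no memory speed for the 512-bit discrete card, so only the width ratio is compared. The function name is made up for this sketch.

```python
def peak_bandwidth_gb_s(bus_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s, with 1 GB taken as 10**9 bytes."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


BUS_BITS = 128  # two 64-bit channels
GPU_CORES = 8   # from the article

kaveri_peak = peak_bandwidth_gb_s(BUS_BITS, 2400)
older_rate_peak = peak_bandwidth_gb_s(BUS_BITS, 2133)  # same width assumed
per_gpu_core = kaveri_peak / GPU_CORES
width_ratio = 512 / BUS_BITS  # discrete GPU bus width versus Kaveri's

print(f"128-bit bus at 2400: {kaveri_peak:.1f} GB/s peak")
print(f"128-bit bus at 2133: {older_rate_peak:.1f} GB/s peak")
print(f"share per GPU core:  {per_gpu_core:.1f} GB/s")
print(f"discrete bus is {width_ratio:.0f}x wider")
```

Under those assumptions the bus tops out at about 38.4GB/s, or 4.8GB/s per GPU core if the CPU cores took nothing, which they do. These are ceilings, not measured throughput.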

"We are a little light on memory bandwidth for graphics," he said, "but we're perfect, I think, on the compute side – or very close to being very well balanced on the compute side."

There are other reasons for the narrower memory bus, not the least being cost – both in terms of package cost and die real estate. A 64-bit memory channel uses about 118 pins for data, address, control, and clocks, he told us, and 0.8mm² per byte is a good rule of thumb for additional die size. So if you wanted to add another 64-bit memory channel, you'd need to add about 6.5 to 7mm² of die real estate and more than 100 extra pins to the package, driving up both cost and size.
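Macri's rule of thumb is easy to check with a few lines of arithmetic. The sketch below uses only the two figures he quoted, the area per byte of bus width and the pin count per 64-bit channel; the function name is hypothetical.

```python
MM2_PER_BYTE = 0.8            # Macri's rule of thumb for die area
PINS_PER_64BIT_CHANNEL = 118  # data, address, control and clock pins
BYTES_PER_CHANNEL = 64 // 8   # a 64-bit channel is 8 bytes wide


def cost_of_extra_channels(channels):
    """Return (extra die area in mm^2, extra package pins)."""
    area_mm2 = channels * BYTES_PER_CHANNEL * MM2_PER_BYTE
    pins = channels * PINS_PER_64BIT_CHANNEL
    return area_mm2, pins


area, pins = cost_of_extra_channels(1)
print(f"one extra 64-bit channel: {area:.1f} mm^2 of die, {pins} pins")
```

The rule of thumb alone gives 6.4mm² for one extra channel, a shade under the 6.5 to 7mm² quoted. The article does not say what accounts for the difference.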

"Things just start adding up so that you can't afford it," he said. "And then in small-form factors, there's only enough room in here to have a 128-bit memory bus. And we really optimized Kaveri around ensuring that it can go into a small-form factor all the way up to the desktop. If we were only designing for desktop, I would have probably added another memory channel."

Macri told us that Kaveri's designers did "as much as possible to utilize that memory bandwidth as well as possible." For example, at the back end of the graphics pipes there's a local data share between the different stages to reduce the need for having to go off-chip in search of data. In addition, he said, "GCN has a nice array of on-die buffers: L1 caches, L2 caches, local data stores. These all help."
