Feeds

AMD's 'Revolution' will be televised ... if its CPU-GPU frankenchip Kaveri is a hit

Graphics and x86 cores are all just 'compute units' now for our games, videos and apps

Secure remote control for conventional and virtual desktops

On the die, memory matters

For one thing, Macri told The Reg in a sit-down after the group briefing, AMD has doubled the size of the branch target buffer in Kaveri over that of its predecessor, Richland.

"It also has one new cute feature in that when it misses in the [instruction] cache now, we kick off a prefetch," he said. "And prefetches are neat in that if they're 100 per cent accurate, then it's super-cool. Getting a memory reference out early helps the memory controller slide it into a place that's idle."

Improving the prefetching of instructions from RAM helps further alleviate the pain of an instruction cache miss: this happens when a core attempts to touch program code outside the cache, forcing it to stall while waiting for the memory controller to copy fresh instructions into the cache. However, by being a bit smarter, "when you shoot a prefetch out, it's like, 'hey, you can take a little longer'," he said.

Such tweaks have allowed Kaveri's designers to boost the CPU's overall instructions-per-clock (IPC) performance by helping out the memory subsystem – which, by the way, has a clock rate of 2400MHz, up from Richland's 2133MHz.

And speaking of memory, Kaveri has two 64-bit, fully independent memory channels. "We do stripe across them," Macri told us, "especially for the memory that's allocated for high-bandwidth needs like graphics."

Kaveri – Graphics Core Next

AMD's Graphics Core Next finally makes it onto a processor die along with CPU cores (click to enlarge)

Compared to discrete GPUs, a 128-bit-wide memory bus might seem – well, does seem – a bit paltry when compared with AMD's most powerful discrete GPU, which has a 512-bit bus. But as Macri points out in defense of the narrower path, Kaveri has just eight GPU cores to feed, whereas the hefty discrete-memory GPUs have more.

"We are a little light on memory bandwidth for graphics," he said, "but we're perfect, I think, on the compute side – or very close to being very well balanced on the compute side."

There are other reasons for the narrower memory bus, not the least being cost – both in terms of package cost and die real estate. A 64-bit memory channel uses about 118 pins for data, address, control, and clocks, he told us, and 0.8mm2 per byte is a good rule of thumb for additional die size. So if you wanted to add another 64-bit memory channel, you'd need to add about 6.5 to 7 mm2 of die real estate and more than 100 extra pins to the package, driving up both cost and size.

"Things just start adding up so that you can't afford it," he said. "And then in small-form factors, there's only enough room in here to have a 128-bit memory bus. And we really optimized Kaveri around ensuring that it can go into a small-form factor all the way up to the desktop. If we were only designing for desktop, I would have probably added another memory channel."

Macri told us that Kaveri's designers did "as much as possible to utilize that memory bandwidth as well as possible." For example, at the back end of the graphics pipes there's a local data share [PDF] between the different stages to reduce the need for having to go off-chip in search of data. In addition, he said, "GCN has a nice array of on-die buffers: L1 caches, L2 caches, local data stores. These all help."

Providing a secure and efficient Helpdesk

More from The Register

next story
TEEN RAMPAGE: Kids in iPhone 6 'Will it bend' YouTube 'prank'
iPhones bent in Norwich? As if the place wasn't weird enough
iPAD-FONDLING fanboi sparks SECURITY ALERT at Sydney airport
Breaches screening rules cos Apple SCREEN ROOLZ, ok?
Crouching tiger, FAST ASLEEP dragon: Smugglers can't shift iPhone 6s
China's grey market reports 'sluggish' sales of Apple mobe
Apple's new iPhone 6 vulnerable to last year's TouchID fingerprint hack
But unsophisticated thieves need not attempt this trick
The British Museum plonks digital bricks on world of Minecraft
Institution confirms it's cool with joining the blocky universe
How the FLAC do I tell MP3s from lossless audio?
Can you hear the difference? Can anyone?
prev story

Whitepapers

Forging a new future with identity relationship management
Learn about ForgeRock's next generation IRM platform and how it is designed to empower CEOS's and enterprises to engage with consumers.
Storage capacity and performance optimization at Mizuno USA
Mizuno USA turn to Tegile storage technology to solve both their SAN and backup issues.
The next step in data security
With recent increased privacy concerns and computers becoming more powerful, the chance of hackers being able to crack smaller-sized RSA keys increases.
Security for virtualized datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.
A strategic approach to identity relationship management
ForgeRock commissioned Forrester to evaluate companies’ IAM practices and requirements when it comes to customer-facing scenarios versus employee-facing ones.