AMD's 'Revolution' will be televised ... if its CPU-GPU frankenchip Kaveri is a hit

Graphics and x86 cores are all just 'compute units' now for our games, videos and apps

On the die, memory matters

For one thing, Macri told The Reg in a sit-down after the group briefing, AMD has doubled the size of the branch target buffer in Kaveri over that of its predecessor, Richland.

"It also has one new cute feature in that when it misses in the [instruction] cache now, we kick off a prefetch," he said. "And prefetches are neat in that if they're 100 per cent accurate, then it's super-cool. Getting a memory reference out early helps the memory controller slide it into a place that's idle."

Improving the prefetching of instructions from RAM helps further alleviate the pain of an instruction cache miss, which happens when a core attempts to touch program code outside the cache, forcing it to stall while waiting for the memory controller to copy fresh instructions into the cache. A prefetch, by contrast, is less urgent than a demand fetch: "when you shoot a prefetch out, it's like, 'hey, you can take a little longer'," he said.

Such tweaks have allowed Kaveri's designers to boost the CPU's overall instructions-per-clock (IPC) performance by helping out the memory subsystem – which, by the way, has a clock rate of 2400MHz, up from Richland's 2133MHz.

And speaking of memory, Kaveri has two 64-bit, fully independent memory channels. "We do stripe across them," Macri told us, "especially for the memory that's allocated for high-bandwidth needs like graphics."
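To make the idea of striping across two channels concrete, here is a minimal Python sketch of address interleaving. The 256-byte stripe size and the function name are assumptions chosen for illustration; the article does not say what granularity Kaveri's memory controller actually uses.

```python
from typing import Tuple

def channel_and_offset(addr: int, stripe_bytes: int = 256,
                       channels: int = 2) -> Tuple[int, int]:
    """Map a physical address to (channel, offset within that channel).

    Hypothetical illustration only: stripe_bytes is an assumed value,
    not a figure from AMD.
    """
    stripe_index = addr // stripe_bytes        # which stripe the address is in
    channel = stripe_index % channels          # stripes alternate between channels
    local_stripe = stripe_index // channels    # stripe number inside that channel
    offset = local_stripe * stripe_bytes + addr % stripe_bytes
    return channel, offset

# Consecutive stripes land on alternating channels, so a long sequential
# read (a frame buffer, say) keeps both channels busy at once.
for addr in (0, 256, 512, 768):
    print(addr, channel_and_offset(addr))
```

With this kind of mapping, a large contiguous buffer is spread evenly over both 64-bit channels, which is why striping suits high-bandwidth clients such as graphics.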

Kaveri – Graphics Core Next

AMD's Graphics Core Next finally makes it onto a processor die along with CPU cores

A 128-bit-wide memory bus might seem – well, does seem – a bit paltry next to AMD's most powerful discrete GPU, which has a 512-bit bus. But as Macri points out in defense of the narrower path, Kaveri has just eight GPU cores to feed, whereas the hefty discrete GPUs, with their own dedicated memory, have more.

"We are a little light on memory bandwidth for graphics," he said, "but we're perfect, I think, on the compute side – or very close to being very well balanced on the compute side."
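For a rough sense of scale, the theoretical peak bandwidth of the memory bus can be worked out from its width and transfer rate. The sketch below assumes that the 2400MHz and 2133MHz figures quoted earlier are effective transfer rates (millions of transfers per second, as in DDR3-2400 naming), and that Richland also uses a 128-bit bus; both are assumptions, and real-world throughput will be lower than these peaks.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec: float, bus_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes one full-width transfer per effective clock; illustrative only.
    """
    bytes_per_transfer = bus_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9

kaveri = peak_bandwidth_gb_s(2400)    # two 64-bit channels = 128 bits
richland = peak_bandwidth_gb_s(2133)  # assumed to have the same bus width

print(f"Kaveri:   {kaveri:.1f} GB/s")
print(f"Richland: {richland:.1f} GB/s")
```

Under these assumptions the peak works out to about 38.4 GB/s for Kaveri against roughly 34.1 GB/s for Richland, shared between the CPU cores and the GPU.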

There are other reasons for the narrower memory bus, not the least being cost – both in terms of package cost and die real estate. A 64-bit memory channel uses about 118 pins for data, address, control, and clocks, he told us, and 0.8mm² per byte is a good rule of thumb for additional die size. So if you wanted to add another 64-bit memory channel, you'd need to add about 6.5 to 7mm² of die real estate and more than 100 extra pins to the package, driving up both cost and size.
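The arithmetic behind that estimate can be sketched in a few lines of Python. The two constants come from Macri's figures as quoted above; the function name and the linear scaling of pins with channel width are assumptions made for illustration.

```python
MM2_PER_BYTE = 0.8            # rule of thumb quoted by Macri
PINS_PER_64BIT_CHANNEL = 118  # data, address, control, and clocks

def extra_channel_cost(channel_bits: int = 64):
    """Return (extra die area in mm^2, extra package pins) for one more channel.

    Assumes pin count scales linearly with width; a rough estimate only.
    """
    bytes_wide = channel_bits // 8
    area_mm2 = bytes_wide * MM2_PER_BYTE
    pins = PINS_PER_64BIT_CHANNEL * channel_bits // 64
    return area_mm2, pins

area, pins = extra_channel_cost()
print(f"Extra area: about {area:.1f} mm^2, extra pins: about {pins}")
```

The raw product is 8 bytes × 0.8mm² = 6.4mm², slightly under the 6.5 to 7mm² range given in the text, so the quoted range presumably includes some margin beyond the bare rule of thumb.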

"Things just start adding up so that you can't afford it," he said. "And then in small-form factors, there's only enough room in here to have a 128-bit memory bus. And we really optimized Kaveri around ensuring that it can go into a small-form factor all the way up to the desktop. If we were only designing for desktop, I would have probably added another memory channel."

Macri told us that Kaveri's designers did "as much as possible to utilize that memory bandwidth as well as possible." For example, at the back end of the graphics pipes there's a local data share between the different stages to reduce the need for having to go off-chip in search of data. In addition, he said, "GCN has a nice array of on-die buffers: L1 caches, L2 caches, local data stores. These all help."
