AMD reveals potent parallel processing breakthrough

Upcoming Kaveri processor will drink from shared-memory Holy Grail

AMD has released details on its implementation of The Next Big Thing in processor evolution, and in the process has unleashed the TNBT of acronyms: the AMD APU (CPU+GPU) HSA hUMA.

Before your eyes glaze over and you click away from this page, know that if this scheme is widely adopted, it could be of great benefit to both processor performance and developer convenience – and to you.

Simply put, AMD's heterogeneous Uniform Memory Access (hUMA) allows central processing units (CPUs) and graphics processing units (GPUs) – which AMD places on a single die in its accelerated processing units (APUs) – to seamlessly share the same memory in a heterogeneous system architecture (HSA). And that's a very big deal, indeed.

Why? Simple. CPUs are quite clever, speedy, and versatile when performing complex tasks with myriad branches, but are less well-suited for the massively parallel tasks at which GPUs excel. Unfortunately, CPUs and GPUs can't currently share the same data in memory.

In today's CPU-GPU computing schemes, when a CPU senses that a process upon which it is working might benefit from a GPU's muscle, it has to copy the relevant data from its own reservoir of memory into the GPU's – and when the GPU is finished with its tasks, the results need to be copied back into the CPU's memory stash before the CPU can complete its work.
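That round trip can be sketched as a toy model. This is an illustration only, running entirely on the CPU: the names `gpu_kernel` and `offload_with_copies` are hypothetical, and the "GPU memory" is just a second Python buffer standing in for a separate memory pool, not a real device.

```python
def gpu_kernel(buf):
    """Stand-in for a data-parallel GPU task: double every byte in place (mod 256)."""
    for i in range(len(buf)):
        buf[i] = (buf[i] * 2) % 256


def offload_with_copies(cpu_mem):
    """Copy data into a separate 'GPU' buffer, run the kernel, copy results back.

    Returns the number of bytes moved between the two memory pools.
    """
    bytes_copied = 0

    gpu_mem = bytearray(cpu_mem)   # copy in: CPU memory -> GPU memory
    bytes_copied += len(gpu_mem)

    gpu_kernel(gpu_mem)            # the GPU works on its own private copy

    cpu_mem[:] = gpu_mem           # copy out: GPU memory -> CPU memory
    bytes_copied += len(gpu_mem)

    return bytes_copied


data = bytearray([1, 2, 3, 4])
moved = offload_with_copies(data)
print(list(data), moved)  # [2, 4, 6, 8] 8
```

Every byte the kernel touches crosses between the two pools twice, once in each direction, and that traffic grows with the size of the data set.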

Needless to say, that back-and-forthing can consume a wasteful number of clock cycles – and that's the limitation that AMD's upcoming Kaveri APU, scheduled to appear in the second half of this year, will overcome.

AMD's hUMA architecture: comparison of memory systems in CPU, APU, and APU with heterogeneous system architecture

With hUMA, CPU and GPU memory is united in one cache-coherent space

The secret sauce that Kaveri will bring to the computing party is hUMA, a scheme in which both CPU and GPU can share the same memory stash and the data within it, saving all those nasty copying cycles. hUMA is cache-coherent, as well – both CPU and GPU have identical pictures of what's what in both physical memory and cache, so if the CPU changes something, the GPU knows it's been changed.
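As a loose software analogy, and not actual HSA or hUMA code, Python's built-in `memoryview` gives two handles onto one buffer without copying it. In the sketch below the names `gpu_kernel`, `shared_mem`, and `gpu_view` are hypothetical, and everything runs on the CPU; the point is only that both sides see the same bytes, so a write by one is immediately visible to the other.

```python
def gpu_kernel(buf):
    """Stand-in for a data-parallel GPU task: double every byte in place (mod 256)."""
    for i in range(len(buf)):
        buf[i] = (buf[i] * 2) % 256


shared_mem = bytearray([1, 2, 3, 4])   # one pool of memory
gpu_view = memoryview(shared_mem)      # the "GPU" handle: same bytes, no copy made

gpu_kernel(gpu_view)                   # "GPU" writes land directly in the shared pool
cpu_sees = list(shared_mem)            # the "CPU" reads the results with no copy back

shared_mem[0] = 100                    # the "CPU" changes something...
gpu_sees = gpu_view[0]                 # ...and the "GPU" handle sees the new value

print(cpu_sees, gpu_sees)  # [2, 4, 6, 8] 100
```

Real hardware needs cache-coherency logic to give the same guarantee across CPU and GPU caches; the analogy only shows why the programming model gets simpler once there is a single copy of the data.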

Importantly, hUMA's shared memory pool extends to virtual memory, as well, which resides far away – relatively speaking – on a system's hard drive or SSD. The GPU does need to ask the CPU to tell the system's operating system to fetch the required data from virtual memory, but at least it can get what it wants, when it wants.

AMD's hUMA architecture: uniform memory access

In a hUMA system, the GPU can access the entire memory space, virtual memory included

At this point, you might well be asking, "All well and good, but what's in it for me?" Glad you asked.

From a user's point of view, hUMA will make CPU-GPU mashups – in AMD parlance, APUs – more efficient and snappier. Better efficiency should improve battery life and make hUMA-compliant processors more amenable to tablets and handsets. Snappier performance means, well, snappier performance.

From a developer's point of view, hUMA should make it significantly easier to create apps that can exploit the individual powers of CPUs and GPUs – and, for that matter, other specialized cores such as video accelerators and DSPs, since there's no compelling reason that they should be forever locked out of hUMA's heterogeneous system architecture party.

Developers shouldn't have much trouble – if any – exploiting hUMA, since AMD says it will be compatible with "mainstream programming languages," meaning Python, C++, and Java, "with no need for special APIs."

Also, it's important to note that although AMD was the company to make the hUMA announcement and will be the first to release a hUMA-compatible chip with Kaveri, the specification will be published by the HSA Foundation, of which AMD is merely one of many members along with fellow cofounders ARM, Imagination Technologies, Samsung, Texas Instruments, Qualcomm, and MediaTek. Should some – all? – of these HSA Foundation members adopt the shared-memory architecture, hUMA goodness could spread far and wide.

In fact, hUMAfication already appears to be on the way – and not necessarily where you might have first expected. AMD is supplying a custom processor for Sony's upcoming PlayStation 4, and in an interview this week with Gamasutra, PS4 chief architect Mark Cerny said that the console would have a "supercharged PC architecture," and that "a lot of that comes from the use of the single unified pool of high-speed memory" available to both the CPU and GPU.

Sounds like hUMA, eh? ®
