Deep inside AMD's master plan to topple Intel

Back to the top on a radical GPU

The heterogeneous future

GCN's goal is twofold: simplify the programming model and make the GPU core more capable of participating in what AMD, ARM, Microsoft and others call "heterogeneous computing" – that is, distributing work among CPU, GPU, and more-specialized cores, with each element contributing what it does best.

The major change in the GCN's shader array is that it includes what AMD calls the compute unit (CU), and what Demers calls the "cellular basis" of the design. A CU takes over the chores of the previous architecture's VLIW-based SIMD (single-instruction-stream, multiple-data-stream) elements.

VLIW is gone. The GCN's CUs are fundamentally vector cores containing multiple SIMD structures, programmed on a per-lane basis. Four groups of wavefronts are run in each CU core per cycle. "It's a vector core where each lane is programmed independently, and there's a single stream coming in and broadcast all over those things," Demers says. "You program it in a scalar way, and it operates in a vector mode."
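To make "program it in a scalar way, operate in a vector mode" a little more concrete, here is a toy Python sketch. It is not AMD's programming interface: the names `per_lane_kernel` and `run_wavefront` are invented for illustration, and the wavefront size of 64 lanes is an assumption, since the presentation as reported here gives no figure. The kernel is written as if for a single work-item, and the runner broadcasts that one instruction stream across every lane.

```python
from typing import Callable, List

WAVEFRONT_SIZE = 64  # assumption: lanes per wavefront, not stated in the article


def per_lane_kernel(lane_id: int, a: float, b: float) -> float:
    """Scalar-looking code: written as if it handled one work-item only."""
    if lane_id % 2 == 0:
        return a + b  # even lanes add
    return a * b      # odd lanes multiply


def run_wavefront(
    kernel: Callable[[int, float, float], float],
    a: List[float],
    b: List[float],
) -> List[float]:
    """Broadcast one instruction stream (the kernel) across all lanes."""
    if not (len(a) == len(b) == WAVEFRONT_SIZE):
        raise ValueError("inputs must supply one value per lane")
    return [kernel(lane, a[lane], b[lane]) for lane in range(WAVEFRONT_SIZE)]


a = [float(i) for i in range(WAVEFRONT_SIZE)]
b = [2.0] * WAVEFRONT_SIZE
result = run_wavefront(per_lane_kernel, a, b)
```

The sketch runs the lanes one after another and lets each take its own branch freely. Real hardware executes the lanes in lockstep and has to do extra work when they diverge, which this toy model does not attempt to show.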

Simply put, a CU might be considered to be a smart VLIW/SIMD structure. In the VLIW world, you'd have to rely on the compiler to load the core correctly and efficiently. If something changes in the instruction stream, the VLIW is too dumb to modify its workload, and pipes might remain unfilled with data, wasting cycles.

As you might guess, that makes VLIW perfectly fine for graphics, where predictability is high, but crappy for compute, where dependencies can and do change at a moment's notice – even if that "moment" is a billionth of a second. Although the CU must work wavefront by wavefront – it's not an out-of-order mind-reader – it can move workloads around radically more nimbly than VLIW.
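The unfilled-pipes problem can be sketched in a few lines of Python. This is a deliberately naive model, not AMD's compiler: `pack_bundles` and `utilization` are hypothetical helpers, the bundle width of four slots is an assumption chosen for illustration, and the packer works strictly in program order without the reordering a real compiler would attempt. An operation may share a bundle only with operations it does not depend on.

```python
from typing import Dict, List, Set

VLIW_WIDTH = 4  # assumption: issue slots per bundle, chosen for illustration


def pack_bundles(deps: Dict[str, Set[str]], order: List[str]) -> List[List[str]]:
    """Greedy, in-order static packing.

    An op joins the current bundle only if every input it needs was
    produced by an earlier, already-closed bundle and a slot is free.
    """
    bundles: List[List[str]] = []
    done: Set[str] = set()
    current: List[str] = []
    for op in order:
        ready = deps[op] <= done
        if current and (not ready or len(current) == VLIW_WIDTH):
            bundles.append(current)  # close the bundle, filled or not
            done.update(current)
            current = []
        current.append(op)
    if current:
        bundles.append(current)
    return bundles


def utilization(bundles: List[List[str]]) -> float:
    """Fraction of issue slots that actually hold an operation."""
    return sum(len(b) for b in bundles) / (len(bundles) * VLIW_WIDTH)


# Graphics-like work: four independent operations.
independent = {"a": set(), "b": set(), "c": set(), "d": set()}
# Compute-like work: each operation needs the previous result.
chain = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}}

order = ["a", "b", "c", "d"]
full = pack_bundles(independent, order)
sparse = pack_bundles(chain, order)
```

With these inputs the independent operations fit into a single full bundle, while the dependent chain needs four bundles carrying one operation each, leaving three-quarters of the slots empty.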

Core reasoning

This versatility is the – pardon the pun – core reason for the GCN: AMD is planning for a heterogeneous world, in which GPUs are increasingly equal compute partners with CPUs.

AMD Fusion Summit 2011 keynote presentation slide: 'Evolution of AMD's Graphics Core, and Preview of Graphics Core Next'

Are the GCN and its CUs a MIMD, SIMD, or SMT architecture? Yes

The CUs can work in virtual space, Demers says, and they'll support the x86 64-bit virtual address space – more on that later. Also, the CUs are supported by a much larger L1 data cache than was in the previous architecture. The cache also has what Demers calls "a significant amount of bandwidth," and is supported by its own control system.

Previous AMD GPU architectures have had what the company has called "hidden fixed-function with hidden state". As examples of such fixed functions, Demers identifies "program counter advancements, and things such as that – limited functionality."

Help with the housekeeping

The GCN moves beyond hidden fixed functions with the addition of a fully observable scalar processor, which frees the CUs from simple tasks – quick math functions, for example, and housekeeping. "It's a processor in its own right," says Demers, and it's responsible for such common code as branching code and common pointers. A vector unit could also handle such common-code chores, but as Demers explains: "The scalar coprocessor helps it out, and offloads those capabilities."

Observability of the CUs and the scalar processor, and support for the x86 virtual space – along with the fact that, Demers says, "you can load the PC from memory or from a register and do all kinds of math" – opens up such C++ features as virtual functions, recursion, and x86 dynamically linked libraries. "All of these become a native thing that this guy can support," he says.

Shrinking processes enable more stuff to be stuffed on a chip – so let's add a scalar processor

The processing capability boosted by a host of compute units is all well and good, but only if they can be fed the right data to munch on at the right time. To this end, the GCN architecture allows for multiple command streams from multiple applications, each with different priorities and the ability to reserve CUs for themselves.

As an example of this capability, Demers suggests the interaction of your operating system's user interface and an app. "You can have your GUI running at one priority level, and you can set that high, and you can guarantee some amount of compute units always available for it. But then your big background applications for transcode can be running at a lower priority," he says, "and you will still have a great quality of service [QoS] – there's no more skipping mouse when you do a big job, because the big job is running in a separate queue."
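Demers' GUI-versus-transcode example can be mocked up as a small allocator. Everything here is an assumption made for illustration: the `CommandQueue` fields, the `allocate` function, the eight-CU total, and the policy itself, in which reserved CUs are held back for their owner even when idle and the rest are shared out in priority order. AMD did not describe its actual scheduling policy at this level of detail.

```python
from dataclasses import dataclass
from typing import Dict, List

TOTAL_CUS = 8  # assumption: compute units on the hypothetical chip


@dataclass
class CommandQueue:
    name: str
    priority: int      # higher value is served first
    reserved_cus: int  # CUs guaranteed to this queue
    demand: int        # CUs the queue would like right now


def allocate(queues: List[CommandQueue], total_cus: int = TOTAL_CUS) -> Dict[str, int]:
    """Grant reservations first, then share the remainder by priority."""
    shared = total_cus - sum(q.reserved_cus for q in queues)
    if shared < 0:
        raise ValueError("reservations exceed the number of compute units")
    # Each queue draws on its own reservation; unused reserved CUs stay idle
    # so they are immediately available to their owner.
    grants = {q.name: min(q.demand, q.reserved_cus) for q in queues}
    # Unreserved CUs go to the highest-priority queue that still wants more.
    for q in sorted(queues, key=lambda q: q.priority, reverse=True):
        extra = min(q.demand - grants[q.name], shared)
        grants[q.name] += extra
        shared -= extra
    return grants


transcode = CommandQueue("transcode", priority=1, reserved_cus=0, demand=16)

# A mostly idle desktop: the GUI uses one of its two reserved CUs.
quiet = allocate([CommandQueue("gui", priority=10, reserved_cus=2, demand=1), transcode])

# A busy desktop: the GUI takes both reserved CUs plus one shared CU.
busy = allocate([CommandQueue("gui", priority=10, reserved_cus=2, demand=3), transcode])
```

In both cases the transcode job soaks up whatever the GUI leaves in the shared pool, but it can never touch the GUI's reservation, which is the "no more skipping mouse" guarantee in miniature.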
