Feeds

Nvidia shows off superjuiced Kepler GPU

From workhouse to racehorse

Application security programs and practises

HPC blog There were quite a few surprises in today’s GTC12 keynote by NVIDIA CEO and co-founder Jen-Hsun Huang.

If NVIDIA were just introducing a new and faster rev of its latest GPU processor, one that brings three times the performance without breaking the bank on energy usage, that would be a solid win, and in line with expectations. But there was more to this announcement – much more. Our buddy TPM gives the down-and-dirty details on Kepler here.

I’m not sure this is exactly the right analogy, but to me, what NVIDIA has done with Kepler is transform the GPU from a simple task-worker into a much more productive member of the knowledge working class. But Kepler isn’t a paper-shuffling, PowerPoint-wielding MBA type: it still has a solid work ethic, outperforming predecessor Fermi by more than three times. More important than performance, however, is Kepler’s sophistication in processing work.

The features I’m talking about below apply to the Kepler K20, the dual GPU behemoth that’s due in Q4 2012. You’ll also need to be using the new version of CUDA, since it contains the instructions to take advantage of these new capabilities.

The first new feature is something called Hyper-Q. With Hyper-Q, a Kepler GPU can now accept work from up to 32 CPU cores simultaneously. Before Hyper-Q, only one CPU core at a time could dispatch work to the GPU, which meant that there were long stretches of time when the GPU would sit idle while waiting for more tasks from whichever CPU core it was working with.

With Hyper-Q, the GPU is now a full-fledged team player in the system, able to accept work from many cores at the same time. This will drive GPU utilisation up, of course, but it will also push CPU utilisation higher as more CPU cores at a time can dispatch and receive work from the GPUs.

The next new wrinkle is something called Dynamic Parallelism, a feature that will also serve to radically increase overall processing speed and system utilisation while reducing programming time and complexity.

Today, without Dynamic Parallelism, GPUs are very fast, but they’re limited in what they can do on their own. Lots of routines are recursive or data dependent, meaning that the results from one set of steps or calculations dictate what happens in the next set of steps or calculations. GPUs can run through these calculations very fast, but then they have to ship the results out to the CPU and wait for further instructions. The CPU then evaluates the results and gives the GPUs another set of tasks to do – perhaps run the same calculations with new data or different assumptions.

But with Dynamic Parallelism, GPUs can now run recursive loops right on the GPU – no need to run back to the CPU for instructions. Kepler can run almost limitless loops, cranking through calculation after calculation using thousands of cores. It can spawn new processes and new processing streams without having to depend on the CPU to give it directions.

Taking advantage of Dynamic Parallelism will obviously result in higher efficiency and utilisation as highly parallelised work is performed on speedy GPUs, leaving CPUs either free to perform other work or to simply stand quietly off to one side.

I’m not a programmer by any stretch of the imagination – that’s probably obvious. But from what little experience I have, buttressed by conversations with real programmers, it’s clear that using Dynamic Parallelism will also make the CUDA programmer’s job much easier. According to NVIDIA, programming jobs that used to take 300 steps can now be accomplished with as few as 20, because they don’t have to code all of the back-and-forth traffic between CPUs and GPUs.

Just Hyper-Q and Dynamic Parallelism on their own are pretty big steps in the evolution of the GPU and hybrid computing. With the addition of these two features, the GPU is now able to be shared by an entire system, rather than just a single core, and it’s able to generate its own workload and complete much more of that workload without needing to be led through it by a slower, general-purpose CPU.

Before the Kepler K20, the CPU’s role in a hybrid system was mainly as a traffic cop – responsible for sending traffic (data and tasks) to the GPU and accepting the results. With Kepler and its advanced feature set, the GPU can now work for 32 different cops at the same time and manage a larger part of the overall job on its own. This gives the cops more time to handle other tasks, write some parking tickets, or just pull their hats down over their eyes and catch a nap. ®

Bridging the IT gap between rising business demands and ageing tools

More from The Register

next story
Auntie remains MYSTIFIED by that weekend BBC iPlayer and website outage
Still doing 'forensics' on the caching layer – Beeb digi wonk
Attack of the clones: Oracle's latest Red Hat Linux lookalike arrives
Oracle's Linux boss says Larry's Linux isn't just for Oracle apps anymore
THUD! WD plonks down SIX TERABYTE 'consumer NAS' fatboy
Now that's a LOT of porn or pirated movies. Or, you know, other consumer stuff
Apple fanbois SCREAM as update BRICKS their Macbook Airs
Ragegasm spills over as firmware upgrade kills machines
EU's top data cops to meet Google, Microsoft et al over 'right to be forgotten'
Plan to hammer out 'coherent' guidelines. Good luck chaps!
US judge: YES, cops or feds so can slurp an ENTIRE Gmail account
Crooks don't have folders labelled 'drug records', opines NY beak
Manic malware Mayhem spreads through Linux, FreeBSD web servers
And how Google could cripple infection rate in a second
prev story

Whitepapers

Top three mobile application threats
Prevent sensitive data leakage over insecure channels or stolen mobile devices.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Top 8 considerations to enable and simplify mobility
In this whitepaper learn how to successfully add mobile capabilities simply and cost effectively.
Application security programs and practises
Follow a few strategies and your organization can gain the full benefits of open source and the cloud without compromising the security of your applications.
The Essential Guide to IT Transformation
ServiceNow discusses three IT transformations that can help CIO's automate IT services to transform IT and the enterprise.