Nvidia snaps out snappier Tesla GPU coprocessors

All fired up on all 512 cores

GPU chipmaker Nvidia knows that it has to do more to grow its Tesla biz than slap some passive heat sinks on a fanless GPU card and talk up its CUDA parallel-programming tools. It has to keep delivering price/performance improvements, as well.

And that's exactly what it's doing with the new Tesla M2090 GPU coprocessor.

Back when the "Fermi" GPU chips were previewed at the SC2009 supercomputing event a year and a half ago, Nvidia showed off a chip with 512 cores, plus L1 and L2 cache memories for those cores (this was new) and ECC memory scrubbing (also new). The design bundled the cores into 16 streaming multiprocessors of 32 cores apiece, each with 64KB of L1 cache, and added a higher-level L2 cache weighing in at 768KB that all the cores can share.

That Fermi chip sported GDDR5 memory controllers, and the cards using the Fermi chips (either as discrete graphics cards or GPU coprocessors for accelerating floating point calculations) could have 3GB or 6GB of main memory. The memory controllers on the Fermi GPUs can address up to 1TB of memory, in theory.

But in the chip racket, theory does not always happen on the first iteration of a product, and so it was with the Fermi GPUs.

When the Fermi chips started shipping in the Tesla line of GPU coprocessors in May 2010, the initial Teslas had only 448 cores activated. Nvidia never explained this, but most people surmised that this had to do with yield issues (gunk on some cores in the chip) and the chips generating too much heat at a particular clock speed.

With those 448 cores running at 1.15GHz and GDDR5 memory chips running at 1.56GHz, the Tesla M2050 GPU coprocessor was rated at 515 gigaflops double-precision and 1.03 teraflops single-precision when performing floating-point operations.
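Those ratings fall out of simple arithmetic. Here is a minimal sketch, assuming each core retires one fused multiply-add per clock in single precision (counted as two floating-point operations) and that double precision runs at half that rate. The function name `peak_gflops` is made up for illustration and is not anything Nvidia publishes.

```python
def peak_gflops(cores, clock_ghz, flops_per_core_per_clock=2.0):
    """Theoretical peak in gigaflops: cores x clock (GHz) x flops per clock."""
    return cores * clock_ghz * flops_per_core_per_clock


# Assumption: 2 flops per core per clock in single precision (one
# multiply-add), and 1 per core per clock in double precision (half rate).
sp = peak_gflops(448, 1.15)
dp = peak_gflops(448, 1.15, flops_per_core_per_clock=1.0)

print(f"single precision: {sp:.1f} gigaflops")
print(f"double precision: {dp:.1f} gigaflops")
```

Under those assumptions the figures land on roughly 1,030 and 515 gigaflops, which lines up with the ratings quoted for the M2050.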

The Tesla M2050 is a single-wide PCI-Express 2.0 device that has 3GB of GDDR5 memory, while the M2070 is a two-slot device that packs 6GB of memory and has the same floppish performance.

Both are rated at a top-end 225 watts of peak power draw, but Nvidia says the actual heat thrown off by the device is often a lot less and depends on the workload. That is a little bit less than the 238 watts drawn by the Tesla C2050 and C2070 coprocessors, which have fans built into them and which are aimed at goosing the number-crunching power of workstations to create a "personal supercomputer" – although these devices, too, are rated at the same 515 gigaflops of double-precision and 1.03 teraflops single-precision.

Sumit Gupta, senior product manager of the Tesla line at Nvidia, says that the Fermi GPUs used in the new M2090 coprocessors are not just a bin sort, looking for Fermis with more working cores or clocks that can run faster reliably. Nvidia has actually done a new tape-out of the Fermi design using Taiwan Semiconductor Manufacturing Co's 40-nanometer processes, which Gupta says have some improvements that make chips run better.

When you add up some nips and tucks here and there on the Fermi chip plus the process improvements from TSMC, Nvidia can crank up the Fermi core clock speed by 13 per cent to 1.3GHz, and the GDDR5 memory speed by 18.6 per cent, to 1.85GHz, on the Tesla M2090.
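The percentages can be checked against the clock speeds quoted for the M2050 and the M2090. This is a rough sketch, with `pct_increase` as a made-up helper name:

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


# Clock speeds in GHz, as quoted for the Tesla M2050 and the Tesla M2090.
core_clock_gain = pct_increase(1.15, 1.30)
mem_clock_gain = pct_increase(1.56, 1.85)

print(f"core clock: +{core_clock_gain:.1f} per cent")
print(f"memory clock: +{mem_clock_gain:.1f} per cent")
```

Rounded, that works out to about 13 per cent on the cores and 18.6 per cent on the memory.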

Nvidia's Tesla M2090 server GPU coprocessor

Those increases help performance considerably. And so does the fact that with the TSMC process improvement, Nvidia can now have all 512 cores in the Fermi design activated, which yields a theoretical 14.3 per cent improvement over those initial Fermi chips with only 448 active cores.
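As a back-of-the-envelope sketch, the core-count and clock-speed gains can be multiplied together. The assumption, which is ours and not Nvidia's, is that peak throughput scales linearly with both active cores and core clock. Memory bandwidth is left out, so real workloads will not necessarily see the full uplift.

```python
# Figures from the article: active cores and core clock (GHz) on the
# initial Fermi-based Teslas versus the Tesla M2090.
OLD_CORES, NEW_CORES = 448, 512
OLD_CLOCK_GHZ, NEW_CLOCK_GHZ = 1.15, 1.30

# Assumption: theoretical peak scales linearly with cores and with clock.
core_gain = NEW_CORES / OLD_CORES
clock_gain = NEW_CLOCK_GHZ / OLD_CLOCK_GHZ
combined_gain = core_gain * clock_gain

print(f"cores alone: +{(core_gain - 1) * 100:.1f} per cent")
print(f"cores and clock together: +{(combined_gain - 1) * 100:.1f} per cent")
```

On paper, the two gains compound to a bit over 29 per cent more peak floating-point oomph than the 448-core parts.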
