Nvidia's Fermi hits flop-hungry challengers

HPC players tool up

Nvidia's Fermi graphics coprocessors have begun shipping through its OEM partner channel, with a slew of tier-two players hoping the flop-happy GPUs will give them a competitive edge against established players in the HPC server racket.

The Fermi graphics cards and GPU coprocessors that are based on them were both previewed last November at the SC09 supercomputing conference. The Fermi graphics chips previewed had 512 cores, but for reasons that Nvidia has not explained - and which probably involve chip yields and heating issues - the GeForce graphics cards and Tesla 20 coprocessors that have started shipping only have 448 working cores. And that means their floating-point performance is a little lower than expected.

The Tesla coprocessors come in three different form factors, which was not apparent at the launch last November. The C series GPU coprocessors have fans on them and plug into workstations and personal supercomputers (basically, an x64 workstation on steroids); the M series are fanless units intended for hybrid CPU-GPU setups within the same chassis; and the S series are GPU appliances that plug into servers through external PCI Express links and pack up to four GPUs into a 1U chassis.

Back in November, Nvidia was saying that the C2050 and the C2070, which had initial ratings of 520 and 630 gigaflops doing double-precision math and which cost $2,499 and $3,999, respectively, would use the 512-core Fermi chips. In early April, Nvidia started shipping the C2050, but with only 448 cores and rated at 515 gigaflops double-precision, and the C2070 was pushed out to the third quarter. With the number of cores in the C2050 dropping by 12.5 per cent but the aggregate performance of the GPU coprocessor dropping by only one per cent, it's a fair guess that Nvidia cranked up the clock speed to make up for the lower GPU core count.
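That clock-speed guess can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that peak double-precision throughput scales linearly with core count times clock speed. The helper names `pct_drop` and `implied_clock_uplift` are made up for this example, and the result is an estimate rather than anything Nvidia has confirmed.

```python
def pct_drop(before, after):
    """Percentage decrease going from `before` to `after`."""
    return (before - after) / before * 100.0


def implied_clock_uplift(cores_before, cores_after, gflops_before, gflops_after):
    """Clock ratio that would explain the new rating.

    Assumption (not from Nvidia): peak flops scale as cores * clock,
    so flops-per-core is a stand-in for clock speed.
    """
    per_core_before = gflops_before / cores_before
    per_core_after = gflops_after / cores_after
    return per_core_after / per_core_before


# C2050 figures from the article: previewed part vs shipping part.
core_drop = pct_drop(512, 448)
flops_drop = pct_drop(520, 515)
uplift = implied_clock_uplift(512, 448, 520, 515)

print(f"cores down {core_drop:.1f}%, flops down {flops_drop:.2f}%")
print(f"implied clock ratio: {uplift:.3f}")
```

Under that assumption, the shipping C2050 would need clocks roughly 13 per cent higher than the previewed part to land at 515 gigaflops.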

There were to be two variations of the S series GPU appliances: the S2050 appliance using the C2050 GPUs, rated at 2.08 teraflops and costing $12,995, and the S2070 appliance using the faster C2070 GPUs, rated at 2.52 teraflops and costing $18,995. The S series boxes aren't shipping yet, and they will be based on the 448-core C series GPUs, likely providing a little less floppy oomph. Sources at Nvidia say that the S series GPU appliances are still on track for delivery this quarter.
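The original appliance ratings line up with four GPUs per box at the original per-card figures, which hints at how much the 448-core parts might shave off. The sketch below assumes an appliance's rating is simply four times the per-GPU double-precision rating. `appliance_teraflops` is a hypothetical helper, and the S2050 estimate is a back-of-envelope number, not a published spec. No estimate is given for the S2070 because the article has no 448-core rating for the C2070.

```python
GPUS_PER_APPLIANCE = 4  # the S series packs up to four GPUs into a 1U chassis


def appliance_teraflops(gflops_per_gpu, gpus=GPUS_PER_APPLIANCE):
    """Aggregate rating in teraflops, assuming ratings simply add up."""
    return gflops_per_gpu * gpus / 1000.0


original_s2050 = appliance_teraflops(520)   # previewed C2050 rating
original_s2070 = appliance_teraflops(630)   # previewed C2070 rating
estimated_s2050 = appliance_teraflops(515)  # shipping 448-core C2050 rating

print(f"S2050 as announced: {original_s2050:.2f} teraflops")
print(f"S2070 as announced: {original_s2070:.2f} teraflops")
print(f"S2050 with 448-core GPUs (estimate): {estimated_s2050:.2f} teraflops")
```

If the assumption holds, a 448-core S2050 would come in at about 2.06 teraflops, a shade under the 2.08 teraflops originally quoted.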

Nvidia started peddling the Fermi GPUs in its GeForce graphics card lineup during the first quarter.

The news today is that the Tesla M2050 embedded GPU coprocessor, which is based on the C2050 card as the name suggests and which is rated at the same 515 gigaflops of double-precision and 1.03 teraflops single-precision floating point performance, has begun shipping through OEM server partners. Appro and Super Micro were the first to announce systems using the M series GPUs. (You have to hunt around the Nvidia site to find the M2050 spec sheet, so let me save you the trouble.)

Oak Ridge boys

Nvidia planned to host a big shindig in Washington DC kicking off the M series, with presentations from Oak Ridge National Laboratory, talking about how hybrid CPU-GPU systems were the wave of the future, and from Georgia Tech, which has a project called Keeneland for creating applications that run on hybrid CPU-GPU systems.

Oak Ridge is, of course, one of the first big customers for the Fermi GPUs. Last October, before the Fermi GPU coprocessors were unveiled by Nvidia at SC09 but after the Fermi chips on which they are based were detailed, the Cray XT "Jaguar" massively parallel Opteron super at Oak Ridge weighed in at 1.06 petaflops using the Linpack Fortran benchmark test as a gauge. Shortly thereafter, the upgraded Jaguar machine was pushed to 1.76 petaflops by the addition of new Opteron cores.

The only reason this matters is that in early October last year, Oak Ridge said that it would be building a hybrid CPU-GPU super based on Nvidia cards that would have at least ten times the oomph of Jaguar, most likely meaning it would break the 10 petaflops barrier but not the 20 petaflops barrier. Oak Ridge was intentionally vague, perhaps because it was unsure of what the performance of such a hybrid machine might be.
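To see where the 10 to 20 petaflops range comes from, multiply either of Jaguar's Linpack figures by ten. This sketch takes "ten times" literally, which is an assumption: Oak Ridge said "at least" and gave no numbers, so these are floors rather than targets.

```python
# Jaguar's Linpack results from the article, in petaflops.
JAGUAR_PETAFLOPS = {"before upgrade": 1.06, "after upgrade": 1.76}
FACTOR = 10  # "at least ten times the oomph", taken literally here

targets = {label: pf * FACTOR for label, pf in JAGUAR_PETAFLOPS.items()}

for label, target in targets.items():
    print(f"10x Jaguar ({label}): {target:.1f} petaflops")

# Both baselines land between the 10 and 20 petaflops barriers.
in_range = all(10 < t < 20 for t in targets.values())
print(f"within 10-20 petaflops: {in_range}")
```

Whichever Jaguar figure Oak Ridge had in mind, a tenfold jump lands above 10 petaflops and below 20.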

There is also a rumor going around that Oak Ridge was unhappy about the performance of the Nvidia Tesla 20 GPUs and has canceled the project, but Nvidia says this is untrue. Oak Ridge has yet to say exactly what it is building.
