AMD muscles Nvidia with fanless GPU coprocessors

Anything you can do, we can do. Except ECC

Keeping pace with Nvidia in the GPU wars, Advanced Micro Devices has not only launched its "Lisbon" Opteron 4100 processors but also released the embedded versions of its "Cypress" family of GPUs, a counterpunch to Nvidia's "Fermi" chips used in its Tesla embedded GPUs.

The Cypress GPUs already made their way into the ATI Radeon HD 5870 discrete graphics cards (last October) and the ATI FirePro V8800 graphics cards for high-end workstations (back in April). Today, the Cypress GPUs will be plunked into the third generation of FireStream GPU coprocessors intended for embedded applications where the GPUs do complex math that an x64 processor can't do without both taking its shoes off and pulling its pants down (if it is male) or lifting its shirt up (if it is female).

The Cypress GPU is no slouch, and neither is Nvidia's Fermi. Just as Intel and AMD are fierce competitors that get the best of each other every now and again, the competition between AMD and Nvidia drives innovation forward. The Cypress GPU gets the normal fan-cooled packaging for the Radeon HD and FirePro discrete graphics cards, with the major difference being that the FirePro cards have more video memory. The FireStream GPU coprocessors, by contrast, are equipped with a passive heat sink that allows them to slide into rack and tower servers, creating the hybrid x64-GPU systems that many think will soon become the norm in the HPC arena.

Here's the block diagram laying out the Cypress GPU components:

AMD's Cypress GPU

The Cypress chip has 1,600 SIMD engine cores and a slew of supporting electronics wrapped around them so they can do math with their clothing still intact. The AMD GPU has full support for the DirectCompute 11 and OpenCL 1.0 graphics and number-crunching protocols embedded in its hardware, and also includes 32-bit atomic operations, flexible 32KB local data shares, 64KB global data shares, global synchronization, and append/consume buffers etched onto its silicon.

With all of its cores working properly, the Cypress GPU can deliver 2.72 teraflops of single-precision and 544 gigaflops of double-precision floating point performance. Some workloads, including certain life sciences and oil and gas exploration apps, are fine with single precision, but most flop heads care about double precision. And in this case, the ATI Cypress GPU can hold its own against the best Fermi that Nvidia has. However, Nvidia makes much of the fact that the ATI GPU does not have error correction on its cores and GDDR memory, and AMD acknowledges that's a feature it needs to add.

Double-precision math is more interesting to a lot of organizations looking to do more flops. The first FireStream embedded GPUs, from October 2006, were glorified Radeon X19XX GPUs with only single-precision math. The FireStream 9170s hit 500 single-precision gigaflops and added double-precision math — albeit substantially less than you might expect.

In the summer of 2008, ATI kicked out the FireStream 9250 (1 teraflops SP and 200 gigaflops DP) and 9270 (1.2 teraflops SP and 240 gigaflops DP) embedded GPUs. The 9250s were single-slot devices with 1GB of GDDR3 graphics memory rated at under 120 watts, while the 9270s were double-slotters with 2GB of faster GDDR5 memory rated at 160 watts. These units had fans, which screwed up the airflow inside of servers and therefore limited their adoption in HPC clusters. That's why both Nvidia and AMD are going with passive heat sinks on their latest embedded GPUs.

The new entry-level embedded AMD GPU, the FireStream 9350, is the one to go for if you're looking for the best way to put the most flops in a box. With 2GB of GDDR5 graphics memory, 2 teraflops SP and 400 gigaflops DP performance, it is basically twice the GPU of its predecessor, the FireStream 9250. The FireStream 9350 has 1,440 of its SIMD engine cores working — presumably the other 160 are duds — and runs at 700MHz with a memory clock of 1GHz.
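The core count and clock speed quoted for the FireStream 9350 can be sanity-checked against its rated performance with a little arithmetic. The sketch below is not AMD's published method. It assumes each core retires one multiply-add (two floating point operations) per clock, and that double precision runs at the same fraction of the single-precision rate as the 2.72 teraflops and 544 gigaflops figures quoted earlier imply. The function and constant names are made up for illustration.

```python
# Rough peak-throughput estimate; a sketch under stated assumptions,
# not a vendor formula.

# Assumption: each core completes one multiply-add (2 flops) per clock cycle.
FLOPS_PER_CORE_PER_CLOCK = 2

# Assumption: the DP-to-SP ratio matches the full Cypress figures quoted
# in the article (544 gigaflops DP against 2,720 gigaflops SP).
DP_TO_SP_RATIO = 544 / 2720


def peak_gigaflops(cores, clock_mhz):
    """Return (single-precision, double-precision) peak gigaflops."""
    sp = cores * clock_mhz * FLOPS_PER_CORE_PER_CLOCK / 1000.0
    dp = sp * DP_TO_SP_RATIO
    return sp, dp


# FireStream 9350 figures from the article: 1,440 working cores at 700MHz.
sp, dp = peak_gigaflops(cores=1440, clock_mhz=700)
print(f"FireStream 9350: {sp:.0f} gigaflops SP, {dp:.0f} gigaflops DP")
```

Under those assumptions the estimate lands at roughly 2 teraflops SP and 400 gigaflops DP, in line with the rated figures.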

AMD FireStream 9350 Embedded GPU

The AMD FireStream 9350 Embedded GPU

At 150 watts, the 9350 embedded GPU runs a little hotter than its predecessor, but an extra 30 watts or so to double the performance is a very good Moore's Law trade-off. Equally important, the FireStream 9350, at $799, is cheaper than the 9250 GPU, which cost $999. A teraflops of double-precision performance from FireStream 9250 cards would run you just under $5,000; with the 9350 GPUs, you're talking just under $2,000 per teraflops.
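The per-teraflops prices only work out if they are based on the double-precision ratings, and the short sketch below reproduces them from the figures in the article. The 9250 is rated at "under 120 watts", so 120 is used here as an assumed upper bound, and the function names are illustrative.

```python
# Price and power efficiency from the article's figures; a rough sketch.
# Each entry: (list price in dollars, DP gigaflops, rated watts).
# Assumption: 120 watts stands in for the 9250's "under 120 watts" rating.
cards = {
    "FireStream 9250": (999, 200, 120),
    "FireStream 9350": (799, 400, 150),
}


def dollars_per_dp_teraflops(price, dp_gigaflops):
    """Cost of buying enough cards to reach one DP teraflops."""
    return price / (dp_gigaflops / 1000.0)


def dp_gigaflops_per_watt(dp_gigaflops, watts):
    """Double-precision gigaflops delivered per rated watt."""
    return dp_gigaflops / watts


for name, (price, gflops, watts) in cards.items():
    cost = dollars_per_dp_teraflops(price, gflops)
    efficiency = dp_gigaflops_per_watt(gflops, watts)
    print(f"{name}: ${cost:,.0f} per DP teraflops, "
          f"{efficiency:.2f} DP gigaflops per watt")
```

On these numbers the 9250 comes to $4,995 per double-precision teraflops and the 9350 to about $1,998, with the newer card also delivering more flops per watt.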
