Nvidia launches not one but two Kepler2 GPU coprocessors

Uncloaks Tesla K20, K20X extreme oomphers for servers, workstations

Putting the K20X through the HPC paces

To test the relative efficiency of the Fermi and Kepler generations of GPUs, Nvidia grabbed a two-socket server with two Intel Xeon E5-2680 processors spinning at 2.7GHz, and dropped in two Fermi M2090 GPU coprocessors and then ran the Linpack Fortran benchmark on the box.

This setup delivers 1.03 teraflops of sustained Linpack performance, with a computational efficiency of 61 per cent – meaning that 39 per cent of the aggregate double-precision floating point performance of the system went up the chimney.

Nvidia then took the same server, yanked out the M2090s, and slotted in two K20X coprocessors. The server was able to deliver 2.25 teraflops of sustained Linpack performance, not just because the K20X is more powerful, but also because it is more efficient. In fact, 76 per cent of the aggregate performance in the server is actually brought to bear on the Linpack test thanks to the architectural changes in the Tesla K20 series of coprocessors.
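The efficiency figures can be turned back into the aggregate peak each configuration must have had on paper. The sketch below is back-of-the-envelope arithmetic on the numbers quoted in this article only, assuming efficiency is simply sustained performance divided by aggregate peak; the dictionary and function names are illustrative and not anything Nvidia published.

```python
# Sustained Linpack teraflops and computational efficiency as quoted in the
# article. Assumption: efficiency = sustained / aggregate peak.
RUNS = {
    "2x M2090": {"sustained_tflops": 1.03, "efficiency": 0.61},
    "2x K20X": {"sustained_tflops": 2.25, "efficiency": 0.76},
}


def implied_peak_tflops(sustained_tflops, efficiency):
    """Aggregate peak implied by the quoted sustained figure and efficiency."""
    return sustained_tflops / efficiency


def summarize(runs):
    """Return implied peak and unused teraflops for each configuration."""
    summary = {}
    for name, run in runs.items():
        peak = implied_peak_tflops(run["sustained_tflops"], run["efficiency"])
        summary[name] = {
            "peak_tflops": peak,
            "unused_tflops": peak - run["sustained_tflops"],
        }
    return summary


summary = summarize(RUNS)
speedup = RUNS["2x K20X"]["sustained_tflops"] / RUNS["2x M2090"]["sustained_tflops"]

for name, figures in summary.items():
    print(f"{name}: implied peak {figures['peak_tflops']:.2f} TF, "
          f"unused {figures['unused_tflops']:.2f} TF")
print(f"Sustained Linpack speedup, K20X over M2090: {speedup:.2f}x")
```

On these inputs the implied peaks come out at roughly 1.69 and 2.96 teraflops, and the sustained speedup at a little under 2.2X, which is larger than the ratio of the implied peaks because the efficiency went up as well.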

How the K20 stacks up against a Xeon E5 and a Fermi GPU

In another test to show how the GPU coprocessors stack up against – rather than with – Intel Xeons, Nvidia fired up the DGEMM double-precision matrix math benchmark on an eight-core Xeon E5-2687W, the 3.1GHz chip made for workstations, which managed 170 gigaflops.

A Fermi-based M2090 could do 430 gigaflops, and the Kepler-based K20X could do 1.22 teraflops. This test matters because DGEMM is what Intel used to show a prototype Xeon Phi x86-based parallel coprocessor breaking through 1 teraflops on a single card a year ago at SC11.
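For a sense of the gaps between the three chips, the quoted DGEMM results can be reduced to simple ratios. This is a minimal sketch using only the gigaflops figures above; the short labels and the helper name are made up for illustration.

```python
# DGEMM results quoted in the article, in gigaflops (1.22 teraflops = 1220).
DGEMM_GFLOPS = {
    "Xeon E5": 170.0,
    "M2090": 430.0,
    "K20X": 1220.0,
}


def relative_to(baseline, results):
    """Express every result as a multiple of the chosen baseline's result."""
    base = results[baseline]
    return {name: gflops / base for name, gflops in results.items()}


vs_xeon = relative_to("Xeon E5", DGEMM_GFLOPS)
vs_fermi = relative_to("M2090", DGEMM_GFLOPS)

for name, ratio in vs_xeon.items():
    print(f"{name}: {ratio:.1f}x the Xeon E5")
print(f"K20X: {vs_fermi['K20X']:.1f}x the M2090")
```

By this arithmetic the K20X lands at a bit more than seven times the Xeon and a bit under three times the M2090 on DGEMM.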

The K20 versus Xeons on various scientific apps

GPU accelerators are not just about servers, but also about workstations. Nvidia has spent some time in the labs running real workloads on Xeon or Core i7 workstations and seeing what happens when Tesla K20 or K20X coprocessors are added to the workstation.

On the MATLAB application shown in the chart above, a workstation with one i7-2600K processor ran some fast Fourier transform (FFT) routines, and then the same routines were run after slapping in a Tesla K20 coprocessor. The speedup was a factor of 18 because the MATLAB software speaks CUDA and the work lends itself to offloading to the GPU.

For the other tests, Nvidia used a workstation with two top-bin E5-2687W processors paired with two Tesla K20X chips, and the speedup for various applications ranged from a low factor of 8X to a high of 32X.

Adding K20X coprocessors to Cray supers speeds up apps big time

Nvidia and supercomputer partner Cray are obviously very keen to demonstrate that packaged applications can scale across hundreds or thousands of server nodes equipped with GPU accelerators, and chose to put the QMCPACK materials-science application and the NAMD molecular-dynamics application through their paces on a Cray XK7 system both with and without K20X GPU accelerators installed.

The tests show not only that the GPU accelerators can speed up calculations with these two applications, but also that as you boost the server node count in the XK7 machine – which uses Cray's "Gemini" 3D torus interconnect to hook nodes to each other – the performance of the ceepie-geepie box scales further and faster than that of the CPU-only machine.

Gupta says that Nvidia has already shipped 30 petaflops worth of Tesla K20 and K20X coprocessors in the past 30 days. The K20 card will be available through workstation and server makers and through the retail channel where you normally buy graphics cards and other gear. The K20X, which is a fanless design, will be like the Tesla M2090 fanless coprocessor before it, and will only be available through server OEMs who tweak their machines to allow them to do the cooling for the Kepler cards. The channel will be getting K20 cards from Nvidia in the middle of this month in volume, with the server OEMs having the K20X cards available in November or December, depending on the OEM.

Pricing is not available on the new units, but El Reg estimates that the Tesla K20 card probably costs something on the order of $3,000 to $3,500 street, with the K20X commanding perhaps $500 to $1,000 more than that. At that price, two K20X cards will just about triple the cost of a server node, but will offer considerably more performance on workloads, as you can see from the data above. ®
