Nvidia launches not one but two Kepler2 GPU coprocessors

Uncloaks Tesla K20, K20X extreme oomphers for servers, workstations

Putting the K20X through the HPC paces

To test the relative efficiency of the Fermi and Kepler generations of GPUs, Nvidia grabbed a two-socket server with two Intel Xeon E5-2680 processors spinning at 2.7GHz, dropped in two Fermi M2090 GPU coprocessors, and ran the Linpack Fortran benchmark on the box.

This setup delivers 1.03 teraflops of sustained Linpack performance, with a computational efficiency of 61 per cent – meaning that 39 per cent of the aggregate double-precision floating point performance of the system went up the chimney.

Nvidia then took the same server and yanked out the M2090s and slotted in two K20X coprocessors. The server was able to deliver 2.25 teraflops of sustained Linpack performance, and not just because the K20X is more powerful, but because the K20X is more efficient. In fact, 76 per cent of the aggregate performance in the server is actually brought to bear on the Linpack test thanks to the architectural changes in the Tesla K20 series of coprocessors.
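For readers who want to check the sums, here is a minimal sketch that backs out the aggregate peak each configuration implies, assuming efficiency is simply sustained Linpack divided by aggregate peak. The function and variable names are ours, not Nvidia's, and because the quoted efficiency figures are rounded, the implied peaks are approximate.

```python
# Sustained Linpack teraflops and efficiency as quoted in the article.
linpack = {
    "2x M2090 (Fermi)": {"sustained_tflops": 1.03, "efficiency": 0.61},
    "2x K20X (Kepler)": {"sustained_tflops": 2.25, "efficiency": 0.76},
}


def implied_peak(sustained_tflops, efficiency):
    """Aggregate peak implied by: sustained = efficiency * peak (our assumption)."""
    return sustained_tflops / efficiency


for name, run in linpack.items():
    peak = implied_peak(run["sustained_tflops"], run["efficiency"])
    unused = peak - run["sustained_tflops"]
    print(f"{name}: implied peak ~{peak:.2f} TF, unused ~{unused:.2f} TF")

# Ratio of the two sustained results on the same server.
speedup = (
    linpack["2x K20X (Kepler)"]["sustained_tflops"]
    / linpack["2x M2090 (Fermi)"]["sustained_tflops"]
)
print(f"Sustained Linpack speedup, K20X over M2090: {speedup:.2f}x")
```

On those numbers the swap from M2090s to K20Xs slightly more than doubles sustained Linpack throughput in the same box.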

How the K20 stacks up against a Xeon E5 and a Fermi GPU

In another test to show how the GPU coprocessors stack up against – rather than with – Intel Xeons, Nvidia fired up the DGEMM double-precision matrix math benchmark on an eight-core Xeon E5-2687W, the 3.1GHz chip made for workstations, which was able to do 170 gigaflops.

A Fermi-based M2090 could do 430 gigaflops, and the Kepler-based K20X could do 1.22 teraflops. This test is important in that the DGEMM test is what Intel used to show a prototype Xeon Phi x86-based parallel coprocessor breaking through 1 teraflops on a single prototype card a year ago at SC11.
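The relative speedups fall straight out of those three DGEMM figures. The sketch below just divides them; the dictionary keys and the helper name are our own labels, and the inputs are the rounded numbers quoted above.

```python
# DGEMM results as quoted in the article, in gigaflops (1.22 teraflops = 1220).
dgemm_gflops = {
    "Xeon E5 (eight-core, 3.1GHz)": 170.0,
    "Tesla M2090": 430.0,
    "Tesla K20X": 1220.0,
}


def speedup(faster, slower):
    """How many times the DGEMM rate of `slower` the `faster` part delivers."""
    return dgemm_gflops[faster] / dgemm_gflops[slower]


pairs = [
    ("Tesla M2090", "Xeon E5 (eight-core, 3.1GHz)"),
    ("Tesla K20X", "Xeon E5 (eight-core, 3.1GHz)"),
    ("Tesla K20X", "Tesla M2090"),
]
for faster, slower in pairs:
    print(f"{faster} vs {slower}: {speedup(faster, slower):.2f}x")
```

So a single K20X does a bit more than seven times the DGEMM work of the Xeon, and a little under three times that of the M2090.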

The K20 versus Xeons on various scientific apps

GPU accelerators are not just about servers, but also about workstations. Nvidia has spent some time in the labs running real workloads on Xeon or Core i7 workstations and seeing what happens when Tesla K20 or K20X coprocessors are added to the workstation.

On the MATLAB application shown in the chart above, a workstation with one i7-2600K processor ran some fast Fourier transform (FFT) routines, and then the same routines were run after slapping in a Tesla K20 coprocessor. The speedup was a factor of 18 because the MATLAB software speaks CUDA and the work lends itself to offloading to the GPU.

For the other tests, Nvidia used a workstation with two top-bin E5-2687W processors paired with two Tesla K20X cards, and the speedup for various applications ranged from a low of 8X to a high of 32X.

Adding K20X coprocessors to Cray supers speeds up apps big time

Nvidia and supercomputer partner Cray are obviously very keen to demonstrate that packaged applications can scale across hundreds or thousands of server nodes equipped with GPU accelerators, and chose to put the QMCPACK materials-science application and the NAMD molecular-dynamics application through their paces on a Cray XK7 system both with and without K20X GPU accelerators installed.

The tests show not only that the GPU accelerators can speed up calculations with these two applications, but also that as you boost the server node count in the XK7 machine – which uses Cray's "Gemini" 3D torus interconnect to hook nodes to each other – the performance of the ceepie-geepie box scales further and faster than that of the CPU-only machine.

Gupta says that Nvidia has already shipped 30 petaflops worth of Tesla K20 and K20X coprocessors in the past 30 days. The K20 card will be available through workstation and server makers and through the retail channel where you normally buy graphics cards and other gear. The K20X, which is a fanless design, will be like the Tesla M2090 fanless coprocessor before it, and will only be available through server OEMs who tweak their machines to allow them to do the cooling for the Kepler cards. The channel will be getting K20 cards from Nvidia in the middle of this month in volume, with the server OEMs having the K20X cards available in November or December, depending on the OEM.

Pricing is not available on the new units, but El Reg estimates that the Tesla K20 card probably costs something on the order of $3,000 to $3,500 street, with the K20X commanding perhaps $500 to $1,000 more than that. At that price, two K20X cards will just about triple the cost of a server node, but will offer considerably more performance on workloads, as you can see from the data above. ®
