
Cray XK6 super mates Opterons with Nvidia GPU workhorses

Ceepie-geepies all around

Not waiting on Kepler

As El Reg previously reported, Oak Ridge National Laboratory – one of the US Department of Energy's big nuke/supercomputing labs in Tennessee – had already let slip two months ago that it was building a whopping 20-petaflops machine, called Titan and using a hybrid CPU-GPU design, that would start rolling into its data center later this year with full operation in 2012.

While neither Cray nor Oak Ridge has confirmed this, it is reasonable to presume that Titan is actually an XK6 super. It could also be a variant of it, perhaps with a higher ratio of GPUs to CPUs, which Bolding says will be available through Cray's Custom Engineering unit.

Nvidia's Tesla X2090 GPU coprocessor

Appro International, Dell, HP, and Super Micro have all announced hybrid CPU-GPU racks and blades that have a much higher ratio of GPUs to CPU sockets. More than a few of these vendors have said that, ideally, what customers want is one GPU per x64 core. Bolding says that for now, given the economics of CPUs and GPUs and the lack of workloads using GPUs, a one-socket-to-one-GPU ratio is best for the high-end customers Cray sells to. (This is the ratio that IBM used in the petaflopping "Roadrunner" Opteron-Cell hybrid machine, the first box to break through that barrier.)

The goal with the XK6 ceepie-geepie blades was to keep the XK6 in the same thermal envelope as the XE6. If Cray could do that, then it would not have to do anything special in terms of packaging or cooling to use the GPUs. Cray met that design goal, and you can pull out an XE6 blade and slide in an XK6 blade and nothing is going to melt, even though the X2090 GPU is rated at 225 watts peak. Companies can mix and match XE6 and XK6 blades in a single system, and Bolding says that many customers will do that so they can support different workloads.

The cabinets used in the XE6 and XK6 supercomputers can house up to 24 blades and have a peak-power rating of over 50 kilowatts as their design point, according to Bolding. That, however, is not how much power they will draw in the field – that will depend on how the applications hit the CPUs and GPUs, of course.

The fact that Cray is waiting for the Opteron 6200 processors for the XK6 and not just shipping this box right now with the current Opteron 6100s suggests that Nvidia is not actually ready to ship the Tesla X2090 GPU in the volumes that Cray needs. (Technically speaking, it has not even been announced yet.) It also suggests that while Cray has confidence in AMD's delivery of the Opteron 6200 processors, the future "Kepler" GPU expected by the end of this year is not going to be ready for whatever time Cray wanted to put it into the XK6 super.

Cray cannot, of course, give out any peak performance ratings on the Opteron 6200 side of the card, but each of the Tesla X2090 GPU cards is rated at 665 gigaflops doing double-precision floating-point operations. Each blade has four GPUs and there are 24 blades in a cabinet, so that gets you to 63.8 teraflops per cabinet just on the GPU side of the XK6 ceepie-geepie.
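The per-cabinet GPU number is plain multiplication, and a few lines make it easy to check. This is only a sketch using the figures quoted above; the constant and function names are ours, not Cray's or Nvidia's.

```python
# Figures quoted in the text above.
GFLOPS_PER_GPU = 665       # Tesla X2090 peak, double precision
GPUS_PER_BLADE = 4
BLADES_PER_CABINET = 24


def gpu_teraflops_per_cabinet(gflops_per_gpu=GFLOPS_PER_GPU,
                              gpus_per_blade=GPUS_PER_BLADE,
                              blades_per_cabinet=BLADES_PER_CABINET):
    """Peak GPU-only teraflops for one fully populated cabinet."""
    gigaflops = gflops_per_gpu * gpus_per_blade * blades_per_cabinet
    return gigaflops / 1000.0  # gigaflops -> teraflops


print(f"GPU side: {gpu_teraflops_per_cabinet():.1f} teraflops per cabinet")
```

Swap in a different per-GPU rating to see how a future part would move the cabinet total.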

The current XE6 blades using the 12-core Opteron 6100s are rated at around 90 to 100 gigaflops per socket, according to Bolding. Add four cores and goose the clock speed a bit on the chips and you might be able to push it up to 160 gigaflops – or at least that is El Reg's guess. That works out to 640 gigaflops of CPU number-crunching power for four Interlagos sockets, which is pretty good compared to the 800 gigaflops that the all-Opteron, eight-socket XE6 blade can deliver.

Add it up, that's somewhere around 3.5 teraflops per XK6 blade of peak floating-point oomph, or something on the order of 83 teraflops per cabinet. With 200 cabinets – roughly the size of the "Jaguar" machine down at Oak Ridge – you are talking about something north of 16.6 petaflops, or nearly an order of magnitude better performance than Jaguar.
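For anyone who wants to redo the sums, here is a back-of-envelope sketch built only from the component figures quoted earlier. The 160 gigaflops per socket is a guess rather than a vendor rating, and the names are ours. A straight sum comes out a shade under the round numbers above, at roughly 3.3 teraflops per blade, which is well within the slop of a guessed CPU rating.

```python
GPU_GFLOPS = 665          # Tesla X2090 peak, double precision (quoted above)
CPU_GFLOPS_GUESS = 160    # per Opteron 6200 socket: a guess, not a vendor figure
GPUS_PER_BLADE = 4
CPUS_PER_BLADE = 4
BLADES_PER_CABINET = 24


def blade_teraflops(cpu_gflops=CPU_GFLOPS_GUESS):
    """Peak teraflops of one XK6 blade: four GPUs plus four CPU sockets."""
    gigaflops = GPUS_PER_BLADE * GPU_GFLOPS + CPUS_PER_BLADE * cpu_gflops
    return gigaflops / 1000.0


def cabinet_teraflops(cpu_gflops=CPU_GFLOPS_GUESS):
    """Peak teraflops of a cabinet full of XK6 blades."""
    return blade_teraflops(cpu_gflops) * BLADES_PER_CABINET


def system_petaflops(cabinets, cpu_gflops=CPU_GFLOPS_GUESS):
    """Peak petaflops of a system with the given number of cabinets."""
    return cabinet_teraflops(cpu_gflops) * cabinets / 1000.0


print(f"blade:        {blade_teraflops():.2f} teraflops")
print(f"cabinet:      {cabinet_teraflops():.1f} teraflops")
print(f"200 cabinets: {system_petaflops(200):.2f} petaflops")
```

Passing a higher `cpu_gflops` shows how sensitive the totals are to the CPU guess; the GPUs dominate either way.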

With the 3D torus of the Gemini interconnect, Cray can build out as far as 300 cabinets, so a ceepie-geepie super could hit around 25 petaflops. Replace the Tesla X2090s with some Nvidia Kepler equivalents, and a 300-cabinet XK6 super could deliver something on the order of 5.8 petaflops on the Opteron side and 29 petaflops on the Tesla side. You only need something like 15 megawatts of power for that behemoth XK6 supercomputer alone, and maybe 25 megawatts for the whole data center.
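Scaling by cabinet count is the last step, sketched here with the rounded 83 teraflops per cabinet used above. The power line is a loud assumption: it takes the cabinet design point mentioned earlier as a flat 50 kilowatts (the quoted figure is "over 50") and ignores everything outside the cabinets, so treat it as a ballpark only.

```python
TERAFLOPS_PER_CABINET = 83   # rounded per-cabinet estimate from the text
KILOWATTS_PER_CABINET = 50   # assumption: flat 50 kW; the quoted design point is "over 50"


def system_petaflops(cabinets, teraflops_per_cabinet=TERAFLOPS_PER_CABINET):
    """Peak petaflops for a machine with the given number of cabinets."""
    return cabinets * teraflops_per_cabinet / 1000.0


def system_megawatts(cabinets, kilowatts_per_cabinet=KILOWATTS_PER_CABINET):
    """Design-point power in megawatts, counting cabinets only."""
    return cabinets * kilowatts_per_cabinet / 1000.0


for cabinets in (200, 300):
    print(f"{cabinets} cabinets: {system_petaflops(cabinets):.1f} petaflops peak, "
          f"{system_megawatts(cabinets):.0f} megawatts at design point")
```

On these inputs, 300 cabinets at the design point lands in the same neighborhood as the 15 megawatts cited above.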

The launch customer for the XK6 ceepie-geepie super is not going to be Oak Ridge, by the way, but instead the Swiss National Supercomputing Centre, which is also taking delivery of the first XMT-2 massively multithreaded super, based on the Gemini interconnect and Cray's future ThreadStorm-2 processor.

One more thing: in case you are wondering, the new Cray box is called the XK6 because when they looked at XG6, the font for the G and the 6 looked weird next to each other. ®
