Platform shared with CUDA in software bundle deal

Computing management software firm shares the load

Next gen security for virtualised datacentres

Grid and cluster computing management software company Platform Computing has inked a distribution agreement with graphics chip maker nVidia. It will see the HPC expert bundle nVidia's CUDA programming environment with its cluster management tools and integrate it into those tools.

The move will allow Platform Computing's Load Sharing Facility (LSF), the backbone of its open and closed source products, to dispatch applications to nVidia's Tesla GPU co-processors much as it dispatches work to regular CPUs inside HPC clusters. These tend to be built using x64 processors with either Gigabit Ethernet or InfiniBand interconnect, except in the most exotic supercomputer labs.

Platform is integrating the CUDA programming environment with two of its products: its open source Platform Cluster Manager and HPC Workgroup Manager, which adds in LSF, a workload management and dispatching tool. Workgroup Manager is a bundle of tools aimed at companies with 32 nodes or fewer and has lower prices than the regular Cluster Manager plus LSF tools. Cluster Manager is distributed for free, but one year of support costs $150 per server node and three years of support costs $300 per node.

HPC Workgroup Manager includes LSF and a one-year support contract for the bundle on a two-socket server node costs $250 and a three-year contract costs $640. There is no incremental charge for the CUDA development kit or support for it.

The CUDA kit that Platform is distributing includes an extended C runtime that allows routines to be dispatched to the 240-core, 1 teraflop Tesla GPUs instead of to central processors. BLAS, FFT, and a number of other math libraries are also added to the kit. It also includes a GPU hardware debugger, profiling tools, and code samples for dozens of popular HPC algorithms.

Platform Workgroup Manager takes the open source cluster manager and adds LSF, its graphical workload management tool, and its own message-passing interface (MPI) clustering protocol stack, which it bought from Scali.

As I explained back in March, when supercomputer maker Penguin Computing launched its own CPU-GPU bundles, nVidia needs to get the CUDA environment supporting C++ and Fortran to make it truly useful for HPC customers. There are some interfaces, so C++ and Fortran applications can tickle the GPUs.

As of the spring, nVidia had shipped over 100 million CUDA-compatible GPUs, although most of them were probably not in supercomputer clusters. The nVidia Tesla GPU co-processors are supported in both Linux and Windows environments.

The LSF tool provisions and monitors workloads running on clusters, and can do so down to the CPU core level on server nodes and down to the GPU level on the co-processors. This is thanks to the integration of the CUDA development kit with the Platform cluster manager products.

Tripp Purvis, vice president of business development at Platform, hints that other co-processors may get some platforms with the cluster management tools, including Advanced Micro Devices' Firestream, Intel's Larrabee, and IBM's Cell chips.

"As these technologies become more commercially available, we will pursue them," says Purvis.

Well, two out of three nVidia GPU alternatives mentioned above are technically available, but they are not exactly going mainstream like the Teslas seem to be doing. And Platform's support could, in fact, drive adoption of these alternatives and help foster a little competition.

AMD's Firestream SDK includes a programming language called Brook+, which is a Firestream-enabled version of the Brook open source C compiler.

IBM's Power-based Cell co-processors (technically known as the PowerXCell 8i chips and running at 3.2 GHz) have been around for a few years now. They are notably deployed on blades alongside of and linked to Opteron blades in the 1.1 petaflops "Roadrunner" supercomputer installed at the U.S. Department of Energy's Los Alamos National Laboratory. This is the fastest supercomputer in the world, at the moment.

Cray is giving it a run for the money with its own XT5 Opteron clusters. IBM has its own multicore acceleration SDK that is packaged up with the Cell processors, and Platform could integrate this today if it wanted to, much as it has done with nVidia's CUDA. IBM only supports Red Hat Enterprise Linux 5.2 and Fedora 9 with its Cell SDK, however, which limits its appeal somewhat.

According to a report this week in SemiAccurate, Intel is expected to put the 32-core Larrabee chips into machines as graphics co-processors during the "Haswell" generation of chips in 2012. That's three generations away from the current Nehalem cores used in PCs and servers.

There is also talk that Larrabee will debut early next year in external graphics cards in a 32-core chip rated at around 2 teraflops of single-precision floating point performance, and hence could appear in GPU-based co-processors for HPC workloads.

No money is changing hands between Platform and nVidia. Platform wants to leverage integration to sell its software, and nVidia wants the company to do that to help peddle its GPU co-processors. The HPC resellers and cluster makers obviously need to have their own partnership with nVidia and the Tesla products to be able to support Platforms's integrated CUDA environment. ®

The essential guide to IT transformation

More from The Register

next story
The Return of BSOD: Does ANYONE trust Microsoft patches?
Sysadmins, you're either fighting fires or seen as incompetents now
Microsoft: Azure isn't ready for biz-critical apps … yet
Microsoft will move its own IT to the cloud to avoid $200m server bill
US regulators OK sale of IBM's x86 server biz to Lenovo
Now all that remains is for gov't offices to ban the boxes
Death by 1,000 cuts: Mainstream storage array suppliers are bleeding
Cloud, all-flash kit, object storage slicing away at titans of storage
Oracle reveals 32-core, 10 BEEELLION-transistor SPARC M7
New chip scales to 1024 cores, 8192 threads 64 TB RAM, at speeds over 3.6GHz
VMware vaporises vCHS hybrid cloud service
AnD yEt mOre cRazy cAps to dEal wIth
El Reg's virtualisation desk pulls out the VMworld crystal ball
MARVIN musings and other Gelsinger Gang guessing games
Object storage bods Exablox: RAID is dead, baby. RAID is dead
Bring your own disks to its object appliances
prev story


Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
7 Elements of Radically Simple OS Migration
Avoid the typical headaches of OS migration during your next project by learning about 7 elements of radically simple OS migration.
BYOD's dark side: Data protection
An endpoint data protection solution that adds value to the user and the organization so it can protect itself from data loss as well as leverage corporate data.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
High Performance for All
While HPC is not new, it has traditionally been seen as a specialist area – is it now geared up to meet more mainstream requirements?