Microsoft, nVidia tag team on HPC
Making GPUs sit up and sort
Employing graphics chips as co-processors to do tough computing tasks is not as simple as plugging in some electronics, adding a few libraries of code, and letting it rip. But it ought to be something like that, which is why graphics chip maker nVidia and desktop and server operating system maker Microsoft are working together to make GPUs more useful for Windows boxes.
In some cases, the work that Microsoft is doing will apply not only to nVidia's Tesla family of GPU co-processors, but also to Advanced Micro Devices' FireStream alternatives. In other cases, according to Sumit Gupta, senior product manager at nVidia, the work is precisely about tuning Windows PCs and servers to squeeze the most performance possible out of Tesla co-processors in particular. It's about more than making Windows operating systems see Tesla GPUs, which they already do. The collaboration is about making applications fully use the substantial mathematical capabilities of GPUs.
nVidia's Tesla co-processors are already supported by Windows XP and Windows Vista on the desktop and Windows Server 2003 and Windows Server 2008 on the server. (You don't need the Windows HPC Server 2008 variant to make use of GPUs.) And the forthcoming Windows 7 operating system, which debuts on October 22, will have GPUs automatically enabled through Microsoft's DirectCompute APIs. These are an add-on to the DirectX 10 and DirectX 11 graphics APIs that have been woven into Windows Vista and Windows Server 2008, and they expose the GPU so that applications can use it automatically to perform various mathematical calculations.
A lot of the brain work being done by the two companies as part of the collaboration announced today is actually being performed at Microsoft Research, which has just installed a parallel x64 server cluster running Windows and a whole bunch of Tesla GPUs. (The exact configuration has not been divulged by Microsoft Research.) One project at the software giant involves tweaking the sorting algorithms used in various calculations so that they exploit GPUs and run faster.
One example takes algorithms popularly used in databases for sorting data culled from fields and gooses them with the GPU. Another takes algorithms for data mining and enables segments of the code to be accelerated by the GPUs. Microsoft has also tuned applications that make use of fast Fourier transforms (FFTs) to run using the Tesla GPUs (these tunings are only available for Tesla GPUs, not FireStreams). FFTs underpin much of digital signal processing.
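To give a flavor of why sorting maps well onto a GPU, here is a minimal sketch of a bitonic sorting network, a textbook data-parallel sort. Microsoft Research has not said which algorithms it is tuning, so this is an illustration only, written in plain Python on the CPU, and the function name `bitonic_sort` is our own. The point is the shape of the loops: within each pass, every compare-and-swap touches a distinct pair of elements, so on a GPU each one could be handed to its own thread.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values`; the length must be a power of two."""
    data = list(values)
    n = len(data)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # distance between the two elements being compared
        while j >= 1:
            # Each index i pairs with exactly one partner, so the iterations
            # of this loop do not depend on one another. That independence is
            # what lets a GPU run them all at once.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

Calling `bitonic_sort([7, 3, 6, 1, 8, 2, 5, 4])` returns the eight values in ascending order. A real GPU version would replace the inner `for` loop with a kernel launch, which is where the tuning effort goes.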
In some cases, Microsoft's researchers are using nVidia's CUDA programming environment and its C compiler extensions to create GPU-enabled code, and in other cases Microsoft is just working through the DirectCompute APIs.
Another thing the two are working on is bringing their respective tunings for the Linpack parallel-computing benchmark in sync. nVidia has an implementation of the Linpack Fortran benchmark that it has created using the CUDA environment, and Microsoft has a tuned version of the Linpack test for Windows HPC Server 2008.
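For readers who have not met it, Linpack times how quickly a machine can solve a dense system of linear equations. The sketch below shows that core job in its simplest form, Gaussian elimination with partial pivoting, in plain Python. It is not nVidia's CUDA implementation or Microsoft's tuned version, neither of which is described in detail here, and the name `solve_dense` is hypothetical. Production codes work on blocks of the matrix at a time, and those block updates are the part a GPU can chew through.

```python
def solve_dense(a, b):
    """Solve a * x = b for x using Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix [a | b] so the caller's data is left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: bring the largest remaining entry in this column
        # to the diagonal to keep the arithmetic numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]

        # Eliminate this column from every row below the pivot. The row
        # updates are independent of each other, which is the parallel work.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x
```

As a check, the system 2x + y = 3 and x + 3y = 5 has the solution x = 0.8, y = 1.4, which is what `solve_dense([[2, 1], [1, 3]], [3, 5])` produces to within floating-point rounding.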
nVidia has done some coding of its own as well, with its research arm creating some applications for Windows HPC Server 2008 that make use of Tesla cards, including a ray-tracing application for doing photorealistic rendering.
What Microsoft and nVidia have not announced is any plan to bundle the nVidia CUDA programming environment for Teslas into the workstation versions of Windows 7 or Windows Server 2008, or at least into the HPC variant of the server platform. This would be an obvious move, if only to make life easier for people who want to use GPUs as they create and run HPC applications.
"That's a good roadmap item for us to consider," says Gupta, not tipping his hand as to whether the two companies are contemplating such a move.
Grid and cluster computing software maker Platform Computing was smart enough to see this was an obvious way to leverage the increasing popularity of GPUs in the HPC realm, and announced its own bundling deal for CUDA back in August. Platform really had no choice but to do some integration with CUDA and its GPUs, since its Load Sharing Facility workload manager for HPC clusters has to be able to dispatch work to the GPUs just as it has done for a long time for central processors on server nodes in a cluster.
If Microsoft and nVidia want to do something else useful, they could get the CUDA environment up to speed supporting C++ and Fortran compilers. Right now, there are some interfaces into CUDA for these two languages, but C is the only language with full support of the GPUs. ®