Double-Take helps Microsoft Flex HPC muscles

Diskless iSCSI booties

Microsoft is trying to wiggle its way into HPC shops with its Windows HPC Server 2008 variant, which includes tools similar to those a Linux distro uses to support parallel supercomputing workloads.

Redmond apparently sees a nascent opportunity in the HPC space. Although the jury is still out about whether or not supercomputer users will ever sacrifice some computing power and memory to run hypervisors to carve up their cluster server nodes, HPC shops definitely want to be able to change operating systems and software stacks on their hundreds or thousands of server nodes more quickly than they can with sneakernet and grad students.

And that's the desire Microsoft wants to exploit. It looks like the R2 update of its HPC Server, which is still in beta and expected later this year, will bring HPC Server 2008 to something close to parity with a Linux HPC stack, and will also be able to support CPU-cycle harvesting from Windows 7-based PCs on the network. But taking on Linux in the HPC racket will require lots more capabilities.

So Microsoft approached Double-Take, one of its disaster-recovery software partners (which is in the process of being eaten by high-availability software vendor Vision Solutions for $242m), to adapt its Flex bare-metal provisioning and remote boot tools so it could support diskless provisioning of compute nodes over iSCSI links back to storage arrays for Windows HPC Server 2008. HPC shops are notorious cheapskates, so any piece of iron not in a server means less money spent and less heat, which in turn means more flops to do real work.

The resulting product, called Flex for HPC, can provision 100 diskless HPC nodes in under five minutes, according to Steve Marfisi, product manager for the Flex product line at Double-Take. Larger numbers take longer, and initial tests have shown it takes 90 minutes to juggle the images and fire them up for around 500 nodes. The reason is that the Windows cluster doesn't just boot from a raw OS image, but also has what Marfisi calls "differing disks", which Flex already had for its Windows replication software, so each node gets not only the raw image but also its own unique software and data. This personalization was created for the Flex virtual desktop provisioning tool, where PC images are booted off the network but each user is allowed some customization.
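For the curious, the general idea behind a shared image plus per-node changes can be sketched in a few lines of Python. This is only an illustration of a copy-on-write overlay, not how Flex for HPC actually does it; the `NodeDisk` class, the block numbering, and the `golden` image are all hypothetical names made up for the example.

```python
class NodeDisk:
    """Hypothetical per-node disk: a shared base image plus private changes."""

    def __init__(self, base_image):
        self._base = base_image  # shared by every node, never written to
        self._delta = {}         # blocks this node has changed

    def read(self, block):
        # Prefer the node's own copy; fall back to the shared image.
        return self._delta.get(block, self._base.get(block, b""))

    def write(self, block, data):
        # Writes land in the node's private overlay only.
        self._delta[block] = data


# One golden image, two nodes booting from it (all values are made up).
golden = {0: b"bootloader", 1: b"windows-hpc"}
node_a = NodeDisk(golden)
node_b = NodeDisk(golden)
node_a.write(1, b"node-a-config")
```

In this sketch `node_a` sees its own change while `node_b` and the golden image stay untouched, which is the property that lets many nodes share one stored OS image.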

Flex for HPC only works in conjunction with Windows HPC Server 2008 R2 because it needs the iSCSI provider that Microsoft put into the R2 stack; the prior HPC Server 2003 and 2008 editions did not have this iSCSI provider. The tool also makes heavy use of the APIs in the cluster head node to control which compute node gets updated and when, and Marfisi says the head node has a sophisticated wizard that lets you "step back and get a coffee". Flex for HPC costs $1,995 for 40 compute nodes, plus $50 for each additional node after that.
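For cheapskates doing the sums, the quoted pricing works out as below. This is a back-of-the-envelope sketch: the function name is made up, and it assumes a cluster smaller than 40 nodes still pays the full base price, which the quoted figures don't actually say.

```python
BASE_PRICE_USD = 1995       # covers the first 40 compute nodes
BASE_NODES = 40
EXTRA_NODE_PRICE_USD = 50   # each node beyond the first 40


def flex_for_hpc_price(nodes):
    """Estimated list price in US dollars for a cluster of `nodes` compute nodes."""
    if nodes < 1:
        raise ValueError("need at least one compute node")
    # Assumption: fewer than 40 nodes still costs the base price.
    extra_nodes = max(0, nodes - BASE_NODES)
    return BASE_PRICE_USD + extra_nodes * EXTRA_NODE_PRICE_USD


print(flex_for_hpc_price(100))  # 4995
print(flex_for_hpc_price(500))  # 24995
```

So the 100-node cluster from Marfisi's example would run $4,995 at list, and the 500-node test setup $24,995.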

What about Linux HPC nodes? Well, the existing Double-Take Flex product could be used to provide diskless Linux booting for each node on a cluster, but it lacks the head node wizards that Windows HPC Server 2008 R2 uses in conjunction with Flex for HPC to automate the booting of compute nodes. Wouldn't it be funny if you could make HPC Server 2008 R2 do remote booting of Linux images? We'll see if someone does a hack for that. ®

