Virtualization and HPC - Will they ever marry?
SC08 Server virtualization has spent the past several decades moving out from the mainframe to Unix boxes and then out into the wild racks of x64 servers running Windows, Linux, and a smattering of other operating systems in the corporate data center. The one place where virtualization hasn't taken off is in high performance computing (HPC) clusters.
And for good reason. But hardware costs continue to plummet, putting hundreds of teraflops of raw parallel x64 computing power within reach of medium-sized businesses, startups, academic institutions, research facilities, and the other places where HPC clusters end up - and at a relatively modest price. Given that, the system administration demands on HPC labs and the desire for more flexibility may possibly - and I mean possibly - drive the adoption of server virtualization technologies in this subsegment of the server space.
Roughly speaking, HPC clusters account for about a fifth of the shipments of x64 server boxes each quarter. And according to IDC, in 2007 HPC boxes of all types - including vector, cluster, and other types of gear - accounted for $10.1bn in sales (revised downward from an initial $11.6bn estimate that came out in March of this year). That gives HPC an 18.6 per cent slice of the $54.4bn in server sales in 2007 - again, about a fifth of the pie.
But the interesting bit is that if you take HPC machines out of the picture, general-purpose server sales would have been nearly flat for 2007. And equally importantly, because HPC boxes almost never run hypervisors, taking them out of the mix would push the virtualization adoption rate on new server sales a little higher than the broader market stats cited by Gartner and IDC.
HPC customers, as a rule, do not use server virtualization because of the overhead this software imposes. The benchmark tests that server virtualization vendors such as VMware are beginning to use - I am thinking here of VMmark, but also the two-year-old SPEC virtualization benchmark effort that has yet to bear fruit - do not show the overhead their hypervisors impose.
When the x64 platform first got virtualization hypervisors a number of years ago, the performance penalty was as high as 50 per cent on some workloads. Even after hardware features to support virtualization were added to the x64 chips from Intel and Advanced Micro Devices, the overhead is widely believed to be in the range of 10 to 20 per cent. But seeing as there are no independently available tests, customers really have to do their own benchmarks. And by the way, the terms of the ESX Server licensing agreement from VMware apparently do not allow people to publish the results of benchmark tests.
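If you do want a rough do-it-yourself number, timing the same CPU-bound kernel on bare metal and then inside a guest will get you in the ballpark. Here is a minimal Python sketch of that approach - the workload and iteration count are arbitrary choices for illustration, not anything VMware or SPEC prescribes:

    import time

    ITERATIONS = 5_000_000

    def kernel():
        # CPU-bound floating-point loop; substitute your own
        # representative workload here.
        total = 0.0
        for i in range(1, ITERATIONS):
            total += i * 1.000001 / (i + 1)
        return total

    # Take the best of several runs to smooth out noise.
    times = []
    for _ in range(5):
        start = time.perf_counter()
        kernel()
        times.append(time.perf_counter() - start)

    print(f"best of 5 runs: {min(times):.3f}s")
    # Run once on bare metal and once inside the guest; the ratio
    # of the two best times is a rough overhead figure.

Bear in mind that a single-threaded loop like this mostly measures raw CPU virtualization cost; I/O-heavy and memory-intensive workloads are where hypervisor overhead tends to bite hardest.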
HPC will be the making of Virtualisation!
Without a doubt, the current and next generation of virtualisation hypervisors have little to offer HPC, because the management objectives are currently orthogonal: virtualisation is focused exclusively on sharing resources. But there are common requirements for provisioning, configuration, acquisition and release.
Virtualisation will be mature when it offers benefits to HPC:
1. When the hypervisor is not just a master scheduler, but allocates dedicated resources in the way PR/SM partitioned Amdahl mainframes more efficiently than VM/CMS did.
2. When the hypervisor can act as a loader and hand the full machine to the client OS, in the way DOS did for Windows or V=R did in VM/ESA, with a wakeup handle left in the hypervisor-aware client OS.
3. When the hypervisor can discover peers and hyper-hypervisors on discovered networks, and discover topologies and capabilities (a sketch of that discovery step follows below this list).
When the time comes that the hypervisor is embedded like the BIOS, supporting VMs, partitioning and booting, the hypervisor will be the kernel that even HPC is built on.
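To make point 3 concrete, here is a minimal Python sketch of the sort of peer discovery described above: a node broadcasting its presence and capabilities on the local subnet and collecting replies. The port number, message format and capability fields are invented for illustration - no real hypervisor speaks this protocol:

    import json
    import socket

    PORT = 50007  # arbitrary port chosen for this example

    # Hypothetical capability advertisement for this node.
    ANNOUNCE = json.dumps({
        "role": "hypervisor",
        "cores": 16,
        "memory_gb": 64,
    }).encode()

    def announce():
        # Broadcast our presence to any listeners on the subnet.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(ANNOUNCE, ("<broadcast>", PORT))
        sock.close()

    def listen(seconds=5.0):
        # Collect announcements from peers running the same script.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", PORT))
        sock.settimeout(seconds)
        peers = []
        try:
            while True:
                data, addr = sock.recvfrom(4096)
                peers.append((addr[0], json.loads(data)))
        except socket.timeout:
            pass
        return peers

    if __name__ == "__main__":
        announce()
        for host, caps in listen():
            print(host, caps)

The discovery step itself is not exotic - what is missing from today's hypervisors is the topology exchange and scheduling semantics that would have to sit on top of it.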
getting the most out of HPC
I remember a talk I went to a few years back, given by the guys who run our HPC farm. We have hundreds of users. One way to optimize performance is to run multiple jobs on each CPU. In their testing they found that about 5-6 jobs per CPU is optimal, depending on the processor.
Say each job takes 1 hour and you have 5,000+ jobs in your queue, with each user submitting hundreds of jobs to the farm. The user may expect his or her batch of jobs to take many days. Typically the hold-up in any one job is disk I/O. The idea of multiple jobs on single CPUs is that if one job gets held up by disk I/O, the other four jobs running on the CPU increase their share of it. The rough arithmetic:
1 job per CPU * 6 CPUs = 7 CPU-hours to clear 6 jobs (each CPU idles while its one job waits on disk)
6 jobs per CPU * 1 CPU = 6.5 CPU-hours (one job's I/O wait is soaked up by the others' compute)
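The overlap effect is easy to model. Here is a small Python sketch with made-up figures in the same spirit - 1 hour of CPU work per job plus 10 minutes of disk waiting - rather than the farm's actual measurements:

    # Toy model of I/O-overlapped job scheduling. All figures are
    # illustrative, not measurements from the farm described above.
    N_JOBS = 6
    COMPUTE_H = 1.0    # hours of pure CPU work per job
    IO_WAIT_H = 1 / 6  # hours each job spends blocked on disk I/O

    # One job at a time: the CPU idles through every I/O wait.
    serial = N_JOBS * (COMPUTE_H + IO_WAIT_H)

    # All six jobs multiprogrammed on one CPU: while one job waits on
    # disk another computes, so in the best case only the final I/O
    # wait is left uncovered.
    overlapped = N_JOBS * COMPUTE_H + IO_WAIT_H

    print(f"one job at a time: {serial:.2f} CPU-hours")
    print(f"six jobs per CPU : {overlapped:.2f} CPU-hours")

The idealised best case comes out around 6.17 CPU-hours, a bit better than the 6.5 quoted above - which is about what you would expect once context switching and imperfect overlap are added back in.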
The key ingredient in this approach is a decent kernel scheduler (no prizes for guessing what OS we use/don't use). Anything like virtualisation is pointless CPU-hogging overhead, suggested by management types who want to use buzzwords without really knowing what they're on about.
The university where I work is trialling virtualisation running on top of labs full of Windows desktop machines to provide UNIX HPC grid functionality, without the user interruption or the control/management hassle of dual-booting large numbers of PCs. By all accounts it appears to be a reasonably functional and easily managed model for augmenting our other grid nodes (for suitable workloads, of course).
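The moving parts in a setup like this are mundane. A minimal sketch, assuming VMware's vmrun command-line tool and a hypothetical guest image path (the comment doesn't name the product the university actually uses):

    import subprocess

    # Hypothetical path to the Linux grid-node guest on a lab PC.
    VMX_PATH = r"C:\grid\gridnode.vmx"

    def start_grid_node():
        # 'vmrun start <vmx> nogui' boots the guest headless, so the
        # desktop user never sees a console window.
        subprocess.run(["vmrun", "start", VMX_PATH, "nogui"], check=True)

    def stop_grid_node():
        # 'soft' asks the guest OS to shut down cleanly, giving grid
        # jobs a chance to checkpoint or finish.
        subprocess.run(["vmrun", "stop", VMX_PATH, "soft"], check=True)

    if __name__ == "__main__":
        start_grid_node()

Scheduling when to start and stop the guest - out of hours, or when the console is idle - is where the real policy work lives.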