Fun and games in userland
A brief history of virtualisation
There's only one kernel, and generally, in most systems, there's only one big program running in user space. Sometimes it's called "userland", and it encompasses all the bits that you actually interact with.
As far as the kernel is concerned, userland can effectively be considered as one big program. One original parent process – on Unix boxes, traditionally called init – starts up all the rest and thus is the parent of the whole tree of dozens to hundreds of others.
So if you set things up so that the kernel is able to run more than one userland at a time, you can effectively virtualise the whole visible face of the OS. So long as the primary copy stays in control of the filesystem and the secondary copies are penned up in subdirectories, you can suddenly split your computer into multiple identical “virtual environments” (VEs).
There’s only one copy of the actual OS installed and only one kernel running, but you can have lots of separate root directories and install whatever you like in each of them without it affecting the others.
Each one starts with a skeletal copy of the filesystem with just the essential files it needs – which is what you do with the chroot command anyway – and then it can put whatever it likes, wherever it likes, and it all stays neatly penned up and separate from all the other software on the computer.
This is called “operating-system level virtualisation” or “kernel-level isolation.”
Every sysadmin's dream
No more “DLL hell,” no more clashing system requirements, no more trying to untangle which directories or files belong to which app. Apps are completely isolated, simplifying management – for instance, they can be removed without a trace, as every file the app ever wrote to disk is locked inside its VE.
So far, so good. Sounds like running a few Windows VMs on a server, doesn’t it?
But it isn’t. Because at the same time, there’s only a single install of a single OS to configure, patch and update; one set of device drivers; and rather importantly if you’re running a commercial OS, one licence to pay for.
No more “DLL hell,” no more clashing system requirement
To anyone who’s ever been a sysadmin, it’s a dream come true. Every app on the machine is locked away from every other one in its own little walled garden.It’s also very different from a performance or administration point of view – each VE is equivalent to just a program, rather than a whole OS instance. You don’t need to allocate storage to VEs – they all share the same pool of memory and disk space, managed by the kernel, so it is vastly more efficient to run multiple VEs than full VMs.
A dozen copies of a full OS under a hypervisor means thirteen times the hardware resources needed by one – so suddenly you need a dual-socket eight-core server with 32GB of RAM to make it all work. Not so with VEs – it only needs the resources for one OS plus the dozen apps.
It must be admitted that this approach doesn’t work for every type of program. If an application has to modify the kernel or change its behaviour, or needs kernel privileges, then you can’t run it inside a virtual environment, because that would affect all of them – so certain apps need to run in the base or parent environment. You can’t just install absolutely anything alongside anything else. Some don’t get along and won’t share.
But overall, VEs are a very useful and powerful tool. The snag is that at the moment, you only get this if you’re running a version of Unix with long trousers.
Solaris 10 calls VEs Zones, management by Containers, and it’s a very powerful implementation. For instance, each zone can have its own network interfaces and sets of user IDs.
Zones can be bound to particular processors for performance optimisation, but they don’t need a dedicated one, nor must they be allocated any memory. The OS not only supports zones offering the native Solaris API but also ones emulating older versions of Solaris as well as Linux-branded zones.
AIX 6.1 does it, too; IBM call them “workload partitions” – WPARs for short – as opposed to LPARs, IBM’s name for full-system virtualisation, as we described in the previous article. WPARs offer several levels of isolation, from some shared resources to none, right down to a single process. A running WPAR can even be migrated onto a different host server.
And for the Free Software user, FreeBSD offers "jails". Jails don’t have all the bells and whistles of their commercial rivals – you just get multiple instances of the same version of the same OS – but are still a very useful tool. Most of the other BSDs shared a similar implementation under the name "sysjails", but an insecurity in their implementation caused development to stop in 2009.
With Linux, the situation is more complex. VEs are not a standard part of the Linux kernel, but there are multiple competing tools offering variations on the same functionality. The newest and possibly simplest is LXC (“Linux Containers”), which builds on the cgroups functionality that’s been built into the kernel since 2.6.29. Rather more mature is Linux-VServer, which is sufficiently robust to allow other distributions’ userlands to be started inside a VE.
Probably the most capable for now is OpenVZ, which allows VEs to have their own network and I/O devices. OpenVZ is the basis of Parallels’ commercial Virtuozzo Containers product and its development is sponsored by Parallels.
Aimed at service providers, Virtuozzo builds on OpenVZ with additional management and provisioning tools. It can support a higher density of containers with closer management of their resources and it integrates with Parallels’ Plesk management tools.
Interestingly, Virtuozzo also runs on Windows. Parallels is not as well-known in PC virtualisation circles as it is on Linux and on the Mac, where its Parallels Desktop product brought several new features to Mac users wishing to run Windows. Virtuozzo Containers for Windows brings Unix-style partial virtualisation to Microsoft’s platform.
Each container takes only about 60MB of files, but appears from the management console to be a complete, independent machine – you can even assign it its own IP address and connect to it with Remote Desktop. Obviously, all the containers on a host run the same OS as the host itself, but the memory and disk footprint is dramatically reduced as you're only running a single OS instance.
Virtuozzo or something like it might yet cause a small revolution in PC virtualisation. If it does, going by the company's history, it’s likely that the technique will be imitated by Microsoft itself. Partly because that's what it's currently doing with Hyper-V, which is progressively acquiring more and more of the features of VMware’s VSphere and VCenter management tools. Mostly, though, because Microsoft is in the best position to incorporate OS-level virtualisation into its own OS.
To be fair, the notion of OS-level virtualisation is not a Parallels innovation – as we've discussed, it's been around for more than a decade and has been implemented in multiple OSs. Parallels is just the first company to make it happen on Windows. If the concept were to catch on in the Windows world, it would make virtualisation a great deal simpler, faster and more efficient.
In the next part of this series, we will look at the state of the virtualisation market today – and in the final one, where it might go next. ®
In the next part of the series, we explore corporate IT's fascination with virtualisation.
Sponsored: Becoming a Pragmatic Security Leader