Original URL: http://www.theregister.co.uk/2010/04/13/xen_4_0_hypervisor/

Xen 4.0 adds Remus fault tolerance

Transcendental memory

By Timothy Prickett Morgan

Posted in Virtualization, 13th April 2010 23:31 GMT

The open source Xen project finally caught up to VMware in the release numbering wars, kicking out the 4.0 release of its eponymous hypervisor for servers and desktops.

Xen 4.0 follows Xen 3.4.2, which was announced last November, and Xen 3.4.1, which came out last August. Both are still being absorbed by Citrix Systems and Oracle (which have standalone, bare metal hypervisors built on Xen) and by Red Hat and Novell (which embed the Xen hypervisor inside their Linux variants so they can be used to virtualize desktops and servers).

The last major upgrade of the Xen hypervisor was in August 2008, when Xen 3.3 debuted and the Xen community started speaking for itself, separately from Citrix Systems. Citrix has a lot of control over the project and is trying to make money commercializing Xen in XenServer, XenDesktop, XenApp, and someday XenClient, all variations on the hypervisor theme.

With Xen 4.0, the hypervisor can now span 128 physical CPUs on the host, according to the release notes, with more than that available at compile time if you like CPU scalability.

The hypervisor can now address up to 1 TB of physical main memory, too. On the guest side, a virtual machine running atop Xen 4.0 can be allocated up to 128 virtual CPUs. (Memory capacities for VMs were not divulged.) Hot plugging of CPUs and memory in a physical server, without having to shut down the hypervisor or its VMs, is now supported, which is a much-needed feature. The ability to resize guest virtual disks without having to reboot VMs is also a plus.

The Xen 4.0 domain 0 where all of the hardware drivers run defaults to the Linux 2.6.31 kernel, which supports the new processors coming out from Intel and Advanced Micro Devices, but you can get out on the bleeding edge with the Linux 2.6.32 kernel or step back to the 2.6.18 kernel if you like to be back closer to the pommel.

Xen 4.0 also supports fault tolerance through a project called Remus, which is being developed at the University of British Columbia and which was at its 0.9 release level back in November when Xen 3.4.2 came out. (You can read a research paper on Remus here.)

The Remus software pairs a virtual machine on one physical server with a backup copy of it on a second, distinct server. It lets the primary VM run speculatively for a bit and then asynchronously ships a checkpoint of its state to the backup, which therefore trails the primary by a few tens of milliseconds. This is not quite as close as synchronous lockstepping, as you get in some fault tolerant hardware, but it is a lot closer to fault tolerance than relying on live migration of VMs off a machine before it fails, which takes minutes at best, not milliseconds, and lots of babysitting. Remus is a kind of constant live migration of VMs, so when something goes wrong, the backup system can pick up where the primary left off.
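For the curious, here is a rough Python sketch of what checkpoint-style replication looks like in miniature. It is a toy, not Remus code: the class names (ToyVM, ToyReplicator) and the steps_per_epoch parameter are invented for illustration, and the "VM" is just a dictionary. In this sketch it is the primary that runs ahead between checkpoints, and its output is held back until the backup has the matching state, which is how the Remus research paper describes the scheme.

```python
import copy


class ToyVM:
    """Hypothetical stand-in for a guest: its whole state is one dict."""

    def __init__(self):
        self.state = {"counter": 0}

    def step(self):
        # Do a unit of work and produce some externally visible output.
        self.state["counter"] += 1
        return f"packet-{self.state['counter']}"


class ToyReplicator:
    """Simplified checkpoint replication between a primary and a backup."""

    def __init__(self, vm, steps_per_epoch=3):
        self.vm = vm
        self.steps_per_epoch = steps_per_epoch
        # Last state the backup is known to hold.
        self.backup_state = copy.deepcopy(vm.state)
        self.buffered = []   # output held back until the next checkpoint
        self.released = []   # output the outside world has actually seen

    def run_epoch(self):
        # The primary runs ahead speculatively; nothing leaves the box yet.
        for _ in range(self.steps_per_epoch):
            self.buffered.append(self.vm.step())

    def checkpoint(self):
        # Copy state to the backup, then release the output it covers.
        self.backup_state = copy.deepcopy(self.vm.state)
        self.released.extend(self.buffered)
        self.buffered.clear()

    def fail_over(self):
        # Primary is gone: drop unreleased output, resume from the checkpoint.
        self.buffered.clear()
        survivor = ToyVM()
        survivor.state = copy.deepcopy(self.backup_state)
        return survivor


rep = ToyReplicator(ToyVM())
rep.run_epoch()
rep.checkpoint()
rep.run_epoch()              # the primary "dies" before the next checkpoint
survivor = rep.fail_over()
print(survivor.state, rep.released)
```

The point of buffering is that the outside world never sees output from work the backup does not know about, so the survivor's state and the released packets always agree.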

Xen 4.0 also supports the blktap2 driver for taking snapshots and cloning virtual hard disks (VHDs) where the Xen guests and their software stacks are encoded. The hypervisor has tweaks to allow for copy-on-write sharing of identical memory pages between multiple VMs on a server. It also supports what Xen calls transcendent memory (Oracle pun intended).
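Copy-on-write page sharing is easier to see in code than in prose. The Python sketch below is an illustration of the general technique only, not the Xen implementation: the SharedPageStore class, its method names, and the choice of SHA-256 for spotting identical pages are all assumptions made for the example.

```python
import hashlib


class SharedPageStore:
    """Toy model of deduplicated memory frames with copy-on-write."""

    def __init__(self):
        self.frames = {}     # frame id -> page contents
        self.refcount = {}   # frame id -> number of mappings
        self.by_hash = {}    # content digest -> shareable frame id
        self.next_id = 0

    def _alloc(self, data):
        fid = self.next_id
        self.next_id += 1
        self.frames[fid] = data
        self.refcount[fid] = 1
        return fid

    def map_page(self, data):
        # Reuse an existing frame if one already holds identical contents.
        digest = hashlib.sha256(data).hexdigest()
        fid = self.by_hash.get(digest)
        if fid is not None and self.frames[fid] == data:
            self.refcount[fid] += 1
            return fid
        fid = self._alloc(data)
        self.by_hash[digest] = fid
        return fid

    def write_page(self, fid, new_data):
        # Returns the frame id the writer holds after the write.
        if self.refcount[fid] > 1:
            # Shared frame: break the sharing with a private copy.
            self.refcount[fid] -= 1
            return self._alloc(new_data)
        # Sole owner: write in place and forget the stale digest.
        old_digest = hashlib.sha256(self.frames[fid]).hexdigest()
        if self.by_hash.get(old_digest) == fid:
            del self.by_hash[old_digest]
        self.frames[fid] = new_data
        return fid


store = SharedPageStore()
zero_page = b"\x00" * 4096
vm_a = store.map_page(zero_page)
vm_b = store.map_page(zero_page)          # identical, so it shares vm_a's frame
vm_b = store.write_page(vm_b, b"\x01" * 4096)  # the write gets a private copy
print(vm_a, vm_b, len(store.frames))
```

Two guests mapping the same zeroed page consume one frame until one of them writes, at which point only the writer pays for a second frame.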

Basically, tmem, as the Xen lingo calls it, is a way of gathering up the unused physical memory of the system and letting virtual machines grab it as a sort of extended memory to add to their own allocation of virtual memory. (Some days, it seems like every byzantine architectural trick that was ever done in physical systems to squeeze out performance will have to be replicated on virtual machines.)
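As a rough mental model of that idea, the Python sketch below treats the spare memory as a fixed-size pool that guests can stash pages in and ask for later. Everything here is hypothetical: the ToyTmemPool class, the put/get/flush method names, and the oldest-first eviction policy are assumptions for illustration and are not taken from the Xen release notes.

```python
from collections import OrderedDict


class ToyTmemPool:
    """Hypothetical pool of spare host pages that guests can borrow."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()   # (guest, key) -> contents, oldest first

    def put(self, guest, key, data):
        # Stash a page in the pool, evicting the oldest entries if it is full.
        if self.capacity <= 0:
            return False
        self.pages.pop((guest, key), None)
        while len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)
        self.pages[(guest, key)] = data
        return True

    def get(self, guest, key):
        # A miss (None) is normal: the guest then falls back to its own disk.
        return self.pages.get((guest, key))

    def flush(self, guest, key):
        # The guest no longer needs this page; free the slot.
        self.pages.pop((guest, key), None)


pool = ToyTmemPool(capacity_pages=2)
pool.put("vm1", "page-a", b"A")
pool.put("vm2", "page-b", b"B")
pool.put("vm1", "page-c", b"C")      # pool is full, so page-a is evicted
print(pool.get("vm1", "page-a"), pool.get("vm1", "page-c"))
```

The design choice worth noticing is that a guest can never rely on a page still being there, which is what lets the pool hand memory back whenever the host needs it.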

The updated 4.0 hypervisor from the Xen community has had its virtual I/O improved and now makes better use of the Intel VT-d and AMD IOMMU features in the latest Xeon and Opteron chips. VGA graphics cards can also be passed through, giving VMs direct access to the cards so they don't have to go through emulated drivers.

Xen 4.0 runs on Intel and AMD 32-bit and 64-bit processors, and it runs on Intel's 64-bit Itanium chips, for whatever that's worth. You need Intel VT or AMD-V virtualization features on the chip to run Windows guests, and PCI peripheral pass-through requires the chips to have the more elegant VT-d or AMD IOMMU I/O virtualization extensions. ®