VMware: We're gonna patent hot-swapping your VMs' host OS

Changing operating system updates forever

Comment VMware looks set to renew its relevancy with a new patent application. The patent application lists inventors Mukund Gunti, Vishnu Sekhar and Bernhard Poess and assigns the patent to VMware. The short version is that, if the patent is granted, VMware will have effectively patented the ability to hot swap a host server's operating system underneath running virtual machines and other applications.

If I am reading the patent application correctly, VMware has the potential to patent the entire future of IT here. Virtualization was cute and all, but pulling the OS out from under a running application and putting a new one in its place is a bit holy-grail-ish, and represents – amongst other things – the first crack at designing a solution for containers that looks like it's actually ready to be used by grown-ups.

Let's rewind a bit.

The problem to be solved

As a general rule, a computer can only run one operating system at a time. Virtualization as we know it today is still just one operating system – the hypervisor – which runs a bunch of applications (virtual machines) which in turn run other applications and so forth. You can go down that rabbit hole as many layers as you choose, but at the end of the day there is still only one operating system in charge of the hardware.

Sort of. The IOMMU is used to grant virtual machines direct access to some of the actual underlying hardware. This makes things like physically assigning a NIC or a GPU to a VM possible without a bunch of nasty translation layers.

That's cool and all, but eventually you need to update the hypervisor. To do this in today's world you'd need to migrate off all the VMs to another system, update the hypervisor, and migrate the VMs back. This not only requires that everyone keep spare hardware around, but it also means that VMs or applications reliant on specific hardware (such as a passed-through GPU or NIC) can have problems if those resources aren't available on the temporary host.

Virtualization is only the start of this problem. Maybe you run a couple dozen or even a few hundred virtual machines on one of today's servers. You can run thousands of workloads on that same server if you use containers. And containers don't migrate to another host as easily as VMs.

This is before we discuss bare metal operating systems, which are still in use today, and run critical workloads. Your storage array is probably an example.

Nobody wants to reboot these machines for updating, but unless we want interesting web nasties to crawl into our infrastructure and pwn us, update we must. Today, that means more spare systems than we really need.

Solving the problem

What if you didn't have to reboot the computer to update the operating system? Not a VM operating system. Not something in a container. The bit on the metal. The bit that when you turn the computer on, boots first. What if you didn't have to reboot in order to update that?

That's VMware's patent application, right there.

On the "how", the really short version is this:

1) When the operating system boots, grab all the POST info that is normally handed off to the bootloader and tuck it away.
2) When you want to swap in an updated operating system, partition off some physical resources (a CPU, some NICs, RAM, etc), potentially using hardware tricks like the IOMMU.
3) Boot a second operating system on that partitioned-off hardware, using the second CPU, by feeding the bootloader the POST info captured during the prime operating system's boot.
4) Migrate running applications, VMs and/or containers to the newly booted operating system.
5) Terminate the prime operating system and return its resources to the new operating system.
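To make the sequence concrete, here is a toy Python model of those five steps. It is purely a sketch of the bookkeeping as I read it, not VMware's implementation: the `Machine` and `HostOS` classes, the resource names and the `memory_map` key are all made up for illustration, and nothing here touches real hardware.

```python
from dataclasses import dataclass, field


@dataclass
class HostOS:
    """Hypothetical stand-in for an OS instance running on the metal."""
    version: str
    resources: set = field(default_factory=set)
    workloads: list = field(default_factory=list)


class Machine:
    """Hypothetical model of one physical box and the OS that owns it."""

    def __init__(self, resources, firmware_boot_info):
        self.all_resources = set(resources)
        self.firmware_boot_info = dict(firmware_boot_info)
        self.saved_boot_info = None
        self.active = None

    def cold_boot(self, version):
        # Step 1: tuck away the boot-time info normally handed to the bootloader.
        self.saved_boot_info = dict(self.firmware_boot_info)
        self.active = HostOS(version, set(self.all_resources))
        return self.active

    def hot_swap(self, new_version, carve_out):
        if self.saved_boot_info is None:
            raise RuntimeError("no boot info was captured at cold boot")
        old = self.active
        carve_out = set(carve_out)
        if not carve_out <= old.resources:
            raise ValueError("cannot partition off resources the old OS does not own")

        # Step 2: partition off some hardware from the running OS.
        old.resources -= carve_out

        # Step 3: "boot" the second OS on the carved-out hardware, replaying
        # the saved boot info instead of going through a real POST.
        replayed_boot_info = dict(self.saved_boot_info)
        assert replayed_boot_info == self.firmware_boot_info
        new = HostOS(new_version, carve_out)

        # Step 4: hand the running workloads to the new OS. They stay on the
        # same box; only their owner changes.
        new.workloads.extend(old.workloads)
        old.workloads.clear()

        # Step 5: terminate the old OS and return its resources to the new one.
        new.resources |= old.resources
        old.resources.clear()
        self.active = new
        return new
```

Usage would look something like `m = Machine({"cpu0", "cpu1", "ram0", "ram1"}, {"memory_map": "example"})`, then `m.cold_boot("v1")`, then `m.hot_swap("v2", {"cpu1", "ram1"})`. The point of the model is simply that at no stage do the workloads leave the machine.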

Encapsulating the applications for transport is the easy part. VMware has lots of tech around this. App Volumes for instance. They don't even have to do fancy vMotion-like migration involving the application's memory because the application isn't going anywhere. It's staying put on that hardware, the new OS just needs to be made aware of it.

Before we go further, stop thinking about this just in terms of x86 servers for a moment. Start thinking about switches. IoT devices. Medical equipment. Air traffic control systems.

If this works, in theory telcos could update switches and routers without rebooting. Connectivity outages could be greatly diminished. In theory it would be possible to bring up a second OS, clone the live app, run an automated test suite against the new OS/cloned app without it being able to affect live systems, verify that it works, terminate the app clone and then switch the app over to the new verified system without interruption to mission or life-critical applications.
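That clone-test-switch idea can be sketched in a few lines of Python. Again, this is a thought experiment rather than anything from the patent application: `verified_switch`, the dict standing in for application state and the check functions are all hypothetical.

```python
import copy


def verified_switch(live_app_state, current_os, candidate_os, checks):
    """Return the OS the live app should run on after an attempted update.

    Every check runs against a deep copy of the app state, so whatever a
    check does to the clone cannot touch the live application.
    """
    clone = copy.deepcopy(live_app_state)
    passed = all(check(candidate_os, clone) for check in checks)
    del clone  # terminate the clone whether or not the checks passed
    return candidate_os if passed else current_os
```

If every check passes, the caller switches the live app to the candidate OS; if any check fails, the app simply stays where it is and nobody gets paged.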

Maybe all the theoretical hotness isn't to be seen in the 1.0, but I trust The Register's readers are capable of taking the tech and running with it.

Deeper meaning

We can't keep moving workloads off of servers every time the operating system, hypervisor, microvisor or whatever you want to call it that lives on the metal needs to be patched. Not with the addiction of modern developers to "Continuous Integration", and double especially not when we start talking about densities beyond today's virtualization technologies.

We can't stick with low-density servers either. Not too long from now, we'll run out of power. Our demand for internet pornography and spying on our neighbours seems to be in conflict with both the rate at which we build electrical generation capacity and physics as regards lithography shrinks.

I can gleefully run 5,000 containers on a standard two-socket server today. The hardware to do so with adequate CPU cores and RAM really isn't all that expensive. How many workloads can I run on tomorrow's hardware? Five years from now? Ten?

That's before we talk about the part where not all workloads make it through a host migration. When you're talking about moving a few thousand workloads a year, the error rates are barely noticeable as the tech is honestly pretty good.

But migration tech isn't infallible. What would a "rolling update" look like at a cloud farm in 2030 when we start talking about workload densities that could be in the tens or even hundreds of thousands per server?

And that's before we even talk about the unspoken promise of this tech: being able to run multiple bare metal operating systems on the same server in order to carve it up for different container requirements, testing, etc.

If the tech talked about in VMware's patent application works – and that's a pretty big if – and if they are granted the patent, VMware would own a very important core technology of the next decade's IT. Peak VMware may well be a ways off yet. ®

Biting the hand that feeds IT © 1998–2018