Docker bags unikernel gurus – now you can be just like Linus Torvalds
Wannabe a kernel developer? Well, soon you can be and rather easily
Analysis Linux container biz Docker has bought Unikernel Systems, a startup in Cambridge, UK, that's doing interesting things with roll-your-own operating systems.
Rather than build an application on top of an OS, with the unikernel approach, you build your own tiny operating system customized for your application.
It's quite a coup for San Francisco-based Docker, as Unikernel Systems is made up of former developers of the Xen hypervisor project – the software that's used all over the world to run virtual machines in public clouds and private systems.
If you check through Unikernel Systems' staff on LinkedIn, you'll find folks like CTO Anil Madhavapeddy, David Scott, Thomas Gazagnaire and Amir Chaudhry, who have worked on, or are closely linked to, the development of open-source Xen. The team knows a thing or two about running apps in little boxes.
Why is Unikernel's work interesting? Well, let's remember what Docker is: it's a surprisingly easy-to-use tool that lets developers and testers package applications, and the bits and pieces those apps need to run, into neat and tidy containers that are separate from other containers.
On Linux, it uses cgroups and namespaces, among other mechanisms, in the kernel to keep containers isolated. So if you want to stand up a web server with a particular configuration, build (or download) that container and start it. Likewise if you need a particular toolchain: build, or find an image of, the container that has the necessary compilers, libraries and other dependencies. You don't have to worry about the dependencies and processes in one container interfering with another's; each box is kept separate.
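To make the namespace idea a little more concrete, here is a minimal Python sketch that lists the namespaces the current process belongs to. It assumes a Linux host that exposes /proc/self/ns, where each entry is a symlink with a target such as pid:[4026531836]; the helper name list_namespaces is ours, not part of Docker. Two processes showing the same identifier for an entry share that namespace; a containerized process would normally show different ones.

```python
import os


def list_namespaces(ns_dir="/proc/self/ns"):
    """Return {namespace_name: identifier} for the entries under ns_dir.

    On Linux each entry is a symlink whose target looks like
    'pid:[4026531836]'. An empty dict is returned when ns_dir does not
    exist, for example on a non-Linux host.
    """
    if not os.path.isdir(ns_dir):
        return {}
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Not a symlink, or not readable with our privileges: skip it.
            continue
    return namespaces


if __name__ == "__main__":
    for name, identifier in list_namespaces().items():
        print(f"{name:12s} {identifier}")
```

Run it once on the host and once inside a container, and compare the identifiers to see which namespaces differ.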
Boxed in ... Three containers running on one kernel, each containing its own apps and dependencies
All the containers on a machine share the same underlying Linux kernel – or the Windows kernel if you want to run Docker on Microsoft's operating system. Docker tries to be easier to use and more efficient than building, starting, and tearing down whole virtual machines that each have their own kernel and full-fat operating system.
Each Docker container not only has its own process tree but its own file system built up in layers from a base. The container has just the software it needs to perform a particular task, and no more. Thus, these boxes are supposed to be far more lightweight than virtual machines.
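The layering can be pictured with a toy model. The sketch below is not how Docker's storage drivers work internally; it simply uses Python's collections.ChainMap to show the lookup rule, in which the topmost layer that contains a path wins. The paths and file contents are made up for illustration.

```python
from collections import ChainMap

# Three hypothetical image layers, each mapping a path to its contents.
base_layer = {"/bin/sh": "shell from base image", "/etc/motd": "welcome"}
web_layer = {"/usr/sbin/nginx": "web server binary"}
config_layer = {"/etc/motd": "customized banner"}

# ChainMap searches its maps left to right, so the topmost layer goes first.
container_fs = ChainMap(config_layer, web_layer, base_layer)


def read_file(path):
    """Return the contents of path as seen from inside the container."""
    return container_fs[path]


if __name__ == "__main__":
    for path in sorted(container_fs):
        print(path, "->", read_file(path))
```

The base layer is never modified: the customized /etc/motd simply shadows the original, which is why many containers can share one base image.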
Unikernel Systems takes that streamlining one step further.
Heard of a kernel, but what's a unikernel?
Unikernels or library operating systems have been lurking in the corridors and labs of university computer science departments for roughly twenty years. Network hardware vendors have adopted them for their firmware to provide reliable and specialized services.
How do unikernels work? Take all that we've just said about a container – its processes, dependencies, file system, the underlying monolithic host kernel and its drivers – and compress it into one single space, as if it were a single program.
Confused? Let's fix that. Take your typical Linux machine. Open a terminal and run ps aux and get a list of all the processes running on the computer. Each of those processes, a running program, has its own virtual address space, in which its own code, its variables and other data, and its threads, exist. In this space, shared libraries and files can be mapped in. Right at the top of this virtual space, in an area inaccessible to the program, is the kernel, which appears at the top of all processes like God sitting on a cloud, peering down on the mere mortals below.
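If you'd like to see a process's address space for yourself, the following Python sketch parses the memory map that Linux publishes for each process. It assumes a Linux host with /proc/self/maps available; the names Mapping, parse_maps_line and read_mappings are our own. Each line of that file describes one mapped region: its address range, its permissions, and the file or special area (such as [heap] or [stack]) behind it.

```python
import os
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the region
    end: int     # first address past the region
    perms: str   # e.g. 'r-xp' for readable, executable, private
    path: str    # backing file or special name; '' for anonymous memory


def parse_maps_line(line):
    """Parse one line in the /proc/<pid>/maps format into a Mapping."""
    # Fields: address perms offset dev inode [pathname]
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)


def read_mappings(maps_path="/proc/self/maps"):
    """Return the mappings of the current process, or [] if unavailable."""
    if not os.path.exists(maps_path):
        return []
    with open(maps_path) as maps_file:
        return [parse_maps_line(line) for line in maps_file if line.strip()]


if __name__ == "__main__":
    for m in read_mappings():
        print(f"{m.start:#018x}-{m.end:#018x} {m.perms} {m.path}")
```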
If a process wants the kernel to do anything for it, it has to make a system call, which switches the processor from executing code in the program to executing code in the privileged kernel. When the kernel has carried out the requested task, the processor switches back to the program.
Let's say a process wants to send some data over a network. It prepares the information to transmit, and makes the necessary system call. The processor switches to the kernel, which passes the data to its TCP/IP code, which funnels the data in packets to the Ethernet driver, which puts frames of the data out onto the physical wire. Eventually, the processor switches back to the program.
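Here is roughly what that looks like from the program's side, as a small Python sketch. One caveat: to stay self-contained it uses a connected pair of local sockets, so the data never reaches the kernel's TCP/IP code or an Ethernet driver. The user-to-kernel crossing is the same mechanism, though: each sendall and recv below is carried out by the kernel on the program's behalf. The function name send_over_socket is ours.

```python
import socket


def send_over_socket(payload):
    """Send payload through the kernel and read it back.

    Intended for small payloads only: nothing reads the other end until
    the send has finished, so a very large payload could block.
    """
    left, right = socket.socketpair()
    try:
        left.sendall(payload)           # system call(s): user -> kernel -> user
        left.shutdown(socket.SHUT_WR)   # signal end of data to the reader
        chunks = []
        while True:
            chunk = right.recv(4096)    # another system call per read
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(send_over_socket(b"hello, kernel"))
```

On Linux, running the script under strace shows the individual system calls the interpreter makes for these operations.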
A unikernel smashes all of this together: the kernel just ends up being another library, or a set of libraries, compiled into the application. The resulting bundle sits in the same completely accessible address space – the kernel isn't sectioned off in its own protected bubble.
When the application wants to, say, send some data over the network, it doesn't fire off a system call to the kernel to do that. No context switch occurs between the process and the kernel. Instead, the program calls a networking library function, which does all the work necessary to prepare the data for transmission and makes a hypercall to an underlying hypervisor – which wrangles the network hardware into sending the data on the wire.
The unikernel model yanks parts of the kernel a program needs from the kernel's traditional protected space into the program's virtual address space, and shoves the hardware-specific work onto the hypervisor's desk. It is the ultimate conclusion of paravirtualization, a high-performance model of virtualization.
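The difference between the two paths can be sketched as a toy Python model. Nothing here is a real unikernel or hypervisor API: the classes, the fake "TCP" and "ETH" headers and the counters are all invented for illustration. The point is only to show where the boundaries sit. In the traditional process, sending means entering and leaving the kernel; in the unikernel app, the network stack is ordinary library code and the only crossing is the hypercall.

```python
def tcp_segment(payload, port):
    """Pretend TCP layer: prefix the payload with a fake header."""
    return b"TCP:%d|" % port + payload


def ethernet_frame(segment):
    """Pretend Ethernet layer: prefix the segment with a fake header."""
    return b"ETH|" + segment


class Hypervisor:
    """Stand-in for the layer that drives the real network hardware."""

    def __init__(self):
        self.hypercalls = 0
        self.wire = []

    def hypercall_transmit(self, frame):
        self.hypercalls += 1
        self.wire.append(frame)


class MonolithicProcess:
    """A process on a traditional kernel: the stack lives behind a syscall."""

    def __init__(self):
        self.mode_switches = 0
        self.nic = []  # frames handed to the (pretend) network card

    def send(self, payload, port=80):
        self.mode_switches += 1   # system call: user mode -> kernel mode
        self.nic.append(ethernet_frame(tcp_segment(payload, port)))
        self.mode_switches += 1   # return: kernel mode -> user mode


class UnikernelApp:
    """A unikernel app: the stack is a library in the same address space."""

    def __init__(self, hypervisor):
        self.hypervisor = hypervisor
        self.mode_switches = 0    # stays at zero: no separate kernel to enter

    def send(self, payload, port=80):
        frame = ethernet_frame(tcp_segment(payload, port))  # plain calls
        self.hypervisor.hypercall_transmit(frame)


if __name__ == "__main__":
    process = MonolithicProcess()
    process.send(b"hi")
    hypervisor = Hypervisor()
    app = UnikernelApp(hypervisor)
    app.send(b"hi")
    print("process mode switches:", process.mode_switches)
    print("unikernel mode switches:", app.mode_switches,
          "hypercalls:", hypervisor.hypercalls)
```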
Old versus new ... Traditional monolithic kernel design and containers, left, and unikernel apps on a hypervisor or bare metal, right
As illustrated above, the kernel's functionality – the green blocks – is moved from a monolithic base underlying each container to a sliver of libraries built into applications running on top of a hypervisor.
This means unikernel apps do not have to switch between user and kernel mode to perform actions. Context switching is relatively expensive in terms of CPU clock cycles; unikernels do away with this overhead, and are therefore expected to be lean and mean.
The apps are also extremely lightweight as only the kernel functionality needed by the software is compiled in, and everything else is discarded. Thus the apps can be started and stopped extremely quickly – servers can be up and running as soon as requests come in. This model may even reduce the number of security vulnerabilities present in the software: less code means fewer bugs.
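The "compile in only what's needed" idea boils down to walking a dependency graph from the application outwards. The Python sketch below illustrates that, assuming a made-up library OS whose module names and dependencies are purely hypothetical; real unikernel toolchains do this selection at build and link time rather than at runtime.

```python
def required_modules(entry_points, depends_on):
    """Return every module reachable from entry_points via depends_on."""
    needed = set()
    stack = list(entry_points)
    while stack:
        module = stack.pop()
        if module in needed:
            continue
        needed.add(module)
        stack.extend(depends_on.get(module, ()))
    return needed


# Hypothetical library OS: module name -> modules it depends on.
LIBRARY_OS = {
    "http": ["tcp"],
    "tcp": ["ip"],
    "ip": ["ethernet"],
    "ethernet": [],
    "filesystem": ["block_storage"],
    "block_storage": [],
    "usb": [],
    "sound": [],
}

if __name__ == "__main__":
    kept = required_modules(["http"], LIBRARY_OS)
    print("compiled in:", sorted(kept))
    print("discarded:  ", sorted(set(LIBRARY_OS) - kept))
```

A web server that only speaks HTTP pulls in the network stack and nothing else; the file system, USB and sound code never make it into the image.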
If you want to run a traditional POSIX application, you can, using what's called a rump kernel. This provides just enough of a normal operating system to run a Unix-flavored application unmodified. One example is rumprun, which can run loads of packages from PHP to Nginx in a unikernel setting. Antti Kantee has been doing a lot of work on rump kernels, as has Justin Cormack, an engineer at Unikernel Systems.
One thing to note is that a hypervisor isn't always required: a rump kernel – think of it as a kernel-as-a-service – can provide drivers for hardware, allowing the unikernel app to run right on the bare metal.
Alternatively, driver libraries could be built into the apps so they can talk to the hardware directly, which is especially useful for embedded engineering projects, aka Internet-of-Things gadgets. Unikernel Systems' software can run on bare-metal ARM-compatible processors, and on systems without memory management units – the sorts of hardware you'll find in tiny IoT gear.
One tricky aspect to all of this, depending on your point of view, is the requirement to trust the unikernel applications. Rather than have an underlying kernel to keep individual processes, containers, and virtual machines in line, the unikernel apps have to share the machine with no one in overall control. There are ways to use the processor's built-in mechanisms – such as page tables and the memory management unit – to keep them isolated. Building a model to keep them in check will be something Unikernel Systems and Docker will be working on.
On the other hand, if you want to run untrusted code, you'd most likely want to do that in a virtual machine with complete software isolation; unikernel apps are supposed to be trusted services set up and run by administrators.
Speaking of which, this is another tricky aspect: managing the things. Unikernel apps are like microservices on steroids. Deploying them, getting them to work together, and so on, in an elegant and scalable manner is really tricky. And that's where the Docker acquisition comes in.