One container to rule them all? No. Um, a plastic box* refresher

Real talk about this generation's coolest tech toy


Analysis: Containers are the cool toy, and that means two things: new technology and hype.

At heart, containers are simple: group the minimum set of files needed to run a particular program into a single directory tree, then run it with some kind of isolation mechanism, so that as far as that process is concerned, it's the only thing on the computer. No virtual hardware, so almost no overhead and lightning-fast startup and shutdown. The only additional element is a bit of metadata describing how the container talks to the rest of the system.
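As a rough illustration of those three ingredients (a directory tree, an isolation mechanism and a bit of metadata), the Python sketch below describes a container as a small data structure and builds, without running it, a command line that combines the util-linux unshare tool with chroot. The ContainerSpec class, its field names and the example path are hypothetical, invented here for illustration. Actually executing the resulting command would need root privileges and a populated root directory.

```python
from dataclasses import dataclass


@dataclass
class ContainerSpec:
    """Hypothetical description of a container: a directory tree plus metadata."""
    rootfs: str                   # directory holding the files the program needs
    command: list[str]            # program to run inside that tree
    share_network: bool = False   # metadata: may it use the host's network?


def isolation_argv(spec: ContainerSpec) -> list[str]:
    """Build (but do not run) an unshare + chroot command line for the spec."""
    argv = [
        "unshare",
        "--fork",    # run the program as a child, which becomes PID 1 in the new PID namespace
        "--pid",     # new PID namespace
        "--mount",   # new mount namespace
        "--uts",     # new hostname and domain name namespace
        "--ipc",     # new IPC namespace
    ]
    if not spec.share_network:
        argv.append("--net")      # new, empty network namespace
    # chroot makes the directory tree look like "/" to the program.
    return argv + ["chroot", spec.rootfs] + spec.command


if __name__ == "__main__":
    spec = ContainerSpec(rootfs="/srv/demo-rootfs", command=["/bin/sh"])
    print(" ".join(isolation_argv(spec)))
```

Real container runtimes do considerably more than this (user mappings, resource limits, mounting /proc and so on), so treat it as a picture of the idea rather than a working runtime.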

What's in the box?

Although there's a lot of overlap, there are two main types of container. The principal distinction is what program is executed when the container starts.

On Unix, the first program that runs after the kernel boots is init. It starts (and stops) everything else: it's the parent of all other processes, and controls the whole system. The much-reviled systemd is the most widely-used init these days.

If the first thing that runs inside a container is an init daemon, then in effect, each container is an independent Unix instance. Normally, the part of an OS outside of the kernel is called the "userland". Having a single kernel run several userlands was once called virtual environments. They're the original kind of container, and these days they're known as OS containers.

All of the containers on a machine share a single kernel, so they run in a single memory space and their root filesystems are just folders in the host's filesystem, but in some ways, they act like full VMs. Each can host multiple separate apps. They can have different init systems to the host, or even the userland of a different distro, so long as they can run on the same kernel. In principle, a single kernel could host Ubuntu, Fedora, CentOS and OpenSUSE containers. You can ssh into them and treat them like independent installations. They have their own config and probably local data, so they're often persistent.

Just one thing

The alternative is that rather than a whole OS instance, you isolate a single program: an app container. These don't have an init daemon: when you start the container, you're starting the app, and when the app terminates, the container does too.
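The lifecycle rule is simple enough to show in a few lines. The sketch below is only an analogy, with no isolation at all: a hypothetical run_app_container function starts one program and returns the moment that program exits, which is the "container ends when the app ends" behaviour described above. The stand-in "app" is a one-line Python program.

```python
import subprocess
import sys


def run_app_container(app_argv: list[str]) -> int:
    """Run one program; the 'container' lives exactly as long as the app does."""
    proc = subprocess.Popen(app_argv)   # starting the container is starting the app
    return proc.wait()                  # the app terminating ends the container


if __name__ == "__main__":
    # Stand-in app: a tiny Python program that exits with status 3.
    status = run_app_container([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("container finished with status", status)
```

An OS container, by contrast, would start an init daemon here, and that daemon would keep running until told to shut down.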

App containers go hand-in-hand with microservices. For instance, you could split a classic LAMP stack into two containers: one running the Apache webserver and PHP, and another with MySQL, both communicating over an internal network.

In theory, they're intended to be transient. When you need more capacity, your management tool spawns more containers, and when the load drops, it stops them again. Excess ones are garbage-collected later.
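The scaling logic a management tool applies can be sketched roughly as follows. This is a toy under stated assumptions, not how any particular orchestrator works: the function names, the requests-per-second load measure and the minimum and maximum bounds are all invented for illustration.

```python
import math


def desired_containers(requests_per_sec: float, capacity_per_container: float,
                       minimum: int = 1, maximum: int = 20) -> int:
    """How many containers the current load calls for, within fixed bounds."""
    needed = math.ceil(requests_per_sec / capacity_per_container)
    return max(minimum, min(maximum, needed))


def reconcile(running: list[str], target: int) -> tuple[int, list[str]]:
    """Return how many containers to start and which running ones to stop."""
    if len(running) < target:
        return target - len(running), []
    return 0, running[target:]          # surplus containers are stopped


if __name__ == "__main__":
    target = desired_containers(requests_per_sec=450, capacity_per_container=100)
    print(target, reconcile(["web-1", "web-2"], target))
```

Because each container is disposable, the tool never has to care which ones it stops, only how many are left.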

App containers take over some of the duties of package managers. As each container is a complete ready-to-run bundle, you don't install and configure particular versions of various packages on the host. This frees admins from a significant amount of work and greatly reduces the complexity of the host OS. This is pushing radical changes in Linux distro design, which we will look at in the next article.

App containers have been popularised by the runaway success of Docker, but as ever, such success inspires both imitations and improvements, which we'll discuss later.

Old champions and the new contender

Both of the original Linux OS container implementations are around 15 years old, and both have been overtaken by the incorporation of similar functionality into the main kernel tree.

The original "virtual environments" tool was Parallels' Virtuozzo, released for Linux in 2002. Its core was open-sourced as OpenVZ in 2005 and new releases are emitted each year. It offers richer functionality – better isolation, snapshots, live migration – than the kernel's built-in facilities, but the snag is that the way OpenVZ implements this requires modifications to the kernel. The patches have not been accepted into the mainline kernel in their entirety, so the full functionality of OpenVZ requires a custom kernel rather than your distro's standard one.

The other such environment was Linux VServer, aimed at ISPs and allowing a single physical server to appear to be multiple "virtual private servers", by splitting the userland into multiple "security contexts". There have been no new releases since 2007.

Today, though, the most common implementation is Linux Containers, or LXC for short: a set of userspace tools built on the container features included within the Linux kernel itself. It uses two in-kernel features, namespaces and control groups (or cgroups), to isolate processes.

Processes communicate using various identifiers – PIDs, UIDs, GIDs, host- and domain-names, plus IPC. Namespace isolation gives each container a unique set of these, effectively walling them off. The Google-contributed cgroups feature provides resource limitation and accounting, prioritisation and control (that is, stopping, snapshotting and restarting).
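On a Linux box you can see which namespaces a process belongs to by reading the symbolic links under /proc/<pid>/ns. The sketch below, whose namespace_ids function name is our own invention, collects them into a dictionary. Two processes in the same container show the same identifiers; processes in different containers generally do not. On systems without that directory it simply returns an empty result.

```python
import os


def namespace_ids(pid: str = "self") -> dict[str, str]:
    """Map each namespace name (pid, net, uts...) to the identifier the kernel reports."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        for name in sorted(os.listdir(ns_dir)):
            # Each entry is a symlink whose target names the namespace type and its ID.
            result[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        # Not Linux, no such process, or no permission to inspect it.
        return {}
    return result


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(f"{name:20} {ident}")
```

Running it once on the host and once inside a container is a quick way to confirm which identifiers have actually been walled off.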

LXC 1.0 was released in 2014 and cgroups 2.0 in 2016. Since these releases, processes inherit limits hierarchically and containers can be both privileged (running as root) and unprivileged. In combination with tools such as the btrfs filesystem, LXC allows advanced functionality like copy-on-write containers.
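Hierarchical inheritance is easier to picture with a toy model. In the sketch below a Group class (a stand-in invented for this article, not the kernel's data structure or interface) holds an optional memory limit, and a group's effective limit is the tightest one set on itself or on any ancestor, so a child can never get more than its parent allows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    """Toy stand-in for a control group in a hierarchy."""
    name: str
    memory_max: Optional[int] = None     # bytes; None means no limit set at this level
    parent: Optional["Group"] = None

    def effective_memory_max(self) -> Optional[int]:
        """Tightest limit found on this group or any of its ancestors."""
        limits = []
        node: Optional[Group] = self
        while node is not None:
            if node.memory_max is not None:
                limits.append(node.memory_max)
            node = node.parent
        return min(limits) if limits else None


if __name__ == "__main__":
    MiB = 2 ** 20
    root = Group("root")
    tenant = Group("tenant", memory_max=512 * MiB, parent=root)
    web = Group("web", memory_max=1024 * MiB, parent=tenant)   # asks for more than its parent has
    print(web.effective_memory_max() // MiB, "MiB")
```

Here the web group asks for 1,024MiB but is held to its parent's 512MiB.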

The newest Linux container system is Ubuntu's LXD ("lex-dee"), an extension to LXC. It aims to create and manage whole clusters of OS containers, which function much like traditional VMs and are managed via a devops-friendly REST interface.

LXD builds containers from predefined images of various distros, which support their own network interfaces, storage virtualisation, even passthrough of dedicated PCI hardware in the host machine. It offers OpenVZ-like functionality such as live migration.

The new kid in town

All of these tools are mainly aimed at OS containers, but most of the noise is about app containers, as popularised by Docker. It's the original app container tool, although no longer the only one.

Docker is based around read-only templates called images, which can be layered – for instance, a specific app on a particular OS skeleton. These are distributed via repositories, complete with version control – Docker's central one, or your own.
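Layering can be modelled in a few lines. In this toy sketch each layer is a read-only dictionary mapping paths to contents, and the container's view of the filesystem is the layers merged from the bottom up, with later layers overriding or deleting what earlier ones provide. The flatten function and WHITEOUT marker are names chosen for this example; real implementations use union or copy-on-write filesystems rather than dictionaries.

```python
WHITEOUT = object()   # marker meaning "this layer deletes the path"


def flatten(layers: list[dict]) -> dict:
    """Merge layers bottom-up without modifying any of them."""
    view = {}
    for layer in layers:
        for path, content in layer.items():
            if content is WHITEOUT:
                view.pop(path, None)     # hidden by a higher layer
            else:
                view[path] = content     # added, or overrides a lower layer
    return view


if __name__ == "__main__":
    os_skeleton = {"/bin/sh": "shell", "/etc/motd": "welcome"}
    app = {"/app/server": "v1"}
    update = {"/app/server": "v2", "/etc/motd": WHITEOUT}
    print(flatten([os_skeleton, app, update]))
```

Because the lower layers are never altered, many images can share the same OS skeleton and differ only in the thin layers on top.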

It also offers tools for building and customising images, including automatically creating an image based on a program's requirements and dependencies. Deployment takes an image, either local or from a repo, creates a new container and populates it. Since each bundle is a single complete "fat app", in theory, they can be deployed onto different host distros.

Polluting the purity of the app container idea, you can even run multiple apps inside one container. You can just have a startup script that kicks off sub-apps, but then you don't get any logging or monitoring, and when the main app quits, the others merely halt rather than shutting down gracefully. To get these without the complexity of a full init system you can run something like Supervisor.
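To show what a bare startup script lacks, here is a minimal sketch of the graceful-shutdown part of a process supervisor. It is not how Supervisor itself is implemented, and the start_all and stop_all names are invented for this example: it starts several children, then on shutdown asks each one politely to terminate, waits for a grace period, and kills any stragglers.

```python
import subprocess
import sys


def start_all(commands: list[list[str]]) -> list[subprocess.Popen]:
    """Start every sub-app and keep a handle on each one."""
    return [subprocess.Popen(cmd) for cmd in commands]


def stop_all(procs: list[subprocess.Popen], grace_seconds: float = 5.0) -> list[int]:
    """Ask each child to terminate, wait, and kill any that ignore the request."""
    for proc in procs:
        if proc.poll() is None:
            proc.terminate()              # polite request (SIGTERM on Unix)
    codes = []
    for proc in procs:
        try:
            codes.append(proc.wait(timeout=grace_seconds))
        except subprocess.TimeoutExpired:
            proc.kill()                   # forceful stop after the grace period
            codes.append(proc.wait())
    return codes


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    children = start_all([sleeper, sleeper])
    print(stop_all(children))
```

A real supervisor would also capture each child's output and restart it if it crashed, which is the logging and monitoring a plain script doesn't give you.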

Later versions of Docker abstract container management through the libcontainer library, which can use the kernel features directly, via LXC, or via the ubiquitous systemd's systemd-nspawn. It can also use the libvirt abstraction layer, which can in turn talk to OpenVZ.

Up-and-coming rookies

Docker isn't alone. There are emerging standards for container and template file formats, such as the App Container Specification and Open Container Initiative.

It's possible to use other tools to run Docker containers – both LXC and systemd-nspawn can do this. Meanwhile, Bocker re-implements much of Docker's functionality as relatively tiny Bash scripts which call standard Linux tools.

Docker itself is distro-neutral, but specialised distros are starting to appear that are dedicated to running containers. One of these, CoreOS, supports both Docker containers and its own format, Rocket.

The container is in theory a simple construct. It is a concept whose time has come - thanks to cloud and to Docker.

With this breakout have come the hype and the inevitable attempts to turn containers into something relatively risk-free and dependable for the follow-on users: those outside the developer pioneer corps.

Also, we are beginning to see proliferation around specific platforms, with the inevitable optimisations and enhancements.

Where does this leave us? With a lot of choice, and a need for careful scrutiny. ®

* Following a letter from lawyers acting on behalf of the Tupperware Brands Corporation, of Orlando, USA, we are happy to amend our original headline, which appeared to refer to the products of Tupperware Brands Corporation, of Orlando, USA. We'd like to clarify that said brand has at no time had any connection with the particular subset of virtualisation software carrying out the function known as "containerisation".

There is no suggestion that the Tupperware Brands Corporation, of Orlando, USA, has ever wilfully engaged in or encouraged wanton acts of so-called "containerisation" other than as properly carried out with small plastic boxes borne by office drones to and from their places of work.


Biting the hand that feeds IT © 1998–2017