Do containers stack up as data storage building blocks?
Sounds odd but it could work - disk controller heads are stateless
Storage Architect There’s an almost religious divide between those who see containers as entirely stateless objects and those who take the more pragmatic view that state in containers is inevitable.
In the stateless model, data is assumed to be replicated and protected by many container instances, so losing any individual container doesn’t mean losing data. In practical terms, this idea just doesn’t work, because in the enterprise we have to meet a set of standards around application availability, auditing and compliance.
Assuming we want to containerise our databases rather than leave them running as virtual machine instances, and we surely will, persistent data is as inevitable as death and taxes. However, what about a more contrary approach? How about building storage systems from containers?
The persistence of data is due to the media we store it on, not the system through which we access it. As an example, many vendors provide the ability to perform head upgrades on their dual controller-based systems. This is possible because persistent data and configuration information are stored on the media (HDDs and SSDs), and in many cases the media is self-describing. This means that after a software crash, metadata and configuration information can, in theory, be recovered by parsing the data on disk.
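To make "self-describing" concrete, here is a minimal Python sketch. It is not any vendor's on-disk format: the `SDR1` marker, the header layout and the function names are all assumptions made up for illustration. Each record carries its own name, lengths and checksum, so an index can be rebuilt from the raw bytes alone.

```python
import json
import struct
import zlib

MAGIC = b"SDR1"                    # hypothetical record marker, not a real format
HEADER = struct.Struct(">4sIII")   # magic, metadata length, data length, CRC32


def pack_record(name: str, data: bytes) -> bytes:
    """Serialise one record that carries its own description."""
    meta = json.dumps({"name": name}).encode("utf-8")
    crc = zlib.crc32(meta + data)
    return HEADER.pack(MAGIC, len(meta), len(data), crc) + meta + data


def rebuild_index(media: bytes) -> dict:
    """Rebuild {name: (data_offset, data_length)} by parsing the media alone."""
    index = {}
    pos = 0
    while pos + HEADER.size <= len(media):
        magic, meta_len, data_len, crc = HEADER.unpack_from(media, pos)
        if magic != MAGIC:
            break  # no further valid records
        meta_start = pos + HEADER.size
        data_start = meta_start + meta_len
        end = data_start + data_len
        if end > len(media):
            break  # truncated record
        if zlib.crc32(media[meta_start:end]) != crc:
            break  # stop at the first torn or corrupt record
        meta = json.loads(media[meta_start:data_start])
        index[meta["name"]] = (data_start, data_len)
        pos = end
    return index
```

A replacement controller (or container) that starts with no memory of what was written can call `rebuild_index` on the raw media and pick up where its predecessor left off. Real systems add redundancy and versioning on top of this, which the sketch leaves out.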
Taking this idea to its logical conclusion, we can use stateless processes like containers to create storage systems, if we ensure that state is stored on the physical media (and protected across that media). If the container running our storage platform crashes, then we simply respawn it and read configuration data back from disk.
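As a rough sketch of that respawn-and-reload pattern, the Python below models a storage process that keeps nothing in memory it cannot re-read from disk. The `StorageService` class, the JSON state file and the `create_volume` method are hypothetical names chosen for illustration, not part of any product mentioned here.

```python
import json
import os
import tempfile


class StorageService:
    """Hypothetical service: all durable state lives in the file at state_path."""

    def __init__(self, state_path: str) -> None:
        self.state_path = state_path
        self.config = self._load()  # a respawned instance starts from disk

    def _load(self) -> dict:
        try:
            with open(self.state_path, "r", encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            return {"volumes": {}}  # first start: nothing persisted yet

    def _persist(self) -> None:
        # Write to a temporary file, then rename over the old one, so a
        # crash mid-write leaves the previous state file intact.
        directory = os.path.dirname(os.path.abspath(self.state_path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(self.config, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, self.state_path)

    def create_volume(self, name: str, size_gb: int) -> None:
        self.config["volumes"][name] = {"size_gb": size_gb}
        self._persist()  # the change is on disk before the call returns
```

If the process dies after `create_volume` returns, constructing a new `StorageService` with the same `state_path` yields the same configuration. In a container deployment, the path would need to sit on a volume that outlives the container.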
Building storage with containers
We are starting to see containers edging into the deployment of storage solutions. There are a number of reasons this is a good thing. First, if we’re already running containers, then accessing storage through one of those containers provides a lightweight way to get to our data. Docker already implemented something like this with its data volume containers (see Docker’s documentation on storage options). Second, containerising storage means we can build storage features as separate microservices, making management, upgrading and patching much easier.
Other vendors are starting to bring containerised storage products to market. Scality, an object storage vendor, recently released its S3 Server, a cut-down containerised version of the Scality RING platform written in Node.js. It runs as a single container image and so has limited support and availability, but it provides a way to test S3 compatibility with Scality RING. The offering could conceivably be extended in the future to add more functionality.
StorageOS, a UK startup, has built a storage platform that runs in containers, for containers. The container footprint is (at present) a mere 40MB, which is an amazing achievement, although I can see this increasing as more functionality is added. Dell EMC’s VNX platform uses containers to implement VDMs (Virtual Data Movers). Portworx also has a storage solution that is built from containers. Currently this is available as a Developer edition (PX-Developer) that can be downloaded from GitHub, or as an Enterprise edition (PX-Enterprise).
The Architect’s view
The lines between storage and application are being blurred by the idea of using containers for data persistence. HCI (hyper-converged infrastructure) set the scene by making it possible to run storage and application services on the same hardware; storage with containers takes this to another level.
As with all storage solutions, one product doesn’t fit all requirements, so the idea of storage containers will (initially at least) have limited application. However, expect to see more solutions come to market as software-defined storage starts to find a true niche. Please let me know if you have any other examples of containers being used to deliver storage and I’ll add them to this post.