Contain yourselves: Coho Data looks Docker up and down

Dedupe directly on the array, promises CTO


Storage startup Coho Data thinks Docker containers could let its arrays run storage MicroServices, such as snapshotting, transcoding and deduplication, directly on the array.

It is rethinking storage services and how they are carried out on its MicroArrays and their flash/disk or all-flash object store.

These arrays use integrated servers, storage and networking and present themselves as a scalable NFS target.

Coho wants these arrays to do storage-related compute, and thinks running MicroServices inside Docker containers is the most efficient and flexible way to do that.

Containers are skinny app holders compared with virtual machines, which must have a guest OS alongside the app or apps. Docker puts the OS logic in a shared resource layer so that all the containerised apps can use it without having their own personal copies, thus making more efficient use of the host server's DRAM and compute cycles.

MicroArrays are dedicated storage systems, and Coho Data's CTO and co-founder Andrew Warfield thinks they are ideally placed to do storage-related compute, rather than leaving it to the host servers they are networked to as an NFS filer.

He is thinking of processes such as the transcoding of video streams. Suppose a 4K format video file is written to the MicroArray and needs to be converted to 1080p output. The networked host servers could run the transcoding logic: the video file is read from the MicroArray, transcoded from 4K to 1080p and then written back to the array, making three network transfers and using a slug of host CPU cycles.

If the MicroArray could do the transcoding itself, that would mean two network hops and would leave the host server CPU cycles free for something else. How could that be done?

Warfield says that programmable data MicroServices are the way to do it. They could run snapshotting, thin provisioning, compression, deduplication, data indexing, auditing and search. Each MicroService runs in a container and has a REST API for communication with other MicroServices and the container orchestration logic.

He writes: "When you go to a web page, such as the Amazon front page, you may be staring at content that is generated by, and backed, by tens to hundreds of MicroServices: not just shopping carts, but search tools, referral engines, ad placement, and so on, each maintained by a different team and scaling independently in response to demand."

It's great for developers because they can concentrate on the MicroService app logic: "In a well-integrated MicroService-based development environment, developers should not be thinking about containers or scale: containers should be a central, but largely invisible tool in developing and deploying application logic. Instead, development teams should spend their time thinking about the application logic that their users care about ... containers are just a lubricant."

A diagram shows the concept in more detail:

Coho Data MicroServices concept

Warfield says there is a set of control logic, currently written by the user in JavaScript, which allows MicroServices to be activated in response to activity that happens in the storage system.

So MicroServices can be triggered by events, such as the arrival of an incoming file, and run automatically under policy control.
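The event-driven idea can be sketched in a few lines of Python. This is only an illustration, not Coho's control logic (which Warfield says is written in JavaScript): the `StorageEvent` fields, the `PolicyDispatcher` class, the event name `file_written` and the `/video/incoming/` path are all hypothetical names invented for the example.

```python
import fnmatch
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class StorageEvent:
    """Hypothetical event record; the field names are assumptions."""
    kind: str  # e.g. "file_written"
    path: str  # path of the file the event refers to


Handler = Callable[[StorageEvent], str]


class PolicyDispatcher:
    """Maps (event kind, path pattern) policies to MicroService handlers."""

    def __init__(self) -> None:
        self._policies: List[Tuple[str, str, Handler]] = []

    def on(self, kind: str, pattern: str, handler: Handler) -> None:
        """Register a handler for events of `kind` whose path matches `pattern`."""
        self._policies.append((kind, pattern, handler))

    def dispatch(self, event: StorageEvent) -> List[str]:
        """Run every handler whose policy matches and collect the results."""
        results = []
        for kind, pattern, handler in self._policies:
            if kind == event.kind and fnmatch.fnmatchcase(event.path, pattern):
                results.append(handler(event))
        return results


def transcode(event: StorageEvent) -> str:
    # Placeholder only: a real service would start a transcoder container here.
    return f"transcode {event.path} to 1080p"


dispatcher = PolicyDispatcher()
dispatcher.on("file_written", "/video/incoming/*.mov", transcode)
```

Calling `dispatcher.dispatch(StorageEvent("file_written", "/video/incoming/clip.mov"))` runs the transcode handler, while an event for a path outside the policy matches nothing and returns an empty list.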

A log monitoring MicroService can be activated whenever new logs are written to a log directory, scan the new file for specific errors, and then append only those errors to a log of high-priority events that is stored in another directory.
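As a rough sketch of what such a log monitoring service might do once it has been activated, the Python function below filters one new log file into a separate high-priority log. The function name `extract_errors`, the `ERROR` and `FATAL` markers and the output line format are assumptions made for the example, not details from Coho.

```python
from pathlib import Path


def extract_errors(new_log: Path, priority_log: Path,
                   markers=("ERROR", "FATAL")) -> int:
    """Append lines of new_log that contain any marker to priority_log.

    Returns the number of lines appended.
    """
    text = new_log.read_text(encoding="utf-8", errors="replace")
    matched = [
        line for line in text.splitlines()
        if any(marker in line for marker in markers)
    ]
    if matched:
        # The high-priority log lives in another directory; create it if needed.
        priority_log.parent.mkdir(parents=True, exist_ok=True)
        with priority_log.open("a", encoding="utf-8") as out:
            for line in matched:
                # Prefix each entry with the source file name for traceability.
                out.write(f"{new_log.name}: {line}\n")
    return len(matched)
```

Because the output file is opened in append mode, each activation adds to the existing high-priority log rather than overwriting it.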

Conversely, the MicroService can actually offer up a network-facing interface: that same log analyser can actually serve up a web dashboard in which the high-priority log entries are summarised.
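A network-facing service of this kind could be as small as the following Python sketch, which summarises high-priority entries as an HTML table using only the standard library. The sample `PRIORITY_ENTRIES` data, the function names and port 8080 are placeholders chosen for illustration; nothing here reflects Coho's actual dashboard.

```python
import html
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder data; a real service would read the high-priority log instead.
PRIORITY_ENTRIES = [
    "app.log: ERROR disk full",
    "app.log: ERROR disk full",
    "db.log: FATAL crash",
]


def render_dashboard(entries) -> str:
    """Return an HTML page listing each distinct entry with its count."""
    counts = Counter(entries)
    rows = "".join(
        f"<tr><td>{html.escape(entry)}</td><td>{n}</td></tr>"
        for entry, n in counts.most_common()  # most frequent entries first
    )
    return (
        "<html><body><h1>High-priority log entries</h1>"
        f"<table>{rows}</table></body></html>"
    )


class DashboardHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        body = render_dashboard(PRIORITY_ENTRIES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Serve the dashboard on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), DashboardHandler).serve_forever()
```

Calling `serve()` and opening http://127.0.0.1:8080/ in a browser would show the summary table; entries are HTML-escaped so that log content cannot inject markup into the page.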

Warfield claims the "[control logic] implementation borrows heavily from the AWS Lambda APIs, where similar JavaScript activations are being tied to events in storage components such as S3."

He told us that the "control logic can optionally embed more complex logic by including one or more containers, that are to be called into in response to [a] MicroService being activated. Containers are a really great match for this style of execution because in addition to being designed precisely for this sort of application packaging, they start faster than VMs, make way more efficient use of shared resources, but can still embed a huge array of existing applications that you might want to bring closer to your data."

A forthcoming post on the Coho blog will discuss this MicroService idea in more detail.

Coho's Docker-containerised MicroService transcoding will be demonstrated at the 2015 National Association of Broadcasters Show, April 13-16 in Las Vegas, at booth #SL15717. ®

