Dashboard pushers: Dark here in containerised server land, innit sysadmins?

We show raw event log blizzards as usable correlated patterns, they croon

Boxster 2001 dashboard dials

Analysis Where the heck am I going? Being a sysadmin in a containerised environment can be like driving a car in fog with no lights and no instruments.

Coralogix is positioning itself as the sysadmin's fog light and dashboard instruments. Its machine learning software analyses logged events in a containerised run-time environment, detects and correlates patterns and so helps a system administrator diagnose and correct faults in a system. It brings clarity and detail to where before there was confusion.

This is all very fancy. What does it actually mean?

The car dashboard instruments analogy

Let's liken a running set of containerised applications in servers to a car. Under the hood in both a car and a containerised server system there is a mass of constantly evolving operations. Different containers are instantiated from a registry, execute under the supervising container OS like the Docker engine, call other micro-services, send out IO requests, and get torn down, with the constantly changing set of containers representing an application in action.

In a car, a set of components operate. Fuel and air are delivered to the cylinders, pistons fly up and down, a crankshaft turns, sparkplugs fire, gears are changed, wheels are rotated, exhaust is taken away, brakes are operated and so forth. The car has a driver, and the driver operates the car helped by a dashboard display indicating the engine's rpm, the car's speed, the oil pressure, the selected gear, and other displayed states and events.

Many of the instruments calculate their output from raw inputs, such as the speedometer converting drive shaft rpm to the car's speed.

If something goes wrong, the dashboard helps with error detection and diagnosis.

A containerised application set generally has no such dashboard for an administrator, nor does it have tailored raw data feeds to drive such a dashboard if it did exist, unless they have specifically been designed in.

That means, when there is an error or fault, the admin person can be operating blind. There are event logs, but these have generally been created by a container's developer to indicate progress and state as a coding/debugging aid. Unless you know the circumstances, the context, in which a logged event took place, it is quite meaningless, just like knowing that valve three in cylinder six of an engine opened at a particular time. In isolation and context-free, knowledge of that event tells you nothing.

How can context be added to the operations of containers in a containerised app?

Step forward Coralogix.

Coralogix co-founder and CEO Guy Kroupp

Coralogix

Coralogix is an Israeli startup building software to help you make sense of millions of log file records. In essence, it records the logs and then distils them into categories.

Coralogix event categorisation

Logged events are recognised as being examples of archetypes, such as "User X ordered Y from category Z", and are so identified and given a letter - "A" in this case.

This means that an order of magnitude or more reduction in the event search population has taken place. Coralogix calls this log aggregation, or log-gregation for short, as billions of raw events become a much smaller number of categories.
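To make the categorisation idea concrete, here is a minimal sketch, not Coralogix's actual method, which the article does not describe. It assumes variable fields are simply numbers or quoted strings; the masking regex, the function names `to_template` and `categorise`, and the sample log lines are all hypothetical.

```python
import re
from string import ascii_uppercase

# Assumption: anything numeric or in double quotes is a variable field.
_VARIABLE = re.compile(r'"[^"]*"|\b\d+\b')

def to_template(line):
    """Mask variable fields so lines of the same archetype compare equal."""
    return _VARIABLE.sub("<*>", line)

def categorise(lines):
    """Give each distinct archetype a letter, in order of first appearance.

    Sketch only: supports at most 26 archetypes (A to Z).
    """
    letters = {}    # archetype text -> letter
    labelled = []   # one letter per input line
    for line in lines:
        template = to_template(line)
        if template not in letters:
            letters[template] = ascii_uppercase[len(letters)]
        labelled.append(letters[template])
    return letters, labelled

logs = [
    'User 17 ordered "kettle" from category 4',
    'Payment 9001 authorised for user 17',
    'User 23 ordered "toaster" from category 2',
]
letters, labelled = categorise(logs)
print(labelled)  # ['A', 'B', 'A']
```

Three raw lines collapse into two categories here; the same shrinkage at scale is what the order-of-magnitude claim refers to.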

You can see raw log arrival in real time from servers, the incoming blizzard as it were, and then switch to a log-gregation view and only see template-level activity.

The image above shows five categories that have been identified. The SW can then track these alphabetic identifiers as they occur and group ones that follow each other repeatedly.

In the image above we see "ABC" has been found to be an event triad and it represents a payment transaction. We might call this a template. So now the SW can track template occurrence in the log stream and reduce the item count even more.
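Spotting a recurring run such as "ABC" can be sketched as counting fixed-length windows over the stream of category letters. This is an illustrative assumption about how such grouping might work, not a description of the vendor's algorithm; `frequent_sequences`, its parameters and the sample stream are hypothetical.

```python
from collections import Counter

def frequent_sequences(labels, length=3, min_count=2):
    """Count every run of `length` consecutive letters; keep the repeated ones."""
    windows = (
        tuple(labels[i:i + length])
        for i in range(len(labels) - length + 1)
    )
    counts = Counter(windows)
    return {"".join(seq): n for seq, n in counts.items() if n >= min_count}

# Hypothetical stream of category letters in arrival order.
stream = list("ABCDABCEABC")
found = frequent_sequences(stream)
print(found)  # {'ABC': 3}
```

A real system would need to cope with interleaved events from many containers, which a plain sliding window does not handle.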

Next the incoming event stream is analysed to see when and where these templates occur. It's rather like looking for DNA sequences in a gene.

Coralogix template tracking

Once this has been done, the event stream monitoring can detect the frequency and timing of such templates and build a profile for an application of their normal frequency and timing pattern.

Then, and this is the kicker, if the application behaves differently, Coralogix claims its software can detect it and alert sysadmins:

Template occurrence divergence from the norm
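One simple way to flag divergence from a learned frequency profile is a z-score test against historical counts per time window. The article does not say which statistical method Coralogix uses, so treat this as a stand-in technique; `is_divergent`, the threshold of 3.0 and the baseline figures are assumptions.

```python
from statistics import mean, stdev

def is_divergent(history, current, threshold=3.0):
    """Return True if `current` lies more than `threshold` standard
    deviations from the mean of `history` (needs at least two points)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all counts as divergence.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical counts of the "ABC" template per minute during normal running.
baseline = [98, 102, 100, 101, 99, 100]

print(is_divergent(baseline, 101))  # False
print(is_divergent(baseline, 40))   # True
```

A production profile would also have to account for time-of-day and day-of-week patterns, which a single mean and deviation ignore.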

Because the software "knows" which raw events generated a template, it can then be used to drill down into specific event types from specific containers and so aid fault diagnosis, with suspected error suggestions, claim its makers:

Coralogix suggested errors display
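The drill-down depends on keeping a link from each category letter back to the raw events and containers that produced it. A minimal sketch of such an index follows; the tuple layout, the container names and the functions `build_index` and `drill_down` are hypothetical and are not taken from the product.

```python
from collections import defaultdict

def build_index(events):
    """Map each category letter to the (container, raw_line) pairs behind it."""
    index = defaultdict(list)
    for container, letter, raw_line in events:
        index[letter].append((container, raw_line))
    return index

def drill_down(index, template):
    """Return the raw lines behind a template such as 'AB', grouped by container."""
    by_container = defaultdict(list)
    for letter in template:
        for container, raw_line in index.get(letter, []):
            by_container[container].append(raw_line)
    return dict(by_container)

# Hypothetical labelled events: (container, category letter, raw log line).
events = [
    ("web-1", "A", "User 17 ordered kettle from category 4"),
    ("pay-1", "B", "Payment 9001 authorised for user 17"),
    ("web-2", "D", "Cache miss for key 55"),
]
index = build_index(events)
suspects = drill_down(index, "AB")
print(sorted(suspects))  # ['pay-1', 'web-1']
```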

Coralogix presented this capability of its software to a bunch of journos on an Israeli press tour. We got to see a demo and, of course, everything generally works well in a demo.

Coralogix background

The company was founded in 2014 by CEO Guy Kroupp, chief product officer Ariel Assaraf and chief science officer Lior Redlus. There is experience at the Israeli 8200 army intelligence unit (sigint and crypto) amongst these founders (Assaraf) and also with Verint.

It says it has 12 employees, five of whom are based in the US, and 400 customers worldwide - not to be sneezed at. Two funding rounds have provided it with $3.5m - peanuts by many startup standards.

Assaraf said: "We built a product around operational IT, Test and Dev ... We store raw data on Elastic Search for visualising and querying."

Kroupp added: "We create few alerts - really low false positive rate." That's good to know.

The software is offered as a service and is available on AWS and Azure. Pricing is based on the amount of data customers send to Coralogix.

You can find out more here and get a 30-day free trial here.

Competitors include Logz.io, mentioned on The Register here, and Loggly, which received funding from Cisco back in 2013, before Coralogix was even founded.

Net:net

If your sysadmins of containerised systems are currently working in the dark when fault-fixing, then Coralogix claims its software will do the trick. In effect it retrospectively adds instrumentation to previously un-instrumented systems and gives your driving sysadmins a dashboard with which to better drive their containerised car. ®


Biting the hand that feeds IT © 1998–2017