
Want to know the key to storage happiness? It’s analytics. Yeah, it’s that simple

Never has the phrase 'Jack of all trades' been more apt

Different types of analytics mean different approaches

The easiest way to explain why it is difficult to analyse content and container at the same time is to use two startups, Nimble Storage and DataGravity, as examples. The first has one of the best implementations of infrastructure analytics, while the other is doing an amazing job with stored data.

InfoSight (the name of the Nimble product) collects 70-100 million sensor data points per day from each array. Another important point about InfoSight is that it also has hooks into VMware vCenter and can get information on everything from the physical disk up to the metrics of each individual VM.

And, this is the important part, it is all sent to the cloud. In fact, all the magic happens there. All your data (metadata, actually) is stored together with data coming from all other Nimble customers, building a huge Big Data repository.

This means that InfoSight can show detailed graphs of your system compared with the rest of the installed base. This comparison is fundamental for analysing trends or getting tips about best practices, for example.
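The fleet comparison described above can be sketched in a few lines. This is a hypothetical illustration of how a cloud analytics backend might place one array's metric within the whole installed base; the function name and the latency figures are invented for the example, not taken from Nimble's product.

```python
# Hypothetical sketch: rank one array's metric against the installed base,
# in the style of InfoSight's fleet-wide comparisons. Names and numbers
# are illustrative only.

def fleet_percentile(value: float, fleet_values: list[float]) -> float:
    """Return the percentage of the fleet whose metric is below `value`."""
    if not fleet_values:
        raise ValueError("fleet data required")
    below = sum(1 for v in fleet_values if v < value)
    return 100.0 * below / len(fleet_values)

# Average read latency (ms) reported by every array in the installed base.
fleet_latency = [0.4, 0.6, 0.7, 0.9, 1.1, 1.3, 2.0, 3.5]

# Your array is slower than 75% of the fleet: a candidate for a
# best-practice tip or a proactive support call.
print(fleet_percentile(1.5, fleet_latency))  # 75.0
```

The same percentile, computed across millions of data points per day, is what turns raw telemetry into the trend analysis and proactive support the article describes.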

At the same time, a cloud-based approach makes proactive support much more reliable and efficient, because many fault conditions can be discovered before they become dangerous.

Nimble’s approach has been followed by many others (practically every startup building a primary storage system). There are some important reasons for building an analytics engine like this:

  • It improves the quality of support services. Failure prediction is good for end users, but it is even better for a startup with a small support team.
  • It does not impact storage system performance. We are talking about primary storage here; the system only sends metadata to the cloud and everything is computed without touching your original data volumes.
  • The startup gets amazing real-time feedback, which is particularly useful for quicker product improvement. The amount of collected data makes it easier to develop features by looking at what is measured in the field and at user behaviour (even when users can’t tell you exactly what they do with their systems).

DataGravity takes the opposite approach. It crawls your data (locally) to find information that is not easy to access any other way. Whether data is stored in a VM, files or blocks, it can find patterns and single strings, discover user behaviour, see who did what and when, and much more. There is a terrific video with a demo of the latest features, recorded during Tech Field Day at VMworld, that shows much of the potential of this solution.
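The local crawling idea can be made concrete with a toy example. This sketch walks a directory tree, looks for a sensitive-looking pattern (something shaped like a credit-card number) and records which file held it and when that file was last modified; it is an illustration of the principle, not DataGravity's implementation, and the regex is a deliberate simplification.

```python
# Toy sketch of DataGravity-style local content crawling: scan files for
# a pattern and report where it was found and when the file last changed.
# Illustrative only; a real product would also track users, permissions
# and access history.
import os
import re
import time

CARD_RE = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")

def crawl(root: str):
    """Yield (path, last_modified, match) for files containing the pattern."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue  # unreadable file: skip rather than fail the crawl
            for m in CARD_RE.finditer(text):
                mtime = time.ctime(os.path.getmtime(path))
                yield path, mtime, m.group()
```

Running this over a share immediately answers questions like "which files contain card numbers and when were they last touched?", which is exactly the kind of information that is hard to get from the storage layer in any other way.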

In this case, the system reserves part of its resources (a second controller and a particular disk organisation) to do all its magic. Doing all this activity on active data could affect front-end performance, and at the same time, sending all your data to a cloud-based analytics engine does not make sense on many levels.

Here too, DataGravity is leading but it is not alone; this approach has been followed by some startups, usually those proposing large scale-out NAS/object systems. The reasons behind this type of design include:

  • Having more control over data and information for security, auditing, regulation or policy reasons.
  • Providing users in your organisation with modern tools that resemble what they already use, like a Google-style search engine. Searching your storage as you do with Google is intriguing, but some vendors are also developing APIs and specific query languages that open up the possibility of building interesting integrations with other applications.
  • Building new data protection schemes based on the fact that some (or all) data is already copied to a safe part of the array for analytics, and you also have information on how that data is actually used and how important it is.
  • Building new features by analysing how data is stored and moved around.
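The "search your storage like Google" idea mentioned above rests on a classic structure: an inverted index mapping each word to the files that contain it. The minimal sketch below shows only the principle; real products expose this through APIs and query languages, and the file names and query here are invented for the example.

```python
# Minimal inverted-index sketch of Google-style search over stored files.
# Toy example: real systems add ranking, permissions and incremental updates.
from collections import defaultdict

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of file names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, *words):
    """Return files containing all of the given words (an AND query)."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()

# Hypothetical file contents standing in for crawled storage.
docs = {"q3.txt": "quarterly revenue report",
        "hr.txt": "quarterly staffing report"}
idx = build_index(docs)
print(search(idx, "quarterly", "revenue"))  # {'q3.txt'}
```

A vendor-specific query language is essentially a richer front end to a structure like this, which is why it can also serve as an integration point for other applications.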
