Approaches to building the enterprise cloud

OK, we are going to say that 'agility' word


Sysadmin blog Data center technologies are constantly evolving, displacing their predecessors. Data center storage, and Hyperconverged Infrastructure (HCI) in particular, makes for a good example.

HCI has been around for almost a decade. We're well along the hype cycle. We've seen outlandish marketing claims, watched vendors IPO and be acquired. We've read the promises and the FUD. After all of that, HCI is no longer novel. It just is.

HCI has become simply one more tool in the toolbox. Rational administrators may disagree about where and when to apply it compared to the alternatives. But those who decry its very existence are fringe extremists to be placed in the same box as those railing against virtualization, or refusing to use x86 CPUs. At best, such arguments extrapolate from a niche requirement and try to apply it to everyone.

In surviving controversy, HCI isn't unique. Pick almost any commonly used technology and before it became a go-to tech there was a period where a significant chunk of administrators thought that it was evil.

There are very real benefits provided by HCI. One can start with a relatively small deployment and grow both storage and compute as needed, making the initial trial phase easier. This lack of requirement for expensive forklift-class buy-in makes sharding existing clusters or introducing new ones easy and cost effective.

This can be an enabler of choice as it can make working with multiple vendors easier. In addition to allowing for hardware from multiple providers to be purchased in more bite-sized chunks, the ability to carefully add small clusters and grow as needed will show great benefits as the software enabling hypervisor heterogeneity gets better.

All of this makes HCI a great enabler for private and hybrid clouds. To understand why, however, it helps to understand a bit about why HCI became a thing in the first place.

Before SANs

Before HCI there was the Storage Area Network (SAN). Before the SAN, each server had its own local storage. Each server having its own local storage was a problem. Workloads grew and evolved. They needed more capacity, they needed to be backed up, restored, and made resilient against hardware failure.

In some instances the capacity problem could be solved by adding another drive to a server – assuming the server had room for one (drive ports were scarce back then) and the operating system and application could handle data being kept on that other drive. Early Windows applications, for example, didn't much like things not being on C:\.

Along came RAID. In addition to offering new and interesting ways to expand drive capacity (at the very least, RAID cards usually solved the drive port problem), RAID solved the resilience problem by allowing for the loss of a single drive without the loss of data.
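The trick behind that resilience is worth a quick sketch. In single-parity schemes such as RAID 4/5, a parity block is the XOR of the data blocks; XOR-ing the surviving blocks with the parity reconstructs whatever a dead drive held. The snippet below is a toy illustration of the principle, not how any RAID controller is actually implemented:

```python
# Toy illustration of single-parity RAID: parity is the XOR of all
# data blocks, so any one missing block can be rebuilt from the rest.

def parity(blocks):
    """XOR a list of equal-sized blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# Three "drives" worth of data, plus one parity block.
drives = [b"AAAA", b"BBBB", b"CCCC"]
p = parity(drives)

# Drive 1 fails; rebuild its contents from the survivors plus parity.
rebuilt = parity([drives[0], drives[2], p])
assert rebuilt == b"BBBB"
```

Real arrays stripe data and parity across drives and handle far messier failure modes, but the arithmetic at the core is this simple.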

RAID had some downsides. It was expensive. It was complicated. It failed in mysterious and incomprehensible ways, often requiring a specialist to keep it humming along smoothly. Perhaps more importantly, organizations were becoming utterly dependent upon the applications servers ran, driving up demand for clustering and other high-availability technologies where local storage proved to be a roadblock.

By the turn of the millennium storage had become a real nightmare for data center administrators. It was right around this time that SANs took off.

From solution to obstacle

At first, SANs were great. Server administrators could stop over-provisioning storage in the servers they bought, or trying (often in vain) to upgrade them as needed. Instead, they would simply ask the storage administrator to assign them storage, or resize when things got tight.

Initially, this wasn't viewed as a problem. Even large organizations didn't stand up that many servers in a year, and server administrators already had to wait on the networking team before they could stand up workloads. What difference did adding a storage administrator to the paperwork really make?

Along came virtualization. Sysadmins gazed upon it and saw that it was good. vMotion – the ability to move workloads between physical hosts, provided you had shared storage – was a "killer app". Closely related was High Availability: the ability for another physical host in a cluster to restart a workload if the host upon which it was running failed. Again, this relied on shared storage.

The shared disk file systems used on SANs that were developed for clusters made virtualization sing. Sysadmins gazed upon this too and saw that it was good. Adoption of both virtualization and SANs soared. Soon, both became the default way of doing things. Everything else was "legacy": a niche looked on with distaste and, in time, even distrust.

SANs solved very real problems. The flaw in SANs was that they didn't scale particularly well. They were expensive, too: storage industry consolidation saw to that. But the real issue was that bit where server administrators had to go begging to the storage priests for their daily ration.

Business processes associated with storage as a discrete IT specialization domain worked just fine when an organization was lighting up 50 workloads a year. They fell apart when organizations started standing up 50,000.

Hyperconvergence

HCI solved the problems of centralized storage by putting the disks back into the individual servers. In order to preserve access to all the virtualization blue crystals that made modern data centers possible, a layer of software was added that shared access to those disks amongst all physical hosts in a cluster.

If virtualization was the de facto workload management tool of the day, and virtualization operates by clustering hosts for resiliency, it made sense to shove disks in those same hosts and lash together all the storage within that virtualization cluster into one shared pool.
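The pooling idea can be sketched in a few lines. The toy model below – names and mechanics are invented for illustration, and bear no resemblance to any shipping product – shows the essential move: every block written to the pool is replicated onto a second host, so the loss of any single host loses no data:

```python
# Toy model of HCI storage pooling: local disks on each host form one
# shared pool, and each block is replicated to a second host so any
# single host can fail without data loss. Purely illustrative.
import itertools

class Pool:
    def __init__(self, hosts):
        self.hosts = {h: {} for h in hosts}   # host -> {block_id: data}
        self._placer = itertools.cycle(hosts)

    def write(self, block_id, data, copies=2):
        placed = set()
        while len(placed) < copies:           # replicas on distinct hosts
            host = next(self._placer)
            if host not in placed:
                self.hosts[host][block_id] = data
                placed.add(host)

    def read(self, block_id, failed=()):
        for host, disks in self.hosts.items():
            if host not in failed and block_id in disks:
                return disks[block_id]        # any surviving replica will do
        raise IOError("block lost")

pool = Pool(["node-a", "node-b", "node-c"])
pool.write("vm1-disk0", b"data")
# node-a dies; the replica on another node still serves the read.
assert pool.read("vm1-disk0", failed={"node-a"}) == b"data"
```

Real HCI platforms layer on caching, rebalancing, erasure coding and much else, but replica placement across hosts is the heart of it.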

HCI moved control over storage to the virtualization administrator. More accurately, it moved control of that storage to the virtualization management application, typically the responsibility of the virtualization administrator. The distinction is important at scale; at scale, the virtualization administrator isn't creating workloads one at a time, but by using scripts that act on that virtualization management application to create them by the thousands.
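What that scripted, at-scale provisioning looks like can be gestured at with a short sketch. The endpoint, field names and helper functions below are hypothetical stand-ins, not any particular vendor's API:

```python
# Hypothetical sketch of bulk workload creation against a
# virtualization management application's REST API. Endpoint and
# payload fields are illustrative assumptions, not a real vendor API.
import json
from urllib import request

API_URL = "https://mgmt.example.com/api/vms"  # hypothetical endpoint

def vm_spec(name, cpus=2, memory_gb=8):
    """Describe one workload in the (assumed) format the API expects."""
    return {"name": name, "cpus": cpus, "memory_gb": memory_gb}

def submit(spec):
    """POST one spec to the management application (not invoked here)."""
    req = request.Request(
        API_URL,
        data=json.dumps(spec).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return request.urlopen(req)

# A thousand workloads from one loop -- no per-VM storage tickets.
specs = [vm_spec(f"web-{n:04d}") for n in range(1000)]
```

The point is less the plumbing than the workflow: the script talks to the management layer, and the management layer owns the storage, with no human storage administrator in the loop.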

The twin problems early HCI solutions sought to solve were the expense of SANs and the red tape traditionally imposed by those who oversaw them. As with SANs, the world changed shortly after HCI went mainstream, and that brings us to today.

Turn-key enabler

Virtualization was the killer app for SANs. SANs were useful before virtualization came along, but they were what made the most important features of virtualization possible. In a similar manner, HCI was useful before the public cloud got big, but HCI is really what makes modern turnkey infrastructure work.

The existence of the public cloud is the dominant factor in real world IT decision-making today. Like it or not, the fact that end users armed with a credit card can cut through all of IT's red tape and spin up workloads as they feel the need has changed the entire industry irrevocably.

Some organizations have been able to resist providing end users with on-demand infrastructure. Sometimes there are regulatory requirements that can be interpreted so as to veto end user demand. Sometimes incumbent administrators are simply entrenched enough in their organization's political power structures to be able to ignore end users.

As with virtualization, however, on-demand IT is becoming the new normal. Those who revile it are increasingly niche.

On-demand IT doesn't have to mean public cloud. It doesn't even have to mean private cloud; though as private cloud options become cheaper and easier to administer, they will likely also become the new normal. On-demand IT essentially means the ability to provision IT services when requested in a timeframe that doesn't send the user scurrying for a do-it-yourself alternative.

It is entirely possible to build that sort of solution using individual technologies. After all, just as SANs weren't the only way to make virtualization's key features work, it's really the management tools tying hypervisor, storage, and other layers together that enable today's on-demand IT.

The problem is that welding together discrete components never quite works as well as something designed from the ground up to work as a unified whole. If you source your storage from vendor A, your compute from vendor B, your management software from vendor C and so on, who is responsible for making sure it all works? If it breaks, does it actually get fixed, or dissolve into a cacophony of finger pointing?

For this reason, HCI has become a go-to technology. It removes a layer of red tape internal to the customer organization and it can remove at least one layer of vendor "partnership" from the creation of an off-the-shelf turn-key solution. It is the path of least resistance.

Choice, and clouds

SANs were around before they became cheap enough, easy enough and important enough to go mainstream. So was HCI. What HCI enables, however, are turnkey IT solutions that can start small (often as few as two or three nodes) and then grow as needed.

A remarkable amount of choice can be tucked under the HCI umbrella. Hybrid solutions for those with modest needs, all-flash offerings for the demanding sort. A team averse to the public cloud could pick up a turn-key private cloud to run a set of workloads for a specific campaign and have the whole thing fit in two rack units. A small business could run quite a bit on even the most modest of today's HCI solutions, adding nodes as they grow.

Unlike early HCI solutions, today's offerings can be quite dynamic. If you need more storage, add a storage-heavy node. If you need more compute, add one that focuses on that. Though I loathe the word "agility" in an IT context, that's exactly what is driving HCI into the mainstream. It's the very same requirement that made SANs data center superstars, back in their day.

2017 will be the year "enterprise clouds" go mainstream. Unlike their complex – and often fragile – precursors, enterprise clouds will focus on ease of use and simplicity. Not only for the end user via a self-service portal, but for the systems administrators operating the cloud.

HCI-based clouds can offer a great deal of choice and flexibility without the nagging feeling, which characterized many of the early private cloud solutions, that one is welding together a solution with hope and sheer force of will. HCI isn't the only way to achieve ease of use and simplicity in an enterprise cloud, but it is the easiest and cheapest way currently available. For that reason, it is likely to be the most popular underlying infrastructure choice.

Technologies don't win by meeting an objective, quantitative measurement. They win because they let us do more than we did before – usually because they help us overcome some level of bureaucracy that exists to make previous technologies manageable.

From local disks to SANs and back again; HCI brings us full circle, with a necessary twist. Say what you will about IT vendors, they do keep life interesting. ®

