Why hyper-converged gear needs to go the extra file: Merging blocks and filers to break out of the niche

And how might this be done?


Backgrounder Hyper-converged infrastructure (HCI) is a popular way to deal with complex server, SAN storage, and virtualization requirements with integrated, scale-out nodes that converge server compute, storage, and hypervisor technology in all-in-one clusterable elements.

The directly-attached storage (DAS) in each node is aggregated into a virtual SAN (storage area network) spanning the cluster – often using VMware’s VSAN. It basically forms a pool of storage coupled to a pool of compute resources for running workloads in virtual machines. Of course, other HCI hypervisors exist, such as KVM, and these also come with virtual SAN technology.

HCI suppliers of both shades have come to rely on virtual SAN software technology so much that it’s become a well-proven method of providing block-based storage to applications running in virtual machines (VMs) inside an HCI cluster. When I say both shades, I mean those who provide complete hardware and software products – such as Dell EMC with VxRail, HPE with SimpliVity, and Cisco with HyperFlex – and the software-only suppliers whose products run on commodity server and storage hardware.

HCI systems are not taking over from external storage, according to IDC's quarterly enterprise storage tracker, which compares total storage: SANs, NAS (network-attached storage), object stores, and so on. We can treat external storage (SANs, NAS, and object stores) as a percentage of all storage, and see how that percentage has changed over the past few quarters.

The external storage percentage appears to be holding steady, with no sign of an HCI incursion into its market space.

Research house Gartner expects HCI to grow at a CAGR of 48 per cent to hit $10bn by 2021. This is very good, though it does seem that HCI could be occupying something of a niche. To break out of this corner, it could use an extra added factor to accelerate its sales.

Could that be file-based storage – something enterprise on-premises IT shops employ in heaps? Just as SANs provide shared access to external block storage, NAS arrays provide shared access to external file storage. An HCI niche breakout could come if the storage technology that the HCI world initially ignored, file access, were to be added to the HCI mix and deployed widely.

How might this be done?

A block-based SAN system deals in blocks of data on the underlying physical storage medium: flash-based solid-state drives (SSDs) or hard disk drives. It addresses them in groups, called logical unit numbers or LUNs, and applications can connect to and access these LUNs. A file-based NAS also deals in blocks of data on the underlying physical storage medium, but organizes them into groups called files. These are themselves organized into a hierarchy of folders that exist in a file system. System administrators and users "mount" file systems and then access the files within them. They will use either the NFS or SMB/CIFS protocols to do so.
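The difference between the two views can be shown with a toy model. The sketch below is purely illustrative and is not how any real SAN or NAS product is implemented: `BlockDevice`, `TinyFileSystem`, and the 4,096-byte block size are all hypothetical names and values chosen for the example. It shows a LUN-like device that only knows numbered blocks, and a file layer on top that groups those blocks into named files.

```python
BLOCK_SIZE = 4096  # bytes per block; an illustrative value, not a standard


class BlockDevice:
    """Hypothetical LUN-like device: fixed-size blocks addressed by number."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, n, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data larger than one block")
        # Pad short writes with zero bytes so every block has the same size.
        self.blocks[n] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, n):
        return self.blocks[n]


class TinyFileSystem:
    """Hypothetical file layer: maps path names to lists of block numbers."""

    def __init__(self, device):
        self.device = device
        self.free = list(range(len(device.blocks)))
        self.index = {}  # path -> (length in bytes, [block numbers])

    def write_file(self, path, data):
        if path in self.index:
            raise FileExistsError(path)  # overwrite is left out of this sketch
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        if len(chunks) > len(self.free):
            raise OSError("not enough free blocks")
        used = []
        for chunk in chunks:
            n = self.free.pop(0)
            self.device.write_block(n, chunk)
            used.append(n)
        self.index[path] = (len(data), used)

    def read_file(self, path):
        length, used = self.index[path]
        raw = b"".join(self.device.read_block(n) for n in used)
        return raw[:length]  # strip the padding added to the last block
```

An application using the block view would call `read_block` and `write_block` directly and keep track of the layout itself; an application using the file view only sees paths such as `/share/report.txt` and leaves the block bookkeeping to the file system.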

The two access protocols, block and file, are separate and distinct. Unified external arrays, such as Dell EMC’s Unity and NetApp’s ONTAP, provide both block and file access but keep the two pools of data separate. Nutanix's HCI offerings also provide block, file – via SMB and NFS – and, coming soon, object storage. Generally speaking, truly unified file and block HCI deployments are uncommon.

So, how can file-based access be added to an HCI storage infrastructure? It would have to be an addition, an overlay, to an existing HCI or a complete, ground-up redesign. Alternatively, file storage could be added to an HCI system by linking it to a NAS array. Hold that thought for a moment.

Adding file storage as an overlay to an existing HCI with virtual SANs means partitioning some of the virtual SAN space and accessing it with file protocols. We can envisage a VM in the HCI providing file storage to other VMs running applications needing file storage.

The file system virtual machine would take the VMDK (Virtual Machine Disk) presented to it by VMware’s vSphere, for example, and create a NAS for delivering NFS and SMB/CIFS shares to other virtual machines. It would provide a mountable file system for applications, supporting the NFS and SMB/CIFS protocols as appropriate.

The file system virtual machine is basically software-defined storage (SDS) and would best come from an SDS vendor, not a vendor providing SDS that is actually linked to or restricted to that vendor’s hardware. That way a chosen HCI file system SDS product would, in theory, run on any HCI system.

Such an SDS might also provide object storage, enabling a convergence of block, file, and object storage on one underlying physical resource: the HCI nodes’ converged DAS capacity.

Marrying it all up

Management of this lot is clearly a factor. It’s not at all satisfactory to try to marry the file and block worlds while having separate management environments.

Let’s stick with VMware. Ideally, the file system virtual machine SDS product would integrate cleanly with VMware’s vCenter console to deliver unified management without the need to change any of the existing SAN/HCI systems. To achieve this, the HCI system with file services would need to integrate with the general file services environment within which NAS filers operate, meaning the basic NFS and SMB protocols and the wider Windows file infrastructure.

What we’re talking about here is using Microsoft’s standard Active Directory (AD) and other Lightweight Directory Access Protocol (LDAP) tools – bedrocks in terms of authentication and authorization in the enterprise – but, also, single sign on (SSO) and the Storage Spaces Direct (S2D) facility.

HCI file system users could be synced with an enterprise’s internal LDAP/AD directory so the right users are provisioned and managed on each server automatically. You can also create ID mappings for AD users and groups, helpful for NFS clients in mixed NFS and SMB environments.
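To make the ID-mapping idea concrete, here is a minimal sketch, assuming a simple scheme in which the Unix ID is derived from the last component of the AD security identifier (the relative ID, or RID) plus the start of a reserved range. Some file servers offer RID-based mapping along these lines, but the function name `sid_to_unix_id`, the range values, and the example SIDs are hypothetical and not taken from any particular product.

```python
def sid_to_unix_id(sid, domain_sid, range_low, range_high):
    """Map an AD SID to a Unix UID/GID as range_low + RID (illustrative scheme)."""
    prefix = domain_sid + "-"
    if not sid.startswith(prefix):
        raise ValueError("SID does not belong to the expected domain")
    rid = int(sid[len(prefix):])  # the RID is the final numeric component
    unix_id = range_low + rid
    if unix_id > range_high:
        raise ValueError("RID falls outside the configured ID range")
    return unix_id


# Made-up domain SID and ID range, purely for illustration.
DOMAIN_SID = "S-1-5-21-1111111111-2222222222-3333333333"
uid = sid_to_unix_id(DOMAIN_SID + "-1104", DOMAIN_SID, 100000, 199999)
```

Because the mapping is deterministic, every node that shares the same range settings arrives at the same UID for a given AD user, which is what lets NFS clients and SMB clients agree on file ownership.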

Alternatively, enterprises could employ user authentication and permission controls via SSO integrations that could be easier to provision, use, and administer. This may involve layering third-party software on top of existing cloud service interfaces.

Never going to let you down

Another benefit to AD integration is the relative ease of building high-availability storage systems based on local storage resources using the Storage Spaces Direct tool in Windows Server 2016. Servers and HCI clusters can be added easily to domains via AD, with privileged access management (PAM) for AD providing an inbuilt solution for storage and network administration that eliminates the need for third-party tools.

In general, things like AD and LDAP, user management, authentication, and tight permission management are “must-haves” when it comes to properly replacing traditional NAS appliances with a file services facility that’s integrated with HCI systems.

Other benefits of the SDS on HCI approach include embedded tools and features that support encryption, backup/disaster recovery, high availability, and efficient storage migration between different on- and off-premises hosting facilities.

There can be caching algorithms for higher performance, lifecycle management, cloning, provisioning, snapshotting, data efficiency services such as inline data deduplication, replication, copy-on-write file system, and 256-bit checksumming of data.
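Two of those features, inline deduplication and 256-bit checksumming, fit together naturally, and a short sketch shows how. This is a toy model under stated assumptions: it uses SHA-256 as the 256-bit checksum (the article does not say which algorithm any given product uses), fixed 4,096-byte chunks, and a hypothetical `DedupStore` class that keeps everything in memory.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that keeps each unique chunk only once."""

    def __init__(self):
        self.chunks = {}    # checksum -> chunk bytes
        self.refcount = {}  # checksum -> number of references to the chunk

    def put(self, data, chunk_size=4096):
        """Split data into chunks, store new ones, and return the checksum list."""
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()  # 256-bit checksum
            if digest not in self.chunks:
                self.chunks[digest] = chunk  # first sighting: store the bytes
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the data, verifying each chunk against its checksum."""
        out = []
        for digest in recipe:
            chunk = self.chunks[digest]
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise IOError("checksum mismatch: chunk is corrupt")
            out.append(chunk)
        return b"".join(out)
```

Writing three identical chunks followed by one different chunk stores only two chunks, while the returned list of four checksums is enough to rebuild the original data and to detect silent corruption on the way back out.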

As far as RAID is concerned, the HCI system should be responsible for hardware management and hardware resiliency, allowing the SDS file services system to focus on its file services without having to worry about how to handle underlying hardware failures. Having the file services layer do software-based RAID would adversely affect performance and be a retrograde step.

Some systems also supply capacity provisioning templates, analytics, and orchestration capabilities to support flexible data centre workflows – enabling the addition of NFS/SMB file services to virtual storage area networks under a single web client interface.

Combining file and block access in an HCI system gives enterprises a single virtual pool of accessible storage resources. It can be achieved inside a unified management domain as well.

That means the existing simple, clean HCI physical infrastructure scheme can be applied to both block and file storage environments, lowering overall storage costs because separate file storage acquisition, management, and support costs are avoided.

This is an attractive proposition for HCI adherents and, with widespread adoption, could perhaps enable HCI to break out from the market niche it occupies and grow even faster than Gartner’s gurus have anticipated. ®

Biting the hand that feeds IT © 1998–2018