NetApp HCI: More converged than hyperconverged?
Separate compute and storage environments inside
Analysis I have come to the conclusion that NetApp's new HCI system is more converged than hyperconverged: in essence the boxes just add compute nodes and networking to SolidFire (storage) nodes.
HCI or CI?
Let's row back a little and use IDC's definition of HCI as the starting point for our thinking.
Hyper-converged infrastructure (HCI) systems collapse core storage and compute functionality into a single, highly virtualized solution. A key characteristic of hyperconverged systems that differentiates these solutions from other integrated systems is their scale-out architecture and their ability to provide all compute and storage functions through the same x86 server-based resources.
Does NetApp's HCI do this?
In its June 2017 launch of NetApp HCI, the vendor did not specify the CPUs or the scaling limits, so there is no sensible way of assessing the size and number of applications the system can maximally support. Physically, though, the system is based on 2U shelves sub-divided into four half-width bays, populated with either compute or storage trays. VMware vSAN is not used to provide the virtualized cross-node block storage.
The system is conceptually based on a scale-out SolidFire Elements all-flash array built from a 1U half-width all-flash node with six SSDs, and running the Elements OS. To this is added a set of 1U half-width x86 compute nodes which run ESXi plus an Elements OS plug-in. The compute nodes don't have local, direct-access storage – or do they?
John Rollason, NetApp
By way of an email interview, I asked John Rollason, senior director, product marketing, Next Generation Data Center, at NetApp, for some clarification.
Rollason: "The initial compute models being offered do not have any local storage other than internal m.2 boot drives. The compute hardware has the capacity to have local storage added (up to six drives) and we might leverage those in future variations of the product."
It certainly looks like an HCI system physically, with its 4-bay, 2U chassis. However, we could think of the architecture as two parallel and side-by-side stacks of half-width storage nodes and compute nodes, with the storage nodes providing a SAN to the compute nodes.
The separate storage and compute infrastructures inside it imply that NetApp's HCI is actually a converged infrastructure (CI) system.
Rollason: "We’ve discussed this at length with many partners, analysts, and customers – they all confirm that the key to hyperconvergence is simplicity. First generation HCI entrants made architectural design choices that limit their capabilities, and NetApp HCI took a different approach to provide guaranteed performance for mixed workloads, flexibility of deployment, and automated infrastructure at enterprise scale. All these features can easily be consumed in this simple HCI model where it makes business sense for the customer, while still integrating across the entire portfolio of our data fabric solutions. We have changed the game by redefining the expectations for HCI capabilities, all while providing complete simplicity of acquisition, deployment and management."
There can be small, medium and large storage and compute nodes. Neither the storage node CPUs nor the compute node CPUs are specified. NetApp said the small, medium and large compute node configs have 16, 24 and 36 cores respectively for VMs, leaving The Reg Storage Desk to wonder whether more cores are actually used for other system work, such as running the ESXi hypervisor and/or the Elements OS plug-in.
Rollason: "Those core counts are accurate. There are not more cores being used. We are just saying they are for VMs because unlike many other HCI offerings all the cores in the compute nodes are available for VM use."
NetApp's HCI system ships in the fourth quarter, which will be after the launch of Intel's Skylake CPUs and Purley platform. The Reg Storage Desk suspects the NetApp HCI compute nodes will use Skylake servers - but what flavour?
Rollason: "The initial offering is Broadwell CPUs. Skylake would have delayed our time to market. The compute nodes will certainly be updated to Skylake with the usual Intel uptick in CPU and less heat/power consumed."
The server manufacturer is not revealed, but The Reg Storage Desk thinks it is Supermicro. Without knowing the type of servers and the storage requirements per average VM, it is impossible to assess how many VMs can be supported by a particular NetApp HCI configuration, and that in turn means that at the time of writing there can be no meaningful comparisons with Nutanix, VxRail or other HCIAs (hyperconverged infrastructure appliances).
Mellor: "How many VMs are supported by each type of compute node?"
Rollason: "We have internal numbers we use to help with sizing exercises within engineering. We don’t state these as marketing stats though as it’s always based on a set of assumptions about what a “VM” is. I’ll just say we can run just as many small VMs, or just as few massive VMs, as any ESX server with equivalent CPU and RAM."
SolidFire arrays notionally scale to 100 nodes. Does NetApp HCI scale to 100 storage nodes plus some number of compute nodes? One hundred maximally configured storage nodes would provide 100 x 11.4TB = 1,140TB of raw capacity, or 4,400TB of effective capacity. That ought to be capable of supporting an awful lot of VMs, but how many? NetApp: compute node details, please.
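As a sanity check, the scaling figures above can be run through a quick back-of-envelope calculation. The inputs are the article's numbers, not official NetApp specs: 11.4TB raw per maximally configured storage node, the 100-node SolidFire scale-out limit, and the stated 4,400TB effective figure, which implies a roughly 3.9:1 efficiency multiplier from deduplication, compression and thin provisioning.

```python
# Back-of-envelope check of the capacity scaling quoted above.
# Assumptions (taken from the article, not from NetApp datasheets):
#   - 11.4 TB raw per maximally configured storage node
#   - 100-node SolidFire scale-out limit
#   - 4,400 TB effective capacity as stated, which implies the
#     efficiency multiplier computed below
RAW_TB_PER_NODE = 11.4
MAX_STORAGE_NODES = 100
STATED_EFFECTIVE_TB = 4400

raw_tb = RAW_TB_PER_NODE * MAX_STORAGE_NODES      # 1,140 TB raw
efficiency = STATED_EFFECTIVE_TB / raw_tb         # ~3.9:1 implied efficiency
effective_tb = raw_tb * efficiency                # 4,400 TB effective

print(f"raw: {raw_tb:,.0f} TB, effective: {effective_tb:,.0f} TB "
      f"(~{efficiency:.1f}:1 efficiency)")
```

The missing variable remains the compute side: without a VMs-per-compute-node figure, raw and effective terabytes alone say nothing about how many VMs a full configuration supports.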
Mellor: "Is this scaling correct?"
Rollason: "Yes. The HCI storage capabilities exactly match SF AFA scale. The ESX cluster sizes are only limited by ESX maximums. The management software for HCI can manage multiple storage and ESX clusters within the same solution and UI. That combines to allow HCI scale to be in theory limitless. Our general practice is to keep our quality environments on pace with the largest configurations that customers deploy so we are not playing with theory."
Mellor: "Is there a maximum node count irrespective of the number of individual storage and compute nodes?"
Rollason: "In real life the thing that would limit the scale of the solution would be VMware vCenter’s maximum capabilities. NetApp HCI architecture has nothing that limits it from continuing to expand in a single solution so long as a vCenter Server can handle it."
Mellor: "A compute node can have up to 768GB of memory 'for VMs'. Is there additional memory for ESXi and other system software? What is the total raw RAM per compute node?"
Rollason: "Essentially same answer as cores above. That is the memory of the system we are just trying to be clear that it is all available for VMs which is not the case for many HCI solutions."
Mellor: "How will HCI systems be sized for customer workloads?"
Rollason: "As with all NetApp solutions, comprehensive sizing tools will be made available to partners and customers for NetApp HCI."
Mellor: "Will other hypervisors be supported in the future?"
Rollason: "While we cannot comment on roadmap items, we are excited about the flexibility of what we can do with the underlying technology. Today our NetApp SolidFire customers deploy our array technology utilizing VMware, OpenStack, and Containers for next generation use cases such as DevOps and End User Computing. We will remain focused on the needs of our customers as we continue to innovate and change the game with NetApp HCI."
Mellor: "Is NetApp's HCI a CI, really?"
Rollason: "First generation HCI systems made architectural design choices that limit their capabilities," meaning that they integrated compute, storage, hypervisor and networking in one server-based system node which scaled out, node by node.
For example, Nutanix and SimpliVity. This is what we wrote about SimpliVity's OmniCube in 2012:
The box is a 2U rack enclosure containing server, storage and networking hardware plus software that presents it as a virtualized data centre-in-a-box resource.
Rollason: "NetApp HCI took a different approach to provide guaranteed performance for mixed workloads, flexibility of deployment, and automated infrastructure at enterprise scale. All these features can easily be consumed in this simple HCI model where it makes business sense for the customer, while still integrating across the entire portfolio of our data fabric solutions. We have changed the game by redefining the expectations for HCI capabilities, all while providing complete simplicity of acquisition, deployment and management."

In other words, whether a system is HCI or not depends upon it having "complete simplicity of acquisition, deployment and management".
In that sense FlexPods or VxBlocks could be classed as HCI systems were they to have similar "complete simplicity of acquisition, deployment and management."
This sounds to The Reg Storage Desk like NetApp is saying: "Yes, it is a CI system but it has the simplicity of acquisition, deployment and management of an HCI system, so it is an HCI system."
That may well be the case, and NetApp may well obtain separate compute and storage scaling from it, but under the covers it is a converged infrastructure system, not an HCI system. Does this matter? Readers, what do you think? ®