Original URL: http://www.theregister.co.uk/2011/08/18/nutanix_storage/

A first glance into Nutanix storage

Once over lightly

By Chris Mellor

Posted in Storage, 18th August 2011 07:52 GMT

What can we expect from the Nutanix Complete Cluster in a storage sense? Here's an instant review of what the documentation tells us.

Context

The Nutanix product does exactly what HP's P4000 VSA does, only more so and in a full-bodied way. It virtualises direct-attached storage (DAS) across servers to function as a storage area network for VMs in those servers. It uses dedicated flash for metadata and active data, disk drives for less active data, and 10gigE as a cluster interconnect.

The trend of bringing storage and compute resources closer together was pioneered, among others, by Sun, with its Honeycomb NAS product. The trend of virtualising DAS into a SAN was pioneered by iSCSI SAN pioneer LeftHand Networks and its Virtual Storage Appliance (VSA), not to be confused with VMware's VSA.

Nutanix is not alone, by the way, in creating a virtual SAN out of server flash and DAS disks. StoneFly has a virtual SAN appliance that uses Fusion-io PCIe flash drives plus the host server's hard disk drives.

It is ironic that Nutanix is using EMC-owned software – VMware – to build compute-plus-storage blocks aimed at replacing EMC VMAX and VNX SAN storage arrays. Interestingly, EMC could do a reverse-Nutanix by running app VMs on spare VMAX engines... in theory.

Three storage tiers

Nutanix says it has three kinds of storage media per node: 320GB of Fusion-io PCIe flash, a 300GB Intel SATA-interface solid state drive (SSD), and five 1TB, 7,200rpm Seagate disk drives.

Why does it have two kinds of flash media? A Nutanix spokesperson said: "The 300GB Intel SSD is the boot SSD for each host. It also serves as swap space in case of high density VDI workloads. The PCIe SSD is the main workhorse of the system - the HOT (Heat Optimized tiering) Cache, Flash Store, metadata sits on the PCIe SSD."

Checking Fusion-io product specs, there are two products that could be used: a 320GB 2-bit multi-level cell (MLC) ioDrive, or a 320GB faster single-level cell (SLC) ioDrive Duo. It's actually the MLC ioDrive.

Checking Intel's SSD products, there appear to be two choices as well: the 320 – a 300GB 2-bit MLC desktop/notebook SSD – or the 710, again a 2-bit MLC product but using better-class NAND. Both have a 3Gbit/s SATA interface. Nutanix says the Intel 320 SSD is used.

We're told the disk drives are 1TB, 7,200rpm SATA drives from Seagate. This is a 1TB Constellation drive according to Nutanix, a 4-platter drive as we understand it.

Flash cache use

Nutanix technical documentation (PDF) says its controller software, which runs as a VM, stores VM metadata and primary VM data "both in a distributed cache for high-performance and in persistent storage for quick retrieval". That suggests two tiers of flash.

The flash also stores Nutanix SOCS (Scale-Out Cluster Storage) metadata, with SOCS providing the pooling of storage across server nodes and the presentation of this as iSCSI-accessed virtual disks (vDisks) to VMs. The vDisks can span both flash and disk drives in a node, and multiple Nutanix server nodes as well. Every node runs a SOCS controller inside a VM.

LUNs are not mentioned by Nutanix and vDisks appear to be dedicated to individual VMs.

Flash storage is set aside for I/O-intensive functions and includes "space-saving techniques that allow large amounts of logical data to be stored in a small physical space". These are unspecified and could include compression and/or deduplication.

The software moves less frequently accessed data off the flash and onto the SATA hard disk drives. Administrators can "bypass SSDs for low-priority VMs".

Nutanix divides its storage into FlashStore and DiskStore, and there is HOTcache, with HOT standing for Heat-Optimised Tiering and the cache being SSDs. When a VM writes data, it goes into HOTcache first. A later background process passes it to SOCS, which does the cross-server node storage pooling and virtualisation.

We're told HOTcache uses a sequential data layout for high performance "even if workloads from VMs get mixed into a random workload". It keeps one data copy on the local SSD and a second copy on a different node to cover data loss.
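To make that write path concrete, here is a minimal sketch of the idea as we read it, not Nutanix's code. The `Node` class, the `write` function and the round-robin choice of the second node are all our own assumptions. The point is only that random-offset writes from different VMs are appended in arrival order to a sequential log, and each write is copied to one other node.

```python
class Node:
    """Hypothetical stand-in for one server node's write cache."""

    def __init__(self, name):
        self.name = name
        self.log = []  # append-only list standing in for a sequential SSD log

    def append(self, vdisk, offset, data):
        # Writes land in arrival order, whatever offset the VM asked for.
        self.log.append((vdisk, offset, data))
        return len(self.log) - 1


def write(local, peers, vdisk, offset, data):
    """Append to the local log, then copy the write to one other node.

    Picking the peer round-robin is an assumption made for illustration.
    """
    if not peers:
        raise ValueError("need at least one other node for the second copy")
    position = local.append(vdisk, offset, data)
    remote = peers[position % len(peers)]
    remote.append(vdisk, offset, data)
    return position, remote.name
```

Two VMs writing at unrelated offsets end up at log positions 0 and 1 on the local node, with an identical copy on the peer.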

If data becomes hot again it is moved back into the FlashStore.

Curator is Nutanix software providing a distributed data maintenance service. It is a MapReduce-based framework that executes data management functions in the background and, among other things, moves "cold data to lower tiers (for Heat-Optimised Tiering)". Being pedantic, this means more than one lower tier, and so suggests it moves data from the top tier – the Fusion-io flash – to the second tier – the Intel SSD – and then to the third or bottom tier – the Seagate SATA disks.
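A rough sketch of what heat-based tiering of this kind could look like follows. It is an illustration under our own assumptions, not Curator itself: the `Extent` type, the idle-time threshold and the three-step tier order (which is our inference from the documentation, as noted above) are all hypothetical.

```python
from dataclasses import dataclass

# Tier order assumed from our reading of the documentation.
TIERS = ["pcie_flash", "sata_ssd", "sata_disk"]


@dataclass
class Extent:
    """Hypothetical unit of data that is moved between tiers."""
    name: str
    tier: int = 0             # index into TIERS; 0 is the fastest tier
    last_access: float = 0.0  # time of last access, in seconds


def demote_cold(extents, now, cold_after_s):
    """Move every extent idle for longer than cold_after_s down one tier."""
    moved = []
    for extent in extents:
        idle = now - extent.last_access
        if idle > cold_after_s and extent.tier < len(TIERS) - 1:
            extent.tier += 1
            moved.append(extent.name)
    return moved


def touch(extent, now):
    """Record an access and move the extent back to the top tier."""
    extent.last_access = now
    extent.tier = 0
```

Each background pass pushes idle data one step down, and an access pulls it straight back to flash, which matches the "hot again" behaviour described earlier.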

Is this better than a real SAN?

Nutanix says its Complete Cluster takes up less space and is more scalable and more flexible than a standard SAN. It is contrasting its product with both iSCSI and Fibre Channel SANs and suggesting that users not bother with physical networked iSCSI or Fibre Channel anymore, or even with FCoE.

One limitation is that the Nutanix product is server-heavy for bulk data storage. If you are running a big data app as a single VM and it is addressing more than 5TB of data then you have to buy another node – although you don't need the compute CPU resources of that node, only the storage handling cores out of the 12 available.
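The arithmetic behind that is simple enough to sketch. The five 1TB drives per node come from Nutanix; the function name and the `copies` parameter are ours, and the sum ignores flash capacity, formatting overhead and any space-saving techniques, so treat it as a back-of-envelope estimate only.

```python
import math

DISKS_PER_NODE = 5  # five Seagate drives per node, per Nutanix
DISK_TB = 1         # 1TB each


def nodes_needed(data_tb, copies=1):
    """Estimate how many nodes are needed to hold data_tb of data on disk.

    copies is a hypothetical knob for keeping more than one copy of the data.
    """
    if data_tb <= 0:
        return 0
    raw_per_node_tb = DISKS_PER_NODE * DISK_TB
    return math.ceil(data_tb * copies / raw_per_node_tb)
```

On these assumptions, `nodes_needed(6)` comes out at two nodes and `nodes_needed(20)` at four, each bringing a full set of CPU cores along whether the application needs them or not.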

There is no such thing as a Nutanix expansion pack, adding, say, a boxful of disks to provide more storage capacity. The idea of having a Nutanix node supporting 20TB or more of capacity would give the product a much wider appeal.

This limited storage capacity is exacerbated by the use of 1TB 2.5-inch drives. Using 3TB 3.5-inch drives would provide more capacity, if you could cram enough of them into the Nutanix enclosure. For the moment though, Nutanix's approach is one node type fits all.

We are told that Nutanix users get a 40 to 60 per cent CAPEX reduction through not buying a physical SAN infrastructure. The management should be better since compute and storage resources are managed through the same pane of glass, so to speak.

So the generic pitch is: the same performance as a Fibre Channel SAN for less cash and more efficient use of data centre space. It all seems promising and worth a look through Nutanix sales, there being no channel. ®