Cloud box does virtualization sans SAN

Practice safe SOCS

Cloud-computing appliance maker Nutanix is tackling a problem that has dogged the deployment of virtual servers and desktops: all the key hypervisors require storage area networks and centralized storage.

To overcome this limitation, Nutanix has created a virtualized controller that implements a clustered file system and embeds it in a cluster-compute appliance – the compute nodes essentially become a virtual SAN.

The beauty of the Nutanix Complete Cluster, says company cofounder and CEO Dheeraj Pandey, is that the server virtualization hypervisors that run atop the appliances still think they are talking to a SAN – all the nifty high-availability, snapshotting, and live-migration features baked into these hypervisors and their management consoles continue to work just as they did before.

By banning the SAN, says Pandey, Nutanix can simplify the rollout of compute clouds at large enterprises that are familiar with complexity and high cost, and therefore have been happy – however begrudgingly – to invest in SANs. In addition, it can make cloudy infrastructure more affordable for small and medium businesses that want high-end virtualization features that used to require SANs.

Nutanix was founded in 2009 by file-system experts from Google and cluster experts from Oracle. Pandey managed the early incarnations of the Exadata product and also created the storage engine for Oracle's database. More recently, he was vice president of engineering at Aster Data Systems, which was acquired by Teradata in March of this year for $263m so the data warehouser could get its mitts on the company's nCluster hybrid row-column database and SQL-MapReduce big-data chewer.

Nutanix cofounder Mohit Aron also hails from Aster Data – he was the chief architect at the firm and did a lot of the work on nCluster. Prior to his stint at Aster, he was at Google, leading the design and development of the Google File System, the original incarnation of Google's distributed file system that supported its MapReduce big-data crunching techniques.

Pandey and Aron were joined by Ajeet Singh as the third cofounder and the company's chief products officer. Also from Aster Data, where he was director of product management, Singh had previously been part of Oracle's early cloud computing efforts. The three cofounders got their seed funding from private investors in May 2010, and pulled in $13.2m in Series A funding in April 2011 from Lightspeed Venture Partners and Blumberg Capital.

How to fool an unsuspecting server

The Nutanix Complete Cluster begins with the basic building block in today's data center: a two-socket x64 server. Nutanix currently sources two-socket tray servers from Dell and Super Micro, which cram four nodes into a 2U rack-mounted chassis – the company plans to source machines from HP once it delivers on-board 10 Gigabit Ethernet ports.

Pandey says that Nutanix only supports its software stack on select hardware configurations because the company has to do a lot of tuning on the server, flash storage, and disk storage that goes into its appliances – customers can't just run the Nutanix stack on whatever servers they have lying around their data center.

Each server node in the cluster is configured with two six-core Xeon 5600 processors, with eight cores allocated to run hypervisors and virtual machines, and four cores allocated to run the Nutanix virtual storage controller, called Scale-Out Converged Storage – SOCS for short. This controller virtualizes a pool of Intel and Fusion-io solid state disks and Seagate SATA drives, and presents virtual machines with block and file I/O access to data spread across these disks inside the cluster.

"The architecture is hypervisor agnostic," explains Pandey, with VMware's ESXi 4.1 hypervisor being the first one to get support on the cloud appliance. "ESX thinks it is talking to SAN storage and it is not."

The Nutanix Complete Cluster can fool multiple servers into thinking they're connected to a SAN

The storage cluster implemented by SOCS on the compute nodes uses 10GE ports to link the nodes to each other, and has Gigabit Ethernet links to provide access to VMs and their workloads. The 10GE link is necessary to make use of ESXi hypervisor features such as live migration, high availability, fault tolerance, and distributed resource management, which create a lot of chatter on the network. On a SAN, you can just flip some pointers to make a VM's file point to a different physical server during a live migration, but on a Nutanix appliance, you need to move data.

Stirring up some secret HOT sauce

The SOCS storage software includes Cluster RAID, which stripes data across disk drives within a server node for high performance. What Nutanix calls Heat-Optimized Tiering cache, or HOTcache, caches data in each cluster node on a local SSD and also puts a copy on a different node in the cluster as a backup. SOCS also includes a distributed metadata service, called Medusa, that spreads the metadata around to multiple nodes for performance and fault tolerance reasons. "The secret sauce in all of this is the metadata, and it is globally addressable," says Pandey.
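To make the HOTcache idea concrete, here is a minimal sketch of a write that lands on the local node's SSD cache and on one other node as a backup. It is not Nutanix code: the `Node` class, the `cached_write` function, and the choice of the next node in the list as the backup target are all assumptions made for illustration, since the article does not say how SOCS picks the second node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one cluster node and its local SSD cache."""
    name: str
    ssd_cache: dict = field(default_factory=dict)


def cached_write(nodes, local_index, key, data):
    """Cache data on the local node's SSD and copy it to one other node.

    Using the next node in the list as the backup target is an
    assumption for illustration only.
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to hold a backup copy")
    local = nodes[local_index]
    replica = nodes[(local_index + 1) % len(nodes)]
    local.ssd_cache[key] = data
    replica.ssd_cache[key] = data
    return local.name, replica.name


cluster = [Node("node-a"), Node("node-b"), Node("node-c"), Node("node-d")]
placement = cached_write(cluster, 3, "vm42/block7", b"payload")
print(placement)
```

The point of the second copy is that losing any single node still leaves the cached data readable from another SSD in the cluster.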

The Nutanix Complete Cluster appliance, complete with massive logo

SOCS sports a distributed data maintenance service called Curator that uses MapReduce techniques to figure out what bits of data are being used by what VM when and where, and automatically migrates the coldest data to disks and the hottest data to the Fusion-io and Intel SSDs.
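A rough sketch of how MapReduce-style heat counting could drive tiering follows. The function names, the access-log format, and the "top N blocks go to SSD" rule are all assumptions for illustration; the article does not describe how Curator actually scores data or decides what moves.

```python
from collections import Counter
from functools import reduce


def map_accesses(log_chunk):
    """Map step: count accesses per data block within one chunk of the log."""
    return Counter(block for _vm, block in log_chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def assign_tiers(all_blocks, heat, ssd_slots):
    """Put the most-accessed blocks on SSD and everything else on disk."""
    ranked = sorted(all_blocks, key=lambda b: (-heat[b], b))
    hot = set(ranked[:ssd_slots])
    return {b: ("ssd" if b in hot else "disk") for b in all_blocks}


# Hypothetical access log of (vm, block) pairs, split into chunks
# as if each chunk had been gathered on a different node.
chunks = [
    [("vm1", "blk-a"), ("vm1", "blk-a"), ("vm2", "blk-b")],
    [("vm2", "blk-b"), ("vm1", "blk-a"), ("vm3", "blk-c")],
]
heat = reduce(reduce_counts, map(map_accesses, chunks), Counter())
tiers = assign_tiers(["blk-a", "blk-b", "blk-c", "blk-d"], heat, ssd_slots=2)
print(tiers)
```

Here `blk-a` and `blk-b` end up on SSD, while the rarely touched `blk-c` and the never-touched `blk-d` go to disk.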

Curator also rebalances data when nodes are added to the Nutanix cluster and moves data along with a VM when it is live-migrated, thus keeping data used by a VM as close to it as possible. SOCS includes snapshotting features like a real SAN, plus filers (called QuickClone) as well as thin provisioning and converged backup, which makes backups of files onto the file system and allows them to be pushed out to external online backup services.

Name your hypervisor – eventually

Each Nutanix cloud server node has two Xeon 5600 processors, one 320GB Fusion-io flash disk for metadata and data that's plugged into a PCI Express 2.0 peripheral slot, one 300GB Intel SSD housing system software that's slid into a SATA disk bay, and five 1TB Seagate 2.5-inch SATA drives for customer data.

Each node comes with a base 48GB of DDR3 main memory, which can be expanded to 192GB as workloads dictate. At the moment, early customers are using 10GE switches from Arista Networks and Super Micro to link Nutanix appliances together, but any 10GE switch should work.

Exploded view of a Nutanix cloud appliance

A four-node block with eight processors and 48 cores that can be allocated to VMs, 192GB of memory (expandable to 768GB), 1.25TB of user-accessible SSD storage, and 20TB of disk capacity costs $115,000. Too much? Well, there's also a starter kit with three server nodes in the Nutanix cloud appliance, at a slightly discounted price of $75,000.

If you need more oomph, a full rack of Nutanix appliances – 18 blocks and 72 server nodes totaling 576 cores, 3.4TB of main memory, 18TB of SSD capacity, and 360TB of disk capacity – will run you just over $2m.
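The rack figures can be checked against the per-node specs with a few lines of arithmetic. This is a sketch only: the constant and function names are made up, the eight VM cores per node come from the core split described earlier, and multiplying the block price by 18 assumes there is no volume discount on a full rack.

```python
NODES_PER_BLOCK = 4
BLOCKS_PER_RACK = 18
VM_CORES_PER_NODE = 8          # of 12 cores per node; the other four run SOCS
BASE_MEMORY_GB_PER_NODE = 48
SATA_DRIVES_PER_NODE = 5
SATA_DRIVE_TB = 1
BLOCK_PRICE_USD = 115_000      # assumption: no discount when buying 18 blocks


def rack_totals(blocks=BLOCKS_PER_RACK):
    """Scale the per-node figures up to a given number of four-node blocks."""
    nodes = blocks * NODES_PER_BLOCK
    return {
        "nodes": nodes,
        "vm_cores": nodes * VM_CORES_PER_NODE,
        "memory_gb": nodes * BASE_MEMORY_GB_PER_NODE,
        "disk_tb": nodes * SATA_DRIVES_PER_NODE * SATA_DRIVE_TB,
        "list_price_usd": blocks * BLOCK_PRICE_USD,
    }


rack = rack_totals()
print(rack)
```

For 18 blocks this gives 72 nodes, 576 VM cores, 3,456GB of memory (about 3.4TB), 360TB of disk, and $2,070,000, which lines up with the "just over $2m" figure.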

That may sound a bit pricey, but Pandey points out that compared to a rack of servers and external SANs, this setup costs 40 to 60 per cent less and delivers somewhere around ten times the bang for the buck because of the flash tiering and other SOCS goodies.

The Nutanix Complete Cluster is available now. Pandey says that the company will listen to customers about whether it should next support Microsoft's Hyper-V or Red Hat's KVM hypervisor, but Nutanix will eventually support both, as well as ESXi from VMware. Xen will also no doubt eventually be supported – if customers ask for it. ®
