Original URL: http://www.theregister.co.uk/2013/12/20/nutanix_using_mapreduce_to_improve_cluster_performance/

It's a CLUSTER-PLUCK: Nutanix uses MapReduce to polish performance

Do-it-all boxen are not components, they're end-to-end solutions. Hear that, channel?

By Chris Mellor

Posted in Storage, 20th December 2013 14:58 GMT

Analysis Converged server-system startup Nutanix, which sells all of its products through the channel, says hypervisors are the virtual sheet metal of the commodity server game. You won't need separate storage arrays; instead you can get a storage pool across many servers by virtualising their directly attached storage (DAS).

Nutanix is led by CEO Dheeraj Pandey, who says business is growing and the firm is adding staffers. Where there was just one team in Europe 18 months ago, there are now 15. The firm is adding 25 folks every month and is up to 400 employees. Pandey says his firm's products run the gamut from compute-heavy to storage-heavy nodes.

Nutanix's software sits above the hypervisor and is agnostic, working with VMware, Hyper-V and KVM. The firm's pitch is that while VMware's VSAN causes hypervisor lock-in, Nutanix's virtual SAN does not. Pandey says Nutanix's virtual storage is enterprise class, whereas competing products cannot replace NetApp FAS arrays or EMC's VNX/VMAX, or even HP's 3PAR range.

Nutanix and MapReduce

Pandey says customers should do what Google and Facebook are doing: disaggregate the data centre storage hub, collapse the data centre and do without storage arrays. To do this you have to be able to scale the servers in your virtual storage pool, and that in turn means you must be able to scale metadata storage. Nutanix makes every server a metadata server in its architecture - in a 100-node cluster there are 100 metadata servers.
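To make the "every server is a metadata server" idea concrete, here is a minimal sketch of one common way to spread metadata ownership across all nodes: a consistent-hash ring. This is an illustration only; the article does not say how Nutanix actually places metadata, and the class name, key format and node names below are all hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a stable integer position on the hash ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class MetadataRing:
    """Toy consistent-hash ring: every node owns a slice of the metadata keys.

    Hypothetical illustration, not Nutanix's actual design.
    """

    def __init__(self, nodes, vnodes=64):
        # Each node is placed on the ring several times ("virtual nodes")
        # so that ownership is spread reasonably evenly.
        self._ring = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def owner(self, key: str) -> str:
        """Return the node responsible for the metadata of `key`."""
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]


# A 100-node cluster: every node ends up serving some share of the metadata.
nodes = [f"node-{n:03d}" for n in range(100)]
ring = MetadataRing(nodes)
owners = {
    ring.owner(f"vdisk-{i}/extent-{j}") for i in range(200) for j in range(50)
}
print(f"{len(owners)} of {len(nodes)} nodes own at least one metadata key")
```

The point of the design is that adding a node adds metadata capacity as well as storage capacity, so no single metadata server becomes the scaling bottleneck.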

MapReduce technology is used to find the metadata needed for storage transactions: "We use MapReduce for everything; adding a new node, recovering a failed drive, doing information lifecycle management across flash and spindles, out-of-line compression and deduplication."
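As a rough picture of what "MapReduce for recovering a failed drive" could look like, the sketch below scans metadata records in a map phase, groups them by extent, and counts surviving replicas in a reduce phase to find data that needs re-replicating. It is a single-process toy under stated assumptions: the record layout, the replication factor of 2 and all names are hypothetical, not taken from Nutanix's software.

```python
from collections import defaultdict

REPLICATION_FACTOR = 2  # assumed value, for illustration only


def map_phase(record, failed_drives):
    """Emit (extent_id, 1) for every replica that is still on a live drive."""
    extent_id, drives = record
    # Emit a zero first so extents with no surviving replica still reach the reducer.
    yield extent_id, 0
    for drive in drives:
        if drive not in failed_drives:
            yield extent_id, 1


def shuffle(pairs):
    """Group mapped values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(extent_id, counts):
    """Sum the live-replica counts for one extent."""
    return extent_id, sum(counts)


def under_replicated(records, failed_drives):
    """Return the extents whose live replica count is below the target."""
    pairs = (kv for rec in records for kv in map_phase(rec, failed_drives))
    results = (reduce_phase(k, v) for k, v in shuffle(pairs).items())
    return sorted(eid for eid, live in results if live < REPLICATION_FACTOR)


# Hypothetical metadata: extent id -> drives holding a replica.
metadata = [
    ("extent-a", ["n1:d0", "n2:d1"]),
    ("extent-b", ["n2:d1", "n3:d0"]),
    ("extent-c", ["n1:d0", "n3:d0"]),
]
print(under_replicated(metadata, failed_drives={"n2:d1"}))
# → ['extent-a', 'extent-b']
```

In a real cluster the map tasks would run on the nodes that hold each slice of metadata, which is why a scan like this scales with the number of nodes.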

Nutanix is a big data app, he says, that exposes NFS, SMB, iSCSI and so forth for legacy apps: "Now that we have built a storage fabric, we're focused on building an analytic fabric for the data centre... We're building a Google-like structure in the data centre."

This is a fabric to analyse a data centre's own operations and make them more efficient; it's inward-looking and uses machine-generated data from the data centre's own devices.

As snug as babies in COTS

Pandey is a profound believer in the virtues of COTS (commercial, off-the-shelf) hardware - and thinks suppliers moving away from server COTS, like HP with its Moonshot server initiative, are doing so to avoid paying what he terms a "VMware tax".

Moonshot is, he says, "basically the antithesis of virtualisation. You don't need VMware if you're using very low-cost hardware," made from, for example, ARM and Atom CPUs. But ARM is barely 64-bit today and for HP it is a true moonshot, being up to five years out. But: "The whole focus is to reduce the influence of VMware," through using low-powered servers (LPS).

He's not a believer: "Conceptually I don't believe LPS stands a chance; it's hardware instead of software."

And, of course, if HP's Moonshot is successful, and Pandey's supposed consequence of hypervisors falling away comes to pass, that leaves Nutanix, with its reliance on hypervised COTS hardware, in an unattractive place.

Nutanix competition

Simplivity competes with Nutanix in developing and selling converged server/storage/networking boxes. Pandey's very interested in what's under the wrappings: "Do they have a clustered file system? They've been selling for a year and no one knows if they have a clustered file system."

What about an IPO?

"To us a $1.5bn IPO is not interesting. It's a good start but we are not interested."

He respects Nimble, two years older than Nutanix, and its successful IPO: "Nimble is very good people," he says. Pandey claims there are three storage start-up companies that have broken away from the pack: Nimble Storage, Pure Storage and Nutanix.

Legacy companies like Cisco, Dell, HP and IBM with their legacy software won't be able to catch up, he opines, because Nutanix has ground-up designed software, has re-invented server DAS as storage, and runs its software on COTS.

Nutanix and analytics

Ideally a virtual machine (VM) should know where its data lives - in flash, on disk or in the cloud. With software and a control path, applications can know more about data placement. So record the access patterns, hits and misses on spindles, and correlate them in real time with the performance of virtual switches, VMs, virtual disks, physical disks and so on.
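The kind of correlation described above can be sketched in a few lines: for each VM, compare how often reads fall through to spindles with the latency observed over the same interval. The sample data, field layout and function names below are invented for illustration, and a plain Pearson coefficient stands in for whatever analysis Nutanix actually ships.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def miss_latency_correlation(samples):
    """Per VM, correlate the spindle-read ratio with average latency."""
    per_vm = defaultdict(lambda: ([], []))
    for vm, flash_hits, spindle_reads, latency_ms in samples:
        ratios, latencies = per_vm[vm]
        ratios.append(spindle_reads / (flash_hits + spindle_reads))
        latencies.append(latency_ms)
    return {vm: pearson(r, lat) for vm, (r, lat) in per_vm.items()}


# Hypothetical per-interval samples:
# (vm, reads served from flash, reads served from spindles, avg latency in ms)
samples = [
    ("vm-1", 90, 10, 1.2),
    ("vm-1", 60, 40, 3.1),
    ("vm-1", 30, 70, 5.0),
    ("vm-2", 95, 5, 1.0),
    ("vm-2", 80, 20, 2.0),
    ("vm-2", 50, 50, 4.0),
]
correlations = miss_latency_correlation(samples)
for vm, r in sorted(correlations.items()):
    print(f"{vm}: spindle-read ratio vs latency r = {r:.2f}")
```

A coefficient near 1 for a VM suggests its latency is driven by where its data lives, which is the signal a placement engine would need to move that VM's hot data into flash.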

It has to go beyond the LUN: the focus is on the VM - shades of Tintri here - and its compute, storage and networking history and performance.

It's analysing the IT infrastructure itself, using machine-generated data. How is it made available to data centre admin people? Pandey thinks SMBs could share such data in the cloud whereas enterprises will prefer to keep it on-premise.

Nutanix will deliver the first iteration of its data centre infrastructure-focused big data analytics in the February/March 2014 timeframe with v4.0 of its product. It will be decoupled from the Nutanix cluster and run as a bunch of VMs that will analyse the Nutanix cluster. "Over time we'll plug in outside the data source, and run load-balancers, etc."

Pandey mentions the idea of self-healing and says software-defined storage will be more usable if it's easier to consume.

He says Nutanix sees itself as an end-to-end solutions company and not a component supplier.

The takeaway here is that Nutanix will use big data analytics techniques to make its clustered converged server/storage nodes run better, more efficiently, than any other COTS server/storage suppliers' kit and, we suspect, provide a better data centre environment in a total cost of ownership sense than the legacy vendors. ®