VMware vSphere Enterprise Plus: An El Reg deep dive
Trevor Pott feels the big business end of virtual machine giant
Review Given the plethora of virtualisation kit on the market, VMware customers – and potential customers – just want an answer to this very simple question: are VMware's offerings worth the money? A truthful response is fantastically complicated.
VMware has many levels of offerings; what's more, there are a lot of different companies out there, each with their own circumstances. In part one of my VMware review, I am going to focus on VMware's enterprise suite of offerings.
Out of the box: the initial install
VMware's vCenter Server comes in two flavours: a set of applications for Windows and a Linux appliance. Not to put too fine a point on it, but the installer for Windows is broken. If you have ever wasted a day cursing and trying to install the complete suite of Microsoft's System Center applications (or Office Communications Server, back in the day) then you have some understanding of how fragile this installer is. Read the best practices guide and do not deviate.
If you install fresh on Windows, the installer is likely to melt when it tries to install profile driven storage (fixes are listed here under "known issues"). If you are upgrading, be prepared for certificate issues (see this discussion; it contains fixes). vCenter Server really doesn't like Server 2012, so for sanity's sake stick to Server 2008 R2.
The vCenter Linux appliance doesn't contain all the bits that are part of the vSphere package. The critical bit that's missing is vSphere Update Manager. Orchestrator and Operations have their own appliances, but the Update Manager still needs to be installed on Windows. This is a shame as getting Update Manager installed and working properly has proven to be an incredible pain. I ended up nuking my install and starting over, slavishly following the best practices guide and poring over PDFs before I got it working.
While the Linux vCenter appliance is far easier to get up and running than the Windows-based version, I had the devil's own time getting the vSphere appliance to join to the domain. A round of faffing about turned up the answer: the domain name handed out by the DHCP server was not the same as the domain I was trying to join; dhcp.company.local instead of infrastructure.company.local.
The fix was simple: enter network configuration on the appliance and choose "set a static IP"; this allows you to manually set the hostname. Fix that and Bob's your uncle. (This is an old bug in Likewise; I'm amazed it's still an issue in the vSphere appliance.) Once that issue had been overcome, the vSphere appliance worked quite well. VMware has promised to look into the issue.
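The mismatch described above is easy to check for before attempting a domain join. What follows is a minimal sketch rather than anything VMware ships: the function names and the appliance hostname vcsa01 are made up for illustration, and only the two domain names come from the review.

```python
def domain_of(fqdn: str) -> str:
    """Return everything after the first label of a fully qualified name."""
    _host, _, domain = fqdn.strip().rstrip(".").lower().partition(".")
    return domain


def join_mismatch(appliance_fqdn: str, target_domain: str) -> bool:
    """True when the appliance's DNS suffix differs from the domain to join."""
    return domain_of(appliance_fqdn) != target_domain.strip().rstrip(".").lower()


# Hypothetical appliance name; the two domains are the ones from the review.
dhcp_name = "vcsa01.dhcp.company.local"
static_name = "vcsa01.infrastructure.company.local"
target = "infrastructure.company.local"

print(join_mismatch(dhcp_name, target))    # suffix handed out by DHCP
print(join_mismatch(static_name, target))  # suffix set by hand
```

The first call reports a mismatch and the second does not, which is the situation before and after setting the hostname manually.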
Once I got the appliance joined, I re-ran the setup wizard and was able to set up Active Directory as my authentication source with no trouble. The Single Sign On server probed my Active Directory topology and found all the domains in it. I only wanted to be able to authenticate against the one domain in the forest, however, and found removing the unwanted options straightforward.
From a performance standpoint, vSphere is an absolute pig. My first attempt saw me install vSphere on an AMD Athlon II X3 400e. I wanted something low power for what I thought should be a low-usage system; I was going to manage 20 sockets' worth of hosts and fewer than 500 VMs. How much power could it take? Before I even got to 50 VMs, doing anything completely flattened the poor Athlon system.
Using such outdated hardware for the initial test is probably not all that fair, but I did want to see how vCenter Server would work in an edge-case upgrade scenario. The answer: better than System Center 2012 on the same hardware, but not by much. For a more realistic test, I tossed the vCenter Server appliance onto a node in my Supermicro FatTwin cluster (with two Xeon 2680s per system). The vCenter Server appliance doesn't even make those systems blink.
The widgets you cannot live without - and why
Once over the initial install hump, the functionality you get from these applications is enormous. vCenter Server is non-optional; this is the widget you need to make all the other widgets work. In vSphere 5.1 it comes with a Single Sign On (SSO) server; this is absolutely critical considering how many applications make up the vSphere stack.
The SSO server can provide you either a centralised authentication system against its own database, or (far more realistically) it can integrate with Active Directory. This gives you a robust and reasonably granular means to lock down administrative permissions within the vSphere suite of applications by tying permissions to extant Windows security groups instead of individual users.
I have a site with 13 hosts to upgrade to vSphere 5.1 next week and I cannot imagine doing this without Update Manager. Half the hosts don't have IPMI, and I am not fond of the idea of jabbing USB sticks into things while standing in a freezing server room. Even for my small deployments, Update Manager is more civilised; it is absolutely essential if you plan to do anything at scale.
vCenter Orchestrator seems pretty niche to me; it lets you create "workflows" which are essentially macros relating to the spawning and configuration of VMs. That's neat – especially if you are doing Things As A Service – but I feel it has a few rounds of evolution to go before it is intuitive enough to replace my extant workflows. The bigger your data centre, the more it is likely to appeal to you.
vCenter Operations is a touchy subject. It's a decent monitoring app; it is filled with intelligence and contains valuable features ranging from chargeback to cost metering to capacity management and prediction. Despite all the awesome features, it is still – at its heart – a monitoring app. Monitoring apps are miserable, finicky, time-consuming things to configure. To really get all your gizmos working with the thing can be the stuff entire careers are built on.
Operations is a good product. It's even worth considering as a replacement for your existing software if that time has come. Like Orchestrator, however, it is only going to be the selling point in a minority of cases. It is the core functionality of the vCenter Server itself that most of us really care about and that is what I'll dive into below.
High availability, fault tolerance, data protection and replication
VMware's take on high availability (HA) – if a host dies, the VMs that were running on it are restarted on another host – is something I consider today to be the minimum entry into "proper" business computing. The folks at Stratus and I have had some disagreements about this, but I ultimately trust VMware's HA over most alternatives. The new is phased in with the old; we run heterogeneous systems in the real world and so software HA such as that provided by VMware has become critical.
HA works with every bit of hardware and software I have thrown at it so far, and nothing I oversee is truly so vital that people will die if there are a few hours of downtime for VM reboots over the course of the year. That said, HA isn't without faults. When a host dies, everything in RAM is lost; the VM comes up the same as if it were a physical system that had been unexpectedly rebooted. That's bad for databases.
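To make the restart behaviour concrete, here is a toy model of HA-style failover. It is only a sketch: the host and VM names are invented, and sending each VM to the surviving host with the fewest VMs is an assumption for illustration, not VMware's actual placement logic.

```python
def restart_after_failure(placement, failed_host):
    """Return a new host -> VMs mapping with the failed host's VMs restarted.

    placement maps a host name to the list of VM names running on it.
    Each orphaned VM goes to the surviving host that currently runs the
    fewest VMs (ties broken by host name). That rule is an assumption.
    """
    survivors = {h: list(vms) for h, vms in placement.items() if h != failed_host}
    if not survivors:
        raise RuntimeError("no surviving hosts to restart VMs on")
    for vm in placement.get(failed_host, []):
        target = min(survivors, key=lambda h: (len(survivors[h]), h))
        survivors[target].append(vm)
    return survivors


# Hypothetical three-host cluster.
cluster = {"esx1": ["db", "web1"], "esx2": ["web2"], "esx3": []}
print(restart_after_failure(cluster, "esx1"))
```

Nothing in the model carries over the VM's memory, which mirrors the point above: the VMs come back, but only as freshly booted machines.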
For those needing a little bit more stability, VMware offers fault tolerance (FT). Where HA simply boots the crashed VM back up on a new host, FT is a "continuously available" technology. Essentially, two identical VMs are created on different hosts and are kept in RAM/storage lock-step.
The last time I really got into this was with vSphere 4.0. FT back then was finicky regarding the combination of OS and hardware, and ultimately it was kind of crap. vSphere 5.1 seems to have solved this; I haven't been able to break it yet. The downside to vSphere FT – and it's a doozy – is that any FT VM is limited to a single core. Ouch. Mind you, I still can't do native FT on Hyper-V at all yet, so a bonus point for VMware here. (Though Stratus has a widget for that.)
There's also vSphere replication to be considered; designed as a WAN technology, this replicates a VM from site A to site B with no more than 15 minutes of lag. Replication allows me to back up VMs that don't change much: typically web servers or VDI master images. These are the kind of things that connect to a centralised file store or database anyway; for these workloads, replication and snapshots work just great as the lazy man's offsite backup. While databases need to be kept in lock-step, that is typically done at the application level, not the hypervisor.
For real backups, VMware offers vSphere Data Protection (VDP); it's EMC's Avamar without the blue crystals. It works quite well; the deduplication levels are amazing. If you want application-level integration for SQL and Exchange and so forth, you'll need the advanced version.
All in all, vSphere 5.1 seems to offer functional availability and backup technologies for just about every scenario you can imagine. What's more, this is all standard fare now; if all you are buying is vSphere Standard, you can still do all of the above. If time is money to your business – and it is for most – then the argument for vSphere Standard has just been made.
There are, however, limits to how far you should be trusting in the "hypervisor on commodity hardware" approach to the world. If lives (or lots and lots of money) depend on the availability of your network, then it really is time to talk to Stratus. They can overcome the FT single-core limit and they work closely with VMware to do more than VMware can do with the hypervisor alone. VMware will run everything from a technology website to an oil and mining operation just fine. Your 911 call centre or millisecond-sensitive stock trading system still needs Stratus' special sauce.
Hand over the network switch's keys to software robots
Software-defined networking is the feature bump of the now. Having spent a decade building a software suite that renders a lot of operations management software – not to mention systems administrators – moot, VMware is leading the charge to commoditise enterprise networking. VMware's vision is that of a software-defined data centre; this is territory I've covered before and don't see a particular need to rehash.
Even if you don't buy into VMware's grand vision of the future, however, network IO control and distributed switches are things that become indispensable at scale. When your network gets large enough, you start actually using the features on your switches. VLANs are everywhere today, but rate limiting, 802.1p tagging, and teaming/link aggregation all start to be considerations as soon as your network has even one oversubscribed link.
vSphere's network IO control allows you to configure this per VM while the distributed switch ensures that your settings move from host to host with the VM. It's an interesting toy at the scale of my lab. It's useful when I start working with 25 hosts. Working without it would be maddening long before I hit 50 hosts. The networking features make Enterprise Plus worth considering for midsize organisations, and essential for large enterprises.
Storage - appliance, APIs, distributed resource scheduler (DRS) and profile-driven storage
If reshaping the networking landscape is what is going to occupy us all for the next few years, innovations in simplifying and amplifying storage are at the core of the next major wave of changes in IT. VMware and storage have a complicated relationship; majority ownership by big daddy EMC means that VMware walks a fine line between building bleeding edge functionality into their product and invalidating the business model of their parent company.
This has had mixed consequences. An example of how this has worked out well is the VDP backup software discussed above. VDP is EMC technology, and good tech at that. The other side of this particular coin is the vSphere Storage Appliance (VSA).
Fundamentally, it is a truly excellent piece of software. It takes a bunch of local storage (significantly cheaper than centralised stuff) and lashes it together into a distributed storage array that behaves like centralised storage. It creates a local RAID within each node and it mirrors that information to another node within the cluster; great stuff.
Unfortunately, it only supports clusters of three nodes (though you can have as many storage appliance clusters as you want). While this is peachy for small deployments, this is nowhere near the kind of scalability we need in order to challenge the high cost of traditional SANs. VMware has been discussing a grown-up version of this (dubbed vSAN); however, we've only seen a product demonstration so far, and it may never be a real product.
Except in rare circumstances – such as the desire or requirement to punt a small pod of three servers into a branch location somewhere – the Storage Appliance is functionally useless to enterprises. Enterprises are already likely heavily invested in SANs, so that isn't a big deal. For mid-sized companies, it is a frustratingly tantalising technology. One of the most frequently expressed opinions I have heard from this group is the desire to see the storage appliance capable of scaling as their business grows; they have no interest in SANs.
This isn't to say that VMware isn't innovating in the storage arena; the Storage API can do wonders. Profile Driven Storage (PDS) alone is worth the cost of Enterprise. In a nutshell, PDS allows you to connect up all sorts of different storage and designate the data stores to different tiers. Your VMs can be assigned to these tiers and they will thus only be moved to storage of the quality and speed for which they are optimally designed. Storage DRS is the widget that automates all of that.
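The tiering idea can be sketched in a few lines. This is a simplified model under stated assumptions, not the Storage API or Storage DRS itself: the Datastore class, the tier labels and the "most free space wins" rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    name: str
    tier: str      # hypothetical tier label, e.g. "gold" or "bronze"
    free_gb: int


def eligible(datastores, tier, needed_gb):
    """Datastores in the VM's tier that have room for it."""
    return [d for d in datastores if d.tier == tier and d.free_gb >= needed_gb]


def place(datastores, tier, needed_gb):
    """Pick the eligible datastore with the most free space, or None.

    Choosing by free space is an assumption made for this sketch.
    """
    candidates = eligible(datastores, tier, needed_gb)
    if not candidates:
        return None
    return max(candidates, key=lambda d: d.free_gb)


stores = [
    Datastore("ssd-01", "gold", 400),
    Datastore("ssd-02", "gold", 900),
    Datastore("sata-01", "bronze", 5000),
]

choice = place(stores, "gold", 500)
print(choice.name if choice else "no compliant datastore")
```

The big SATA datastore is never considered for a gold-tier VM, however much free space it has, which is the whole point of assigning VMs to tiers.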
So you bought a couple of nodes last month - what happens when you order hundreds?
vMotion, Storage vMotion, DRS, distributed power management (DPM), host profiles and auto deploy are what I call "coalface tech". They are not technologies that reduce capital expenditure (capex) or provide a layer of risk management. They are the operational (opex) features that make the lives of systems administrators better. Selling opex to the pointy-haired boss is traditionally much harder than selling capex.
As the widgets that let us move VMs from one system to another without turning them off, vMotion and Storage vMotion should sell themselves. Hardware dies. Sometimes you have to upgrade a system to get better performance. In both cases the ease of vMotion saves an awful lot of systems administrator time. Your time – and the cost of system downtime – is worth more than the licences.
Indeed, I'd go so far as to say it justifies virtualisation all on its own, without any of the consolidation arguments that have been used for the past decade. Then again, I'm a sysadmin; I naturally favour opex arguments over capex ones.
DPM takes a chore off our hands: moving VMs onto the smallest number of hosts required and powering off the unneeded hardware. DPM could be sold as capex – lower power bills – but in reality, we'd have just written scripts to do what it does anyway. DRS is the automated load balancer; balancing by hand is a chore that VMware admins not blessed with Enterprise licences rapidly tire of. Even with the small networks I run, load balancing takes up far too much of my time; at scale, the opex savings could prove quite significant.
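The sort of script a sysadmin might write in place of DPM looks roughly like the sketch below. It uses first-fit decreasing, a generic bin-packing heuristic chosen for illustration; it is not a description of how DPM or DRS decide anything, and the load figures and capacity units are invented.

```python
def consolidate(vm_loads, host_capacity, host_count):
    """Pack VM loads onto as few hosts as first-fit decreasing manages.

    Returns one list of loads per host that must stay powered on.
    Loads and capacity share one arbitrary unit (say, percent of a host).
    """
    hosts = []  # each entry is the list of VM loads placed on one host
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError(f"VM load {load} exceeds host capacity")
        for placed in hosts:
            if sum(placed) + load <= host_capacity:
                placed.append(load)
                break
        else:
            # No running host has room, so another host has to stay on.
            if len(hosts) == host_count:
                raise ValueError("not enough hosts for this load")
            hosts.append([load])
    return hosts


plan = consolidate([30, 20, 10, 25, 15], host_capacity=60, host_count=4)
print(len(plan), "hosts stay on;", 4 - len(plan), "can be powered off")
```

With these made-up numbers, two of the four hosts carry everything and the other two are candidates for powering down.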
Host profiles and auto deploy disconnect the rack monkey from the hypervisor. Slap the first host onto your network and configure it the way you want. Save that configuration as a host profile; repeat for every host type you have. Set up auto deploy and not only will vSphere install the hypervisor on each new server it discovers, it will install the correct configuration as determined by your pre-set host profile. This isn't worth the cost of the licence if you only buy a host or two a year; it becomes mandatory if you start buying them by the hundreds.
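Conceptually, the matching step is a lookup from host type to saved profile. The sketch below models only that idea: the rule format, the attribute names and the profile names are hypothetical and do not reflect Auto Deploy's real rule syntax.

```python
# Ordered rules: the first pattern whose attributes all match wins.
# Vendor, model and profile names are made up for illustration.
RULES = [
    ({"vendor": "Supermicro", "model": "FatTwin"}, "fattwin-profile"),
    ({"vendor": "Supermicro"}, "generic-supermicro-profile"),
]


def profile_for(host, rules, default=None):
    """Return the host profile name for a newly discovered host."""
    for pattern, profile in rules:
        if all(host.get(key) == value for key, value in pattern.items()):
            return profile
    return default


new_host = {"vendor": "Supermicro", "model": "FatTwin", "mac": "00:25:90:aa:bb:cc"}
print(profile_for(new_host, RULES))
print(profile_for({"vendor": "Other"}, RULES, default="needs-manual-setup"))
```

Putting the most specific rule first means a new host type falls through to a sensible default until someone saves a profile for it.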
Is vSphere 5.1 worth upgrading to?
If VMware holds true to form, there will be another new version of VMware launched at VMworld in August. There is intense pressure from Microsoft's "we finally don't suck" release of System Center 2012 SP1 at the high end combined with "by the way, it does everything (from PowerShell) and it's free" Hyper-V Server at the low end.
I can't see 2013 ending without a major version release from VMware. If VMware doesn't do a big update this year they are in for a world of hurt. You know this, I know this; we have to assume the brass hats at VMware know this. So why not hold off until August and get the shiny new thing?
If you have vSphere 5.0, then this is the correct path for you. The biggest upgrade 5.1 offered over 5.0 was banishment of the accursed vTax. If you already bought into 5.0, you have already paid it, assuming it affected you at all. The rest of the feature upgrades are incremental; don't waste your political capital fighting the upgrade battle today. Save it for Q4 and buy the next version of the vSuite (after someone else has walked through the minefield and found all the bugs for you).
If, however, you have 4.0 or 4.1, there are good arguments to be made for jumping on 5.1 now. The biggest being the "known quantity" factor. 5.0 and 5.1 didn't differ overmuch, so you have one and a half years of companies marching resolutely through the minefield ahead of you. The feature bump is a nice step up and if you want to play with Windows 8 or Server 2012 at all, you need to take the plunge.
If you are using ESX 3 or 3.5, then you are so far behind at this point that the game changes yet again. There are good arguments for upgrading now, but if you're that far behind then you are obviously part of an organisation that doesn't upgrade all that often.
I've done an upgrade from 3.5 to 5.1, and it's almost as much trouble as moving to Hyper-V would be. You can follow an upgrade route, but given the feature delta, why would you? You'll end up completely redesigning your infrastructure to take advantage of the new features anyways. This gives you some negotiating leverage. Even if you can't get VMware to drop the sticker price with your Microsoft broadsword of +1 to defection, try to at least get them to throw in some training. The features in 5.1 are awesome and you'll want to be trained up to take advantage of them.
If you are already using a competing product, does it make sense to go VMware? That's territory I really can't help you with. The economics of that have a lot more to do with the level of your extant investment and exactly how much of a break on licensing VMware is willing to give in order to get you to enter their ecosystem. All I can say is that right now the political and economic climate is such that there's never been a better time to try.
vSphere 5.1 is an evolutionary step over 5.0, not a revolutionary one. It is, however, something of a marvel of technology. Put all the bits together and you have an x86 mainframe on commodity hardware with added buzzwords and a way lower price tag. We've come a long way in ten years. The hypervisor itself is a commodity. With Microsoft finally catching up – and OpenStack/CloudStack not far behind – the basic management tools are a year or two away from that level as well.
Today, VMware's value proposition is not so much the raw technology of their products, but how neatly they are stitching it together. This is a battle being fought head-to-head with the king of integration: Microsoft itself. Right now, VMware have the upper hand. For VMware to survive, they are going to have to keep pushing the boundaries of automation, convenience and – ultimately – self-cannibalisation. If they don't, Microsoft will gladly do it for them.
None of us have the remotest clue how this battle is going to shake out in the long run. What is or is not released at VMworld 2013 (and the quality of that product) will determine the data centre pecking order for the rest of this decade. We can't know how that will play out and it is pointless to let such hypotheticals stall our decision-making process.
What is on the table today is good. It is time-tested and battle-hardened. It has a clearly definable value that even the densest of pointy-haired bosses should be able to grasp. For all my trials and tribulations with this software, months of dedicated effort on my part have seen me unable to break vSphere 5.1 in any meaningful way. I'm willing to bet my company on it. What about you? ®