TPC kicks out quick-and-dirty virty server test
Good for calculating hypervisor overhead – maybe
For the past two and a half years, the members of the Transaction Processing Performance Council, the consortium that creates and audits transaction processing and data warehousing benchmarks for systems, have been working on a virtualization benchmark. The new test, called TPC-VMS, has finally made it out of committee and is ready for use.
Back in May 2010, when El Reg talked to members of the TPC who were cooking up what was then being called the TPC-Virtualization test, the thinking was that it would be used to show how databases performed atop server virtualization hypervisors and that the test would be roughly based on the TPC-E distributed benchmark.
The TPC-E test is an online transaction-processing workload that simulates the data processing related to running a web-based stock trading system. It is a bit hairy – and expensive – to run. And somewhat disconcertingly, vendors were jumpy about customers making direct comparisons between TPC-E results on bare-metal database servers and virtualized machines.
This is, of course, all that any virty benchmark is really good for. You want to be able to quantify the overhead that virtualization imposes on database workloads. And the fact that the TPC members wanted to deliberately obfuscate that function shows how tough it can be to get consensus between hardware, operating system, and hypervisor suppliers who are all part of the consortium and who all have to give their consent when their software is used in a test.
As the TPC-Virtualization working group kept mulling over the virtualization benchmark, explains Intel benchmarking guru Wayne Smith, they came to the conclusion that what companies wanted was something a little more quick and dirty.
"Our goal was not to change the existing TPC benchmark test kits, or to have very few changes except where necessary," says Smith.
And so the new TPC-VMS benchmark allows vendors to run any of the four current benchmark tests under load on top of a hypervisor.
Those tests include the TPC-C online transaction processing test, its TPC-E follow-on mentioned above, the TPC-H data warehousing test, and its TPC-DS decision-support/big-data test, which debuted back in May and which does not have any performance data published yet.
With the TPC-VMS test, which is short for "virtual machine single system" – and yes, we know that's two Ss – there are only three rules. First, you have to equip the system under test that is running the database with a hypervisor that is virtualizing the I/O in the system. Second, you have to set up the hypervisor to support three VMs that are identically configured, running the same software stack and TPC test.
The third rule is that a single instance of the hypervisor running the TPC-VMS test has to span the system under test. This is not precisely the same thing as saying one hypervisor per system. TPC is trying to discourage the use of clusters in the test, but clearly the vSMP multi-node hypervisor from ScaleMP could get some action here because technically it is a single hypervisor that spans up to 128 nodes. But for other hypervisors on x86, RISC, Itanium, and proprietary iron, it does come down to one hypervisor per physical system. Parallel Sysplex for IBM mainframes and NonStop for HP Integrities are not virtualized.
There's always some wiggle in the throughput you can get out of any virtual machine, so you report the performance of the VM that has the lowest throughput. By making vendors report the lowest number, says Smith, the TPC-VMS test incentivizes system testers to get the performance of all three partitions on a machine running at the same clip.
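The lowest-of-three reporting rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not anything taken from the TPC-VMS spec; the function name and the throughput figures are hypothetical.

```python
def tpc_vms_reported_throughput(vm_throughputs):
    """Return the figure that gets reported: the lowest per-VM throughput.

    Assumes the rule as described in the article: exactly three VMs,
    each running the same TPC test, with the slowest one reported.
    """
    if len(vm_throughputs) != 3:
        raise ValueError("expected throughput figures for exactly three VMs")
    return min(vm_throughputs)


# Hypothetical per-VM throughput numbers, in whatever unit the chosen
# TPC test uses; they are made up purely for illustration.
sample = [1040.0, 985.5, 1012.3]
print(tpc_vms_reported_throughput(sample))
```

Because only the minimum counts, pushing one VM faster at the expense of another does nothing for the reported number, which is why testers have a reason to keep all three running at the same clip.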
You might be thinking, why three partitions? Aside from three being the Magic Number, there is actually a reason, says Smith. "Two doesn't seem like enough and is an even number. Three is a prime number and we all know from history that prime numbers break things, and this is good. Four is an even number again, and five seems like too many partitions for a database server."
The TPC-VMS test is not limited to x86 architectures and their Hyper-V, ESXi, KVM, and Xen hypervisors; vendors are absolutely encouraged to run whatever servers with whatever hypervisors they like to show off the efficiency of those hypervisors. If PowerVM, Solaris containers, and Integrity VMs are all so good, it will be interesting to see if IBM, Oracle, and HP will want to show them off running databases.
The TPC-VMS test does not require that vendors do a baseline test of the same benchmark of their choosing running on a bare-metal database server, but it should have done that. Obviously. Smith just laughed when El Reg pointed this out, saying nothing else. We added that getting consensus across server, database, and hypervisor makers was probably a lot more difficult than it looked from the outside, and that if you wanted to get the marketeers on board with the test at all, the comparisons couldn't be so blatant. Otherwise, the comparisons would be useful and someone might not look as good as someone else. Smith didn't stop laughing then, either.
VMware has sponsored the VMmark benchmark for a number of years now, and you might be wondering what is wrong with this test. That is a mixed workload benchmark, not one aimed specifically at I/O-intensive database workloads. These heavy-I/O jobs are precisely the last workloads that are being virtualized, and hence these are the workloads that a new test has to cover to show that multiple databases can be run side-by-side on a virtualized machine and give decent performance.
Well, that's the theory, anyway. We'll see about that in practice.
Expect to see the first TPC-VMS results in the coming months. You can see the full spec for the test here. ®