Original URL: http://www.theregister.co.uk/2013/03/01/ibm_tpc_c_kvm_benchmark_test/

IBM runs OLTP benchmark atop KVM hypervisor

To heck with that TPC-VMS server virtualization spec

By Timothy Prickett Morgan

Posted in Servers, 1st March 2013 09:13 GMT

IBM has performed benchmark tests that provide some clarity on how transaction processing will perform in real-world virtualized environments – today's real world, that is.

One of the things that El Reg complained about when the Transaction Processing Performance Council (TPC) trotted out its TPC-VMS server virtualization benchmark specification late last year was that it did not do what end users really needed: show the overhead that server virtualization puts on transaction processing workloads compared to bare metal.

IBM, however, is sometimes the sporting type, so it threw caution to the wind and fired up a TPC-C test running atop the KVM hypervisor from Red Hat without jumping through all of the TPC-VMS hoops.

With the KVM setup on a two-socket Xeon E5 server, which you can see here, IBM is not going through any of the TPC-VMS motions, which were deliberately chosen to obfuscate the physical-versus-virtual comparisons that would reveal the virtualization overhead. This kind of comparison is important, particularly for I/O-intensive workloads like transaction processing.

In the early 1990s when the TPC-C online transaction processing benchmark test came out, memory was very expensive and the TPC-C databases had to be stored on disk drives. CPUs were more modestly powered, too, and were much more expensive, and the I/O subsystems in the server were stressed pretty heavily. But today two-socket servers can have multi-terabyte main memories and have lots of I/O capacity, and a big portion of the TPC-C database is basically running in memory.

How much, I do not know. It may not be an SAP HANA in-memory database setup, but with the caching of information inherent in modern databases and operating systems, a modern server running the TPC-C test is closer to a HANA appliance than to an essentially disk-driven database of the early 1990s. The disk drives are there for I/O, of course, but they are also there because the test requires disk capacity to scale with end users.

Tests like TPC-C encouraged multicore chips with lots of memory capacity and bandwidth – and lo and behold, we got them and now think of them as normal. Many think benchmark tests are useless, and perhaps forget this effect on system architecture and software design. I would say that trying to cheat on benchmark tests drives actual innovation, if it is monitored carefully.

Of course, there is nothing quite as good as doing your own tests. It's a pity that more of you don't share your data with rags like El Reg and tell your peers what you did or did not get after you did a system upgrade or replacement. This would be truly fun, probably useful, and a big data project worth doing if only all of the data were available for millions of servers.

On the TPC-C setup that Big Blue tested running atop KVM, the company's performance-anxiety experts took a System x3650 M4 server with two eight-core Xeon E5-2690 processors running at 2.9GHz. To this machine, IBM added 768GB of main memory and four QLogic 8Gb/sec Fibre Channel adapters hooking out to nine DS3524 disk arrays and one DS3512 disk array with a mix of 44 hard disk drives and 72 SSDs with a total capacity of 40.3TB.

This particular box, running Red Hat Enterprise Linux 6.4 and IBM's DB2 9.7 relational database atop Red Hat's KVM hypervisor, was able to support 1,040,400 users and had a throughput of 1,320,082 transactions per minute on processing new orders. (The new-order part of the TPC-C test, which usually burns a little less than half of the CPU, is what gets counted in the official results, even though four other workloads must also be run.)

IBM did not run the TPC-C test on the same exact machine without KVM fired up. That would be too easy. But it did the next best thing: run the test on a Flex x240 node last summer that was, electronically speaking, nearly the same as the System x3650 M4 tested above running KVM.

That Flex x240 server had two Xeon E5-2690s running at the same 2.9GHz speed, the same 768GB of main memory, and two 8Gb/sec Fibre Channel cards linking out to a slew of DS35XX disk arrays. The setup had three DS3512 disk arrays with a mix of 24 hard disk drives and 112 SSDs, and the machine could support 1,196,160 users and drive 1,503,544 new order transactions per minute. It ran RHEL 6.2 and DB2 9.7.

This Flex x240 box was a little heavier on the SSDs and a little lighter on the disk drives than the virtualized System x3650 M4 above. Assuming that disks were not a performance bottleneck on either machine and did not make a substantial difference in their performance (once you have sufficient disk and main memory to drive the CPUs, of course), and assuming that this disk and SSD capacity was attached to the system to hit the capacity limits required by the TPC-C test when driving more users (as you would expect a bare metal machine to do), then you might be tempted to think the higher number of users on this Flex x240 box compared to the System x3650 M4 was due to the fact it was running bare metal. You would also have to assume that the performance differences between RHEL 6.2 and 6.4 were pretty minor, too.

If you believe all that, then the KVM overhead in terms of transactions per minute is on the order of 12 per cent, and if you look at users it is around 13 per cent.
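For those who want to check the sums, here is a minimal sketch of that back-of-the-envelope calculation using the user counts and new-order throughput quoted above. The function and variable names are made up for illustration, and the result is only as good as the assumptions already listed: the two machines are similar but not identical, so this is an estimate and not a measured overhead.

```python
# Figures quoted in the article for the two IBM TPC-C runs.
# Dictionary and key names are illustrative, not from any TPC report.
KVM_X3650_M4 = {"users": 1_040_400, "tpm_new_order": 1_320_082}
BARE_METAL_FLEX_X240 = {"users": 1_196_160, "tpm_new_order": 1_503_544}


def overhead_pct(virtualized: float, bare_metal: float) -> float:
    """Return the shortfall of the virtualized result versus bare metal, in per cent."""
    return (1.0 - virtualized / bare_metal) * 100.0


# Overhead measured by new-order transactions per minute.
tpm_overhead = overhead_pct(
    KVM_X3650_M4["tpm_new_order"], BARE_METAL_FLEX_X240["tpm_new_order"]
)

# Overhead measured by the number of users supported.
user_overhead = overhead_pct(KVM_X3650_M4["users"], BARE_METAL_FLEX_X240["users"])

print(f"Throughput overhead: {tpm_overhead:.1f} per cent")
print(f"User-count overhead: {user_overhead:.1f} per cent")
```

Both figures are expressed relative to the bare metal Flex x240 result, which is why the virtualized number sits in the numerator of the ratio.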

It would be so much more fun if companies like IBM would just test a bare metal machine and then run the same workload on the same exact iron using VMware's ESXi, Red Hat's KVM, and Microsoft's Hyper-V hypervisors. But again, that would be too easy and might help companies make odious comparisons and purchasing decisions.

The TPC-VMS test puts three copies of any of the current TPC tests – the older TPC-C OLTP benchmark, which emulates a warehouse operation (as in forklifts), the TPC-E online stock brokerage benchmark, the TPC-H data warehousing test, and the TPC-DS decision-support/big-data test – on a single instance of a hypervisor on a server with all of the I/O virtualized. Three is, of course, the magic number.

No one has run a TPC-VMS test yet, by the way, and IT vendors might start doing what IBM just did and let people try to see the virtualization overhead. It is not like they won't find out, or that they don't get something for it in return.

Flexibility in configuration and management is something worth a bunch of CPU cycles. If it weren't, VMware would be a has-been and Xen, Hyper-V, and KVM would have never happened. ®