Nehalem aces OLTP test on HP iron
Shames Shanghai, Dunnington
Yes, Nehalem is fast.
If you want to see the dramatic effect that Intel's move to the QuickPath Interconnect and integrated memory controllers has had on performance with its "Nehalem EP" Xeon 5500 processors, take a gander at the first results on the TPC-C online transaction processing benchmark test.
Hewlett-Packard's two-socket ProLiant DL370 G6 server is the first Nehalem EP box to be put through the TPC-C paces, and it blows away more expensive four-socket ProLiant machines using Advanced Micro Devices' "Shanghai" quad-core Opterons and even gives four-socket machines using Intel's own "Dunnington" six-core Xeon 7400s a run for their money.
To say that the old Xeon DP and Xeon MP processors and their frontside bus architecture were memory constrained is an understatement. And the Nehalem EPs not only put a whole box of nails in the coffin for two-socket Xeon DP servers, but they are sizing up the Dunnington Xeons for their coffin, even before the octo-core "Nehalem EX" chips for four-socket and larger servers arrive either later this year or early next year. For a lot of workloads, a two-socket Nehalem EP does the same work as a Dunnington server, which costs a lot more money.
On the TPC-C test run by HP, the ProLiant DL370 G6 was configured with two quad-core Xeon X5570 processors running at 2.93 GHz. That's a total of eight cores, but equally importantly, that's 16 processor threads for running applications. The DL370 was set up with the maximum memory possible on the box, which is 144 GB using 8 GB DDR3 DIMMs. Among other things, the TPC-C test measures how much disk I/O is needed to saturate the processors running a mix of OLTP applications that simulate the operations of a warehouse (a real one, with forklifts, not a data warehouse).
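For anyone who wants to check the sums, here is a quick back-of-the-envelope sketch of that configuration. It assumes two threads per core (Hyper-Threading switched on, which is what the 16-thread figure implies) and that all 144 GB comes from 8 GB DIMMs; the variable names are purely illustrative.

```python
# Configuration figures quoted in the article.
sockets = 2
cores_per_socket = 4   # quad-core Xeon X5570
threads_per_core = 2   # assumption: Hyper-Threading enabled
memory_gb = 144
dimm_gb = 8

cores = sockets * cores_per_socket
threads = cores * threads_per_core
dimms = memory_gb // dimm_gb   # number of 8 GB DIMMs needed to reach 144 GB

print(f"{cores} cores, {threads} threads, {dimms} DIMMs")
```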
The test measures how many new orders the warehouse can process as a bunch of other transactions are being run at the same time. With the large amount of memory on two-socket servers (at least compared to when the TPC-C test came out in 1993), it takes a lot of disk drives to saturate the system as transactions are running. In the case of the ProLiant DL370 tested by HP, the server was equipped with four SAS RAID disk controllers inside the server chassis, each with six disks to store log and OS image data. On top of that, HP added forty MSA70 disk enclosures (with 25 36 GB, 15K RPM drives each) and nine MSA2324fc Fibre Channel arrays (with 23 of the same drives each), plus a few more drives thrown in for good measure, for a total of 1,210 disks and 60 TB of disk capacity.
It took eight of HP's DL360 G5 servers to simulate the 500,000 end users driving the system. The box ran Oracle Enterprise Linux (Oracle's clone of Red Hat's Enterprise Linux 5) to cut costs, as well as Oracle's 11g Standard Edition One database, and was able to process 631,766 TPC-C transactions per minute (TPM). The hardware in the setup, which was mostly the disk storage, cost $666,040, and three years of maintenance on the system cost $69,910. The software cost a mere $5,800, plus $10,497 for maintenance. Adding in the client hardware and software (which you have to do as part of the TPC-C test) pushed the price tag on this two-socket system to $802,683, but a 15.5 per cent discount brought the cost down to $1.08 per TPM.
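The price/performance arithmetic can be sketched in a few lines. The client-side cost is not itemised above, so it is derived here by subtraction, and the discount is taken as exactly 15.5 per cent, which is an assumption since the quoted figure is presumably rounded. With that rounded discount the sum lands at roughly $1.07 per TPM, a cent shy of the $1.08 reported.

```python
# Dollar figures and throughput as quoted in the article.
server_hardware = 666_040
hardware_maintenance = 69_910
software = 5_800
software_maintenance = 10_497
list_total = 802_683   # includes client hardware and software
discount = 0.155       # assumption: the quoted 15.5 per cent is a rounded figure
tpm = 631_766          # TPC-C transactions per minute

# Client hardware and software are not broken out, so infer them.
itemised = (server_hardware + hardware_maintenance
            + software + software_maintenance)
client_cost = list_total - itemised

discounted_total = list_total * (1 - discount)
price_per_tpm = discounted_total / tpm

print(f"client hardware and software: ${client_cost:,}")
print(f"discounted total: ${discounted_total:,.0f}")
print(f"price/performance: ${price_per_tpm:.2f} per TPM")
```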