Nehalem aces OLTP test on HP iron
Shames Shanghai, Dunnington
Back in November 2008, HP tested a four-socket DL585 G5 server using AMD's Shanghai Opteron 8384 processors, each of which has four cores running at 2.7 GHz, for a total of 16 cores and 16 threads. Now, this DL585 G5 motherboard has 32 memory slots, but they take only DDR2 main memory, which runs slower than the DDR3 memory used with the Nehalems. This box, with twice as many sockets as the Nehalem machine, nonetheless had just 16 threads to run the database behind the TPC-C test. And even though it had 256 GB of memory and a very peppy HyperTransport interconnect (which Intel's QPI basically copies), AMD doesn't have simultaneous multithreading on the Opterons (which is dumb at this point).
Plus, the higher memory bandwidth of the two-socket Nehalem box allows it to best the four-socket Shanghai machine on the TPC-C test. The Opteron server had 732 disk drives (27.8 TB) and was able to process 579,814 TPM at a cost of 96 cents per TPM. The Opteron machine was running Windows Server 2003 and SQL Server 2005 (both at the R2 Enterprise x64 Edition SP2 level), so this might account for some of the performance difference. (This machine had a 16 per cent discount off list price for the hardware, software, and maintenance.)
HP also tested a DL580 G5 server using Intel's six-core Dunnington chips back in January, running the same Oracle Linux and 11g database setup. The Dunnington box, which used four of the six-core Xeon X7460 processors running at 2.67 GHz for a total of 24 cores, was able to process 639,253 TPM at a cost of 97 cents per TPM. Like the Opterons, the Dunnington chips do not support simultaneous multithreading (what Intel brands HyperThreading and which it smartly put into the Nehalem chips despite the extra transistors it requires), so 24 cores means 24 threads. That Dunnington machine tested by HP had 256 GB of memory, eleven disk controllers, and 1,052 disk drives (43.4 TB of capacity).
Here's the important bit: the DL580 G5 iron is a lot more expensive ($59,740 for the basic server, processors, and memory compared to $22,162 for the Nehalem EP-based DL370 G6 server) and because it is a four-socket box, it has to run the more expensive Oracle 11g Standard Edition (which costs $41,900 on the Dunnington box, compared to $12,700 for Standard Edition One, which is only available on two-socket boxes). The Nehalem EP box has almost as much main memory, as many execution threads, runs cheaper database software, does as much work, and costs about a third as much for the basic system - server, operating system, and database - not including the ridiculous amount of storage it takes to drive the TPC-C test.
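The "about a third" claim can be checked against the prices quoted above. The following is only a rough sketch in Python: it uses the server and database figures from this article and leaves out the operating system, whose price is not itemized here, so the real ratio for the full basic system may differ slightly. The names `dunnington`, `nehalem`, and `base_cost` are made up for illustration.

```python
# Prices quoted in the article, in US dollars.
# The operating system cost is not itemized in the text, so it is omitted
# here (an assumption that slightly simplifies the comparison).
dunnington = {"server": 59_740, "database": 41_900}  # DL580 G5, 11g Standard Edition
nehalem = {"server": 22_162, "database": 12_700}     # DL370 G6, Standard Edition One


def base_cost(system):
    """Sum the itemized components of a basic system."""
    return sum(system.values())


ratio = base_cost(nehalem) / base_cost(dunnington)

print(f"Dunnington base: ${base_cost(dunnington):,}")
print(f"Nehalem EP base: ${base_cost(nehalem):,}")
print(f"Nehalem EP costs {ratio:.0%} of the Dunnington base system")
```

On these two line items alone, the Nehalem EP box comes in at roughly 34 per cent of the Dunnington box, which squares with the "about a third" figure.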
Not that Dunnington machines have no place. IBM's four-socket System x3850 M2 server offers similar performance, at 684,508 TPM using the six-core Dunningtons (running the Windows 2003 and SQL Server 2005 combo), but costs a ridiculous $2.58 per TPM (even after a 34 per cent discount) because IBM charges too much for main memory and disk arrays. However, unlike the other vendors of Dunnington machines, IBM doesn't stop at four sockets, and its System x3950 M2 machine, which basically lashes two x3850s into a single NUMA cluster, can drive 1.2 million TPM at a cost of $1.99 per TPM after discounts. Again, the bulk of the cost of these machines is storage - the big one here has 143.3 TB of capacity across 1,931 disks, and those disks are necessary to drive the I/O behind the database transactions embodied by the TPC-C test, not for capacity.
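Because TPC-C price/performance is the total priced configuration divided by throughput, multiplying the two quoted numbers back together gives a ballpark for what each tested setup cost. This is a back-of-the-envelope sketch in Python, not figures from the benchmark disclosures: the cents-per-TPM values are rounded, and the x3950 M2 throughput is only given as "1.2 million", so the results are approximate. The function name `implied_price` is hypothetical.

```python
# (system, transactions per minute, dollars per TPM), all as quoted in the article.
# The x3950 M2 throughput is the rounded "1.2 million" from the text.
systems = [
    ("HP DL585 G5, Shanghai", 579_814, 0.96),
    ("HP DL580 G5, Dunnington", 639_253, 0.97),
    ("IBM x3850 M2, Dunnington", 684_508, 2.58),
    ("IBM x3950 M2, Dunnington", 1_200_000, 1.99),
]


def implied_price(tpm, dollars_per_tpm):
    """Approximate total priced configuration: throughput times price per TPM."""
    return tpm * dollars_per_tpm


for name, tpm, dollars_per_tpm in systems:
    print(f"{name}: roughly ${implied_price(tpm, dollars_per_tpm):,.0f}")
```

Seen this way, the IBM four-socket box delivers about the same throughput as the HP boxes at close to three times the implied system price.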
The wonder, of course, is why HP and IBM didn't slap solid state storage in their machines and really bring down the disk drive count, and therefore the price. They'll figure this out sooner or later, thanks to the economic meltdown.
As for Xeon MP machines, don't expect a lot of traction until Intel delivers the Nehalem EX boxes. And the Nehalem EX machines are going to have to do a lot better than nine DIMM slots per socket, or a maximum of 288 GB of main memory, to impress a lot of data centers. There is some confusion as to whether the Nehalem EX machines will support FB-DIMM or DDR3 main memory, and how many channels will come out of the sockets. With 32 cores and 64 threads in a four-socket image, main memory really needs to be something closer to 576 GB on the Nehalem EX machines. That's the same 18 GB per core that the Nehalem EP gets in big configurations. ®
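The memory arithmetic behind that argument is simple enough to sketch in Python. The socket, core, and capacity figures come from the paragraph above; the 8 GB DIMM size is an assumption inferred from 288 GB spread over 36 slots, and `gb_per_core` is a hypothetical helper name.

```python
SOCKETS = 4
CORES_PER_SOCKET = 8   # 32 cores in a four-socket image, per the article
SLOTS_PER_SOCKET = 9   # the "nine DIMM slots per socket" figure
DIMM_GB = 8            # assumption: inferred from 288 GB / 36 slots


def gb_per_core(memory_gb, sockets=SOCKETS, cores_per_socket=CORES_PER_SOCKET):
    """Main memory per core for a given total capacity."""
    return memory_gb / (sockets * cores_per_socket)


max_memory_gb = SOCKETS * SLOTS_PER_SOCKET * DIMM_GB

print(f"Maximum with nine slots per socket: {max_memory_gb} GB")
print(f"Per core at {max_memory_gb} GB: {gb_per_core(max_memory_gb):.0f} GB")
print(f"Per core at 576 GB: {gb_per_core(576):.0f} GB")
```

At nine slots per socket the machine tops out at 9 GB per core, half of the 18 GB per core that 576 GB would provide.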