Original URL: https://www.theregister.com/2009/09/20/scalemp_xeon_5500/

ScaleMP certifies on Intel Nehalem iron

Bandwidth bonanza for virtual SMPs

By Timothy Prickett Morgan

Posted in HPC, 20th September 2009 03:02 GMT

The shift by Intel to the QuickPath Interconnect architecture with its Nehalem family of Xeon processors is paying off big time for ScaleMP. The high-end virtualization specialist is peddling a program called vSMP Foundation to mesh rack servers into a virtual symmetric multiprocessor for supercomputing workloads.

With vSMP Foundation, customers who want a big memory space for their applications to play in, but lack the budget to buy a real SMP box that can scale to terabytes of main memory, buy smaller rack servers instead. They use the vSMP Foundation software along with an InfiniBand fabric to lash the server nodes together into a single coherent memory space for the operating system and its applications to run in.

The ScaleMP software stack might add around 25 per cent to the cost of an x64-InfiniBand cluster, according to Shai Fultheim, founder and CEO of ScaleMP. But the performance on various HPC workloads rivals that of a RISC/Unix box with real SMP electronics, which could cost three or four times as much. The workloads include data warehousing, various kinds of simulation and modeling, electronic design automation, and the like.
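To make Fultheim's pricing claim concrete, here is a rough back-of-the-envelope sketch. The $100,000 base cluster price is purely hypothetical, and the code assumes "three or four times as much" is measured against the cluster plus vSMP Foundation, which the quote leaves open.

```python
def cost_range(cluster_cost, vsmp_uplift=0.25, smp_multiples=(3.0, 4.0)):
    """Return (vsmp_total, smp_low, smp_high) for a base cluster cost.

    vsmp_uplift:   the "around 25 per cent" premium quoted for vSMP Foundation.
    smp_multiples: the "three or four times" range, ASSUMED here to be
                   relative to the cluster-plus-vSMP price.
    """
    vsmp_total = cluster_cost * (1.0 + vsmp_uplift)
    smp_low, smp_high = (vsmp_total * m for m in smp_multiples)
    return vsmp_total, smp_low, smp_high


# Hypothetical base price for an x64-InfiniBand cluster, for illustration only.
vsmp_total, smp_low, smp_high = cost_range(100_000)
print(f"cluster + vSMP: ${vsmp_total:,.0f}")
print(f"comparable SMP: ${smp_low:,.0f} to ${smp_high:,.0f}")
```

Swap in a real quote for the cluster and the same two lines give the spread a buyer would be weighing.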

Common pool

The architecture of the current version of the vSMP Foundation code allows the memory of up to 16 physical machines to be placed in a common pool for a single instance of an operating system and its applications to run within. The memory coherency is ensured by a special virtual machine that ScaleMP installs on a USB stick that you plug into each server node. In essence this acts like a virtual BIOS to trick the operating system into thinking it is running atop a real SMP server.

You can use different types of servers in the virtual SMP, but ScaleMP recommends you get matching server nodes, as operating systems do not like asymmetric configurations, Fultheim says.

The vSMP Foundation product supports Ethernet as an interconnect, but to date all of ScaleMP's customers have opted for InfiniBand, which is so much faster. That could change as 10 Gigabit Ethernet and, soon, 40 Gigabit Ethernet go mainstream.

With the launch of the "Nehalem EP" Xeon 5500 processors in March, Intel has goosed the performance of the cores a bit and, more importantly, has integrated memory controllers onto the processors and ditched the old frontside bus architecture in favor of QPI's point-to-point links.

This has boosted the memory bandwidth of the quad-core Xeon 5500 processor by a factor of three or four compared to earlier "Harpertown" Xeon 5400 chips for two-socket servers. And while many applications have to be tweaked to get performance increases on the Xeon 5500s, the vSMP Foundation software is seeing immediate improvements, according to Benjamin Baer, marketing veep at ScaleMP.

Eye popping

Specifically, the aggregate memory bandwidth of a vSMP cluster has risen from 142 GB/sec to 445 GB/sec. As a consequence, a sixteen-node virtual SMP setup based on server nodes using the 2.93 GHz Xeon X5570 (with a total of 128 cores, the maximum that vSMP Foundation supports at the moment) is currently ranked seventh among all machines tested on the Stream HPC memory bandwidth benchmark. On this Stream test, the fully loaded vSMP setup was able to deliver 435 GB/sec of memory bandwidth running the triad of Stream tests, and an eight-node setup was able to hit 262 GB/sec.
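For readers who have not met it, the triad portion of Stream is a simple kernel: each element of one array is set to an element of a second array plus a scalar times an element of a third. The sketch below shows that kernel and how a bandwidth figure falls out of it. It is an illustration in interpreted Python, not the real benchmark (which is written in C and Fortran), and the array size and function names are made up for the example.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """Stream-style triad: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bandwidth_gb_per_sec(n, elapsed_sec, bytes_per_element=8):
    """Bandwidth implied by one triad pass over n elements.

    Stream counts three memory operations per element for triad:
    read b, read c, write a. With 8-byte doubles that is 24 bytes.
    """
    return 3 * bytes_per_element * n / elapsed_sec / 1e9


n = 200_000  # illustrative size; real runs use arrays far larger than cache
a = array("d", [0.0]) * n
b = array("d", [1.0]) * n
c = array("d", [2.0]) * n

start = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - start

print(f"{triad_bandwidth_gb_per_sec(n, elapsed):.3f} GB/sec "
      "(interpreted Python, not a real Stream score)")
```

The number this prints will be tiny next to the figures above, which is the point: the kernel does almost no arithmetic, so a tuned run is limited by how fast memory can feed the cores.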

By comparison, the 1,024-core Altix 4700 Linux super from Silicon Graphics could handle an order of magnitude more, at 4.1 TB/sec of bandwidth. This machine uses 1.6GHz, dual-core Itanium 2 processors. A 64-core Power 595 machine from IBM (using the dual-core Power6 processors running at 5GHz) can deliver 787 GB/sec, or a little less than twice the vSMP setup, while a Sun Microsystems and Fujitsu M9000 server with 128 cores (using the quad-core Sparc64-VII processors) hits 222 GB/sec. An Integrity Superdome machine from Hewlett-Packard, equipped with dual-core Itanium 2 chips and a total of 128 cores, delivers 167 GB/sec of bandwidth on the Stream test.
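Because these machines have very different core counts, it can help to normalize the results. The sketch below uses only the core counts and Stream figures quoted in this article (with 4.1 TB/sec taken as 4,100 GB/sec), and computes bandwidth per core and the ratio to the sixteen-node vSMP result. The dictionary layout and function names are illustrative.

```python
# (cores, Stream bandwidth in GB/sec), as quoted in the article.
SYSTEMS = {
    "ScaleMP vSMP (Xeon X5570)": (128, 435.0),
    "SGI Altix 4700": (1024, 4100.0),
    "IBM Power 595": (64, 787.0),
    "Sun/Fujitsu M9000": (128, 222.0),
    "HP Integrity Superdome": (128, 167.0),
}

BASELINE = "ScaleMP vSMP (Xeon X5570)"


def per_core(name):
    """Stream bandwidth divided by core count, in GB/sec per core."""
    cores, gb_per_sec = SYSTEMS[name]
    return gb_per_sec / cores


def versus_baseline(name):
    """Aggregate bandwidth relative to the sixteen-node vSMP setup."""
    return SYSTEMS[name][1] / SYSTEMS[BASELINE][1]


for name in SYSTEMS:
    print(f"{name:28s} {per_core(name):6.2f} GB/sec per core "
          f"{versus_baseline(name):6.2f}x vSMP")
```

On these numbers the Power 595 is well ahead per core, at roughly 12.3 GB/sec against about 3.4 for the vSMP setup, while the vSMP cluster delivers about twice the per-core bandwidth of the M9000 and more than twice that of the Superdome.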

Baer says that ScaleMP is working on some "eye-popping" SPEC integer and floating point benchmarks, which it will show off in a few weeks. The company is also exploring how to tweak vSMP Foundation so it can support a wider variety of workloads. Maybe even databases and ERP applications someday.

At this point, Xeon 5500 boxes from Appro, Cray, Dell, HP, IBM, Intel (which does custom server installations for selected customers), Sun, and Super Micro have been certified to run vSMP Foundation. ®