Nehalems make like elephants on HPC memory test

Istanbul's touch of Alzheimer's

Intel's Nehalem EP chip has significantly outperformed AMD's Istanbul on a set of memory-intensive benchmark tests.

The techies at supercomputer cluster maker Advanced Clustering Technologies are at it again, running their own benchmarks on single server nodes using popular high-performance computing tests normally used on entire clusters. This time around, ACT is putting the latest x64 chips into two-socket systems and running the Stream memory benchmark on the boxes.

By running various HPC tests on single servers, ACT is helping educate customers on the pros and cons of the new Intel quad-core 'Nehalem EP' Xeon 5500 and Advanced Micro Devices six-core 'Istanbul' Opteron 2400 processors.

A few weeks ago, ACT cluster engineer Shane Corder published a report after he slapped the Linpack Fortran benchmark test on two-socket servers using these new chips.

On that test, one of ACT's Pinnacle rack servers equipped with two quad-core 2.66 GHz Xeon X5550s with 12 GB of DDR3 main memory running at 1.33 GHz was able to deliver 74.03 gigaflops of sustained performance against a peak theoretical performance of 85.12 gigaflops. But a Pinnacle machine configured with two of the six-core Opteron 2435 processors running at 2.6 GHz and 16 GB of DDR2 main memory running at 800 MHz was able to deliver 99.38 gigaflops (against a peak theoretical performance of 124.8 gigaflops).
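For readers who want to check the arithmetic, the peak figures quoted above can be reproduced with a minimal Python sketch. It assumes each core retires four double-precision flops per clock cycle, a figure the article does not state but which matches both quoted peaks; the function names are made up for illustration. On these numbers the Xeon box sustains roughly 87 per cent of its peak and the Opteron box roughly 80 per cent.

```python
# Assumption (not stated in the article): 4 double-precision flops
# per core per clock cycle for both chips.
FLOPS_PER_CYCLE = 4


def peak_gflops(sockets, cores_per_socket, ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in gigaflops for one server node."""
    return sockets * cores_per_socket * ghz * flops_per_cycle


def efficiency(sustained_gflops, peak):
    """Fraction of theoretical peak actually sustained on Linpack."""
    return sustained_gflops / peak


# Figures taken from the article's Linpack results.
xeon_peak = peak_gflops(sockets=2, cores_per_socket=4, ghz=2.66)
opteron_peak = peak_gflops(sockets=2, cores_per_socket=6, ghz=2.6)

xeon_eff = efficiency(74.03, xeon_peak)
opteron_eff = efficiency(99.38, opteron_peak)

print(f"Xeon X5550 node:   peak {xeon_peak:.2f} GF, efficiency {xeon_eff:.1%}")
print(f"Opteron 2435 node: peak {opteron_peak:.2f} GF, efficiency {opteron_eff:.1%}")
```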

So, AMD won that one - especially when you consider that the Opteron-based Pinnacle HPC node from ACT cost $3,500 compared to the $3,800 price on the Xeon-based Pinnacle box.
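The price angle can be put into numbers with a small sketch that divides each node's price by its sustained Linpack result. The prices and gigaflops are the article's; the helper name is hypothetical, and the metric ignores power, memory capacity, and everything else a buyer would weigh.

```python
def dollars_per_gflop(price_usd, sustained_gflops):
    """Node price divided by sustained Linpack gigaflops."""
    return price_usd / sustained_gflops


# Prices and sustained Linpack results as quoted in the article.
opteron_cost = dollars_per_gflop(3500, 99.38)
xeon_cost = dollars_per_gflop(3800, 74.03)

print(f"Opteron 2435 node: ${opteron_cost:.2f} per sustained gigaflop")
print(f"Xeon X5550 node:   ${xeon_cost:.2f} per sustained gigaflop")
```

On this crude measure the Opteron node works out at around $35 per sustained gigaflop against around $51 for the Xeon node.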

Now, with the Stream benchmark, the test is not about flops so much as memory bandwidth, and given the higher clock speed of the DDR3 main memory compared to DDR2 memory, you'd expect the Nehalem EP server node to do better than it did on the Linpack test. And indeed it did.

Corder's home-done Stream benchmark tests were done on exactly the same iron as the Linpack tests, and for good measure, Corder tossed in some numbers for older quad-core Xeons and Opterons to show how much better the new chips are versus the old.

The Nehalem EPs really cleaned the Istanbuls' clocks on this test. Using 1.33 GHz DDR3 memory, the server using the X5570 processors was able to deliver 37,122 MB/sec of bandwidth on the Stream test, while the machine equipped with 1.07 GHz memory modules hit 32,770 MB/sec and one using 800 MHz memory could handle 25,490 MB/sec. A Pinnacle server equipped with the earlier "Harpertown" Xeon 5400s - quad-core chips using the old frontside bus architecture and 800 MHz DDR2 main memory - could only deliver 9,776 MB/sec of bandwidth on the Stream test, and dropping down to 667 MHz memory pushed performance down to 6,102 MB/sec.

By contrast - and this is a big contrast - the Istanbul-based Pinnacle server using 800 MHz DDR2 main memory - as fast as it gets - topped out at 20,534 MB/sec of memory bandwidth on the Stream tests, which was actually a little bit lower than the results ACT saw with a Pinnacle server equipped with quad-core "Shanghai" Opterons, which came in at 20,687 MB/sec. A server using the older quad-core "Barcelona" Opterons and 667 MHz DDR2 main memory was able to deliver 16,965 MB/sec on Stream.

As Intel has promised, ACT confirms that the Nehalem EP chips and their new QuickPath Interconnect bus architecture deliver nearly four times the memory bandwidth of their Harpertown predecessors, and nearly double the memory performance of the current crop of AMD Opterons. And there is nothing AMD can do about it until it switches to DDR3 main memory early next year with the "Magny-Cours" and "Lisbon" kickers to the Istanbuls.
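The "nearly four times" and "nearly double" claims can be checked against the Stream figures ACT reported. The sketch below simply divides the best Nehalem EP result by the best Harpertown and Istanbul results; the dictionary keys are labels invented for this example, and the numbers are the ones quoted in the article.

```python
# Stream results in MB/sec, as quoted in the article.
# The keys are illustrative labels, not names used by ACT.
stream_mb_per_sec = {
    "nehalem_ep_ddr3_1333": 37122,
    "nehalem_ep_ddr3_1066": 32770,
    "nehalem_ep_ddr3_800": 25490,
    "harpertown_ddr2_800": 9776,
    "harpertown_ddr2_667": 6102,
    "istanbul_ddr2_800": 20534,
    "shanghai": 20687,
    "barcelona_ddr2_667": 16965,
}


def speedup(fast, slow, results=stream_mb_per_sec):
    """Ratio of two measured Stream bandwidths."""
    return results[fast] / results[slow]


vs_harpertown = speedup("nehalem_ep_ddr3_1333", "harpertown_ddr2_800")
vs_istanbul = speedup("nehalem_ep_ddr3_1333", "istanbul_ddr2_800")

print(f"Nehalem EP vs Harpertown: {vs_harpertown:.2f}x")
print(f"Nehalem EP vs Istanbul:   {vs_istanbul:.2f}x")
```

The ratios come out at about 3.8 and about 1.8 respectively, which squares with the wording above.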

AMD will be offering the G34 socket with four DDR3 memory channels (up to twelve DIMMs) and the C32 socket with two channels (up to four DIMMs). AMD's plan is to offer two different kinds of two-socket servers: one where memory bandwidth is key (that's the G34) and one where a lower price and floating point or integer power are more important (that's the C32). AMD has the right idea. But it really needs this architecture to be here now to blunt Intel's considerable memory bandwidth advantage. ®
