Kluster Kamph results sliced, diced, pulverized

Who won – and, more importantly, why?


HPC Blog: With the dust settling on the ISC'13 Student Cluster Challenge, it's a good time to look back, take stock, and see what we've learned.

While traveling recently, I found myself unable to type on my laptop keyboard (after the idiot in front of me reclined into my lap), but I found I could still use the handy track point "nub".

So with a digit on the nub, I fired up a spreadsheet to attempt some one-finger analysis. I was curious to see what role cluster configuration might have played in the final competition results. (I used a different finger to express my feelings about the guy in 24F.)

The ISC Student Cluster Challenge is a series of "sprints" requiring students to run HPCC and a variety of HPC applications over a two-day period. They work on one application at a time, tuning and optimizing it to maximize throughput. And they can't just throw more hardware at the problem – there's a hard power cap of 3,000 watts.

With this in mind, let's take a look at how the various student clusters stacked up against each other and see if we can glean any insight. There was a lot of commonality between the teams, with all of them using Mellanox InfiniBand interconnects and switches, and most running RHEL. But there were some major differences, which I've summarized in the charts below…

ISC'13 Student Cluster Challenge: total nodes chart

There are some clear differences in node count (above) and core count (below) in the 2013 competition.

ISC'13 Student Cluster Challenge: total CPU cores chart

South Africa and Tsinghua clearly have both more nodes and more cores than their fellow competitors, which appeared to give them an advantage in terms of application performance.

We also see that there's quite a bit of difference when it comes to total memory: Tsinghua weighed in with a whopping 1TB.

ISC'13 Student Cluster Challenge: total cluster memory chart

Based on what I know about the scores for the individual applications, it looks like Tsinghua's massive memory gave them an advantage over the rest of the field, but not quite enough of an advantage to take the overall win.

ISC'13 Student Cluster Challenge: memory per CPU core chart

Tsinghua, Huazhong, and Chemnitz have double the RAM per core of other competitors, which certainly had a positive effect on performance. On the other hand, South Africa turned in solid app performance using less memory per core, but a large number of cores.
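For anyone who wants to repeat the one-finger spreadsheet exercise, here is a minimal sketch of the arithmetic behind the memory-per-core chart. The cluster names and figures are made-up placeholders for illustration, not the teams' actual configurations.

```python
def memory_per_core_gb(total_memory_gb: float, total_cores: int) -> float:
    """Return the GB of RAM available per CPU core."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return total_memory_gb / total_cores


# Hypothetical configurations, for illustration only (not real team data).
clusters = {
    "cluster_a": {"memory_gb": 1024, "cores": 128},
    "cluster_b": {"memory_gb": 512, "cores": 128},
}

# Compute the ratio for each cluster, as in the chart.
ratios = {
    name: memory_per_core_gb(cfg["memory_gb"], cfg["cores"])
    for name, cfg in clusters.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} GB per core")
```

With equal core counts, doubling total memory doubles the per-core figure, which is the kind of gap the chart shows between the high-memory teams and the rest.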

This is the first competition in which every team has used some kind of compute accelerator. Most went with Nvidia Kepler K20s, while Colorado and Purdue tried the new Intel Xeon Phi coprocessor on for size.

ISC'13 Student Cluster Challenge: accelerators (total in cluster) chart

One team, Germany's own Chemnitz, went wild on accelerators. They configured eight Intel Phis and eight Nvidia K20s into their cluster. While this gave them a hell of a lot of potential compute power, it also consumed a hell of a lot of electricity.

Difficulties also arose when the team tried to get both accelerators running on the same application, so they ended up running particular apps on the K20s and others on the Phis. The problem is that even when they're idled, these beasts consume significant power and generate heat as well. Chemnitz put together a monster cluster, but like most monsters, it couldn't be completely tamed.
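The idle-power problem is easy to see with a rough budget check against the 3,000-watt cap. The sketch below is only an illustration: every wattage in it is an assumed placeholder, not a measurement from Chemnitz or any other team.

```python
POWER_CAP_WATTS = 3000  # the competition's hard power cap


def total_draw_watts(base_watts, active_accels, idle_accels,
                     active_watts_each, idle_watts_each):
    """Sum the draw of the base system plus active and idle accelerators."""
    return (base_watts
            + active_accels * active_watts_each
            + idle_accels * idle_watts_each)


def fits_cap(draw_watts, cap_watts=POWER_CAP_WATTS):
    """Return True if the cluster stays at or under the power cap."""
    return draw_watts <= cap_watts


# Assumed figures: 1,300 W for nodes and switch, 200 W per busy
# accelerator, 25 W per idle accelerator. All hypothetical.
with_idle_cards = total_draw_watts(1300, 8, 8, 200, 25)
without_idle_cards = total_draw_watts(1300, 8, 0, 200, 25)

print(with_idle_cards, fits_cap(with_idle_cards))
print(without_idle_cards, fits_cap(without_idle_cards))
```

Under these assumed numbers, the eight idle cards alone are enough to push the cluster over the cap, even though they contribute nothing to the application being run.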

Lessons Learned

So what have we learned? I think it's safe to say that more nodes and cores are better than fewer – now there's a blinding glimpse of the obvious, eh? There is also a relationship between lots of memory and higher performance – again, an obvious conclusion.

However, we see that more accelerators aren't necessarily more better. The winning teams had one accelerator per node, although Huazhong landed the top LINPACK score with two accelerators per node.

All of the top teams were running Nvidia K20s rather than Intel's Phi coprocessor, but we can't put too much weight on this first Nvidia v. Intel showdown. This was the first time students had a chance to use the Phi, and they didn't get all that much time to work on optimization prior to the competition.

The upcoming cluster showdown at SC'13 in November will probably give us a better view of comparative accelerator performance. ®
