Kluster Kamph results sliced, diced, pulverized

Who won – and, more importantly, why?

HPC Blog With the dust settling on the ISC'13 Student Cluster Challenge, it's a good time to look back, take stock, and see what we've learned.

While traveling recently, I found myself unable to type on my laptop keyboard (after the idiot in front of me reclined into my lap), but I found I could still use the handy track point "nub".

So with a digit on the nub, I fired up a spreadsheet to attempt some one-finger analysis. I was curious to see what role cluster configuration might have played in the final competition results. (I used a different finger to express my feelings about the guy in 24F.)

The ISC Student Cluster Challenge is a series of "sprints" requiring students to run HPCC and a variety of HPC applications over a two-day period. They work on one application at a time, tuning and optimizing it to maximize throughput. And they can't just throw more hardware at the problem – there's a hard power cap of 3,000 watts.

With this in mind, let's take a look at how the various student clusters stacked up against each other and see if we can glean any insight. The teams had a lot in common: all of them used Mellanox InfiniBand interconnects and switches, and most ran RHEL. But there were some major differences, which I've summarized in the charts below…

ISC'13 Student Cluster Challenge: total nodes chart

There are some clear differences in node count (above) and core count (below) in the 2013 competition.

ISC'13 Student Cluster Challenge: total CPU cores chart

South Africa and Tsinghua clearly have both more nodes and more cores than their fellow competitors, which appears to have given them an advantage in application performance.

We also see that there's quite a bit of difference when it comes to total memory: Tsinghua weighed in with a whopping 1TB.

ISC'13 Student Cluster Challenge: total cluster memory chart

Based on what I know about the scores for the individual applications, it looks like Tsinghua's massive memory gave them an advantage over the rest of the field, but not quite enough of an advantage to take the overall win.

ISC'13 Student Cluster Challenge: memory per CPU core chart

Tsinghua, Huazhong, and Chemnitz have double the RAM per core of other competitors, which certainly had a positive effect on performance. On the other hand, South Africa turned in solid app performance with less memory per core but a large number of cores.

This is the first competition in which every team has used some kind of compute accelerator. Most went with Nvidia Kepler K20s, while Colorado and Purdue tried the new Intel Phi coprocessor on for size.

ISC'13 Student Cluster Challenge: accelerators (total in cluster) chart

One team, Germany's own Chemnitz, went wild on accelerators. They configured eight Intel Phi and eight Nvidia K20s into their cluster. While this gave them a hell of a lot of potential compute power, it also consumed a hell of a lot of electricity.

Difficulties also arose when the team tried to get both accelerators running on the same application, so they ended up running particular apps on the K20s and others on the Phis. The problem is that even when they're idled, these beasts consume significant power and generate heat as well. Chemnitz put together a monster cluster, but like most monsters, it couldn't be completely tamed.
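To make the power trade-off concrete, here is a rough back-of-the-envelope sketch. Only the 3,000-watt cap and the count of 16 accelerators come from the competition details above; the node count and every per-device wattage are hypothetical placeholders, not measured figures from Chemnitz or anyone else.

```python
POWER_CAP_WATTS = 3000  # hard cap from the competition rules


def cluster_draw(nodes, node_watts, accelerators, active, active_watts, idle_watts):
    """Estimate total draw: base node power plus active and idle accelerators."""
    if not 0 <= active <= accelerators:
        raise ValueError("active must be between 0 and the accelerator count")
    idle = accelerators - active
    return nodes * node_watts + active * active_watts + idle * idle_watts


# Hypothetical configuration: 8 nodes and 16 accelerators, with only half
# of the accelerators doing work on a given application.
# All wattages below are illustrative assumptions.
half_active = cluster_draw(
    nodes=8, node_watts=140,
    accelerators=16, active=8,
    active_watts=200, idle_watts=25,
)
headroom = POWER_CAP_WATTS - half_active

# The same hypothetical cluster with every accelerator busy.
all_active = cluster_draw(
    nodes=8, node_watts=140,
    accelerators=16, active=16,
    active_watts=200, idle_watts=25,
)

print(f"Half active: {half_active} W, headroom {headroom} W")
print(f"All active:  {all_active} W, over cap: {all_active > POWER_CAP_WATTS}")
```

Under these made-up numbers, the idle accelerators alone eat 200 watts of a budget that is already nearly spent, and running everything at once blows well past the cap. The real figures will differ, but the shape of the problem is the same.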

Lessons Learned

So what have we learned? I think it's safe to say that more nodes and cores are better than fewer – now there's a blinding glimpse of the obvious, eh? There is also a relationship between lots of memory and higher performance – again, an obvious conclusion.

However, we see that more accelerators aren't necessarily more better. The winning teams ran one accelerator per node, although Huazhong landed the top LINPACK score with two accelerators per node.

All of the top teams were running Nvidia K20s rather than Intel's Phi coprocessor, but we can't put too much weight on this first Nvidia v. Intel showdown. This was the first time students had a chance to use the Phi, and they didn't get all that much time to work on optimization prior to the competition.

The upcoming cluster showdown at SC'13 in November will probably give us a better view of comparative accelerator performance. ®
