HPC cluster-wrestling challenge: Come on students, light my PyFR
Think you know your hardware? Think again
Reconfigure This! Here’s some more content from last summer’s blockbuster International Supercomputing Conference (ISC) 2015 student cluster competition, which has the youngsters battling to build a small cluster of their own design on the exhibit floor and racing to demonstrate the greatest performance across a series of benchmarks and applications.
As we rejoin the teams, they’re on the last day of the tourney. Some might think it’s clear sailing ahead, but there's a final set of rocks that the kids have to manoeuvre past.
I’m calling it the “Great Balls of PyFR” challenge. It’s designed to test how well the kids know their hardware, the PyFR application, and how to get the most out of both. The rules are simple: teams will have 45 minutes to run a single PyFR dataset (which is supplied beforehand) using the least amount of power possible.
Before the timed run, the kids get two hours to reconfigure their hardware to minimise power draw.
The teams don’t get any extra credit for finishing before the 45 minute time limit, so the key is to run just enough hardware to limp over the finish line, using the least amount of power possible.
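For the curious, the reasoning above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not how any team actually planned its run: the function name `smallest_config`, the single-node runtime, and the per-node wattage are all hypothetical, and the model assumes ideal linear scaling, which real PyFR runs are unlikely to achieve.

```python
TIME_LIMIT_MIN = 45.0  # the time limit from the challenge rules


def smallest_config(single_node_minutes, watts_per_node, max_nodes, base_watts=0.0):
    """Return (nodes, runtime_minutes, watts) for the lowest-power
    configuration that still finishes inside the time limit, or None
    if even the full cluster is too slow.

    Assumptions (hypothetical, for illustration only):
      * runtime scales perfectly with node count
      * every powered-on node draws the same wattage
    """
    for nodes in range(1, max_nodes + 1):
        runtime = single_node_minutes / nodes  # ideal linear scaling
        if runtime <= TIME_LIMIT_MIN:
            # Power grows with node count, so the first fit is the cheapest.
            return nodes, runtime, base_watts + nodes * watts_per_node
    return None


# Made-up numbers: a job that takes 200 minutes on one node,
# 350 W per node, and an eight-node cluster.
print(smallest_config(200.0, 350.0, 8))
```

With those made-up figures, four nodes would take 50 minutes and miss the cut, so the sketch settles on five nodes. Finishing earlier with six or more would burn extra power for no extra credit.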
It’s the most difficult ‘mystery challenge’ that I’ve seen and it certainly has the kids thinking, as you’ll see in the stories and videos below ...
Team Estonia: Massive monitor small consolation for snake bit team
Team Estonia had been through quite a bit by the time I caught up with it on the last day of the competition. Maybe the best term for its 2015 ISC effort is "snake-bit" or maybe "ill fated". It just couldn’t catch a break.
A missing shipment forced it to strip the entire Frankfurt area of PC power supplies to run its mini-motherboard cluster. It had to make its own power cables, and one team member accidentally cut himself in the process – adding human blood to its configuration.
In the video, we discuss these topics with the team, plus marvel at its massive 110-inch LCD monitor. Towards the end, it’s apparent that the team is somewhat sad and disappointed with its results. One thing that most teams believe is that they're the only ones who have trouble during these competitions.
I try to disabuse it of this notion by telling it how most (if not all) other teams also stumble during the tourneys, and that I’ve never seen any team turn in a set of perfect results. I hope it helped raise the team's spirits and made it feel better about what it has accomplished.
Team Hamburg: New cooler, no beer (yet)
When I came upon the team on the last day, I noticed an addition to its booth configuration – a cooler chock full of beverages. Surprisingly enough, it wasn’t beer, but soft drinks. I was assured that beer would be coming soon after the competition concluded for the day. I mean, they ARE Germans, right?
Not a lot to report from Team Hamburg. It's working on LAMMPS, but also preparing for the ‘secret task’ which is a rerun of the PyFR application with a twist, as we’ve discussed above.
The Hamburgers are pondering their configuration moves, but don’t have a lot to share yet. With eight nodes of CPU and no accelerators, they don’t have to juggle the CPU versus GPU conundrum that other teams are wrestling with. Still, they're going to have to disable some hardware in order to get their power consumption down, while making sure they finish up under the 45-minute time limit.
Team Spain: Howling system raises goosebumps on ARMs
Even though I was standing right in front of Team Spain, it was still very hard to hear it talking due to the unholy yowl emitted by its ARM-based system. What I managed to get from it is that it doesn't think it'll need to make many changes to its system in order to compete in the PyFR challenge. Its use of ARM cores already makes it one of the lowest powered clusters in the competition.
The team seems more relaxed today, probably because the PyFR task is aimed at energy efficiency, which is squarely in its ARM-fueled wheelhouse. However, PyFR is also computationally intense, and it’s no sure thing that its system, powered by Mali GPUs, will be able to complete the task even with the full 270-core configuration. We’ll see what happens.
In the video, my trusty wired mics pick up the team's voices just fine. I also included a snippet of sound from right in front of the cluster, so you can get a taste of the highly annoying/evil/demonic volume and pitch I’ve been talking about.