Cluster kids: Meet the students giving all for science and HPC glory
If LINPACK would pack wood... and other computer sports posers
Posted in HPC, 22nd June 2017 08:09 GMT
HPC Blog As part of our continuing coverage of the most exciting events in the computer sporting world, Student Cluster Competitions, we like to meet each team individually.
Through the miracle of video, we can share these conversations with you, the cluster competition aficionados. Use these interviews to get to know the teams and their strategies, and to see whether you can spot the fire in their eyes and the hunger in their souls. This will help you do a lot better in your office betting pools.
Team Beihang (China): The first team up in our video countdown is Team Beihang from China. This team finished in second place at the Asian Student Cluster Competition (ASC17) in April.
At this point, according to the team leader. They’re organized like most teams, with each student working on a specific application, sometimes in partnership with another team. This is a team that has a few competitions under their belt, so their confidence is justified. But it’s way early in this competition and anything can happen.
Team CHPC (South Africa): This is a team that has earned an unprecedented three victories in the four years they’ve competed at ISC. When we caught up with them for our interview, they were ‘riding the line’ with their benchmarking. What this means is that they’re running very close to, but just under, the 3,000 watt power cap. You want to ride the line; it’s a good thing.
The team seems comfortable at this point in the competition, but also slightly nervous in my opinion, which is normal for a CHPC team. Past CHPC teams always seemed to assume that they were behind the other teams, which makes them work harder, I think.
Team EPCC (UK): In their last outing a few years ago, the team from Edinburgh took home the highest LINPACK award. Although this team is made up of different students, the goal seems to be the same – take home LINPACK gold.
In fact, the coach of the EPCC team was on that champion LINPACKing team, so he’s been able to guide them in their quest for LINPACK gold. The team is running the same type of hardware configuration as they did before – liquid cooled (from Cool IT) and plenty of accelerators.
Team FAU (Germany): This is one of two German teams in the competition. In the video, I first make them give me the real, hardcore, German pronunciation of their school name. That covered, we talked a little bit about their last outing at the ASC17 event. The FAU team had some problems getting their InfiniBand interconnect running properly.
However, they’re not having those problems this year. All nodes are happily talking to each other at high speed. Check out the video to get a look at “the guy without a face” and “the surfer guy”.
Team Hamburg (Germany): This is the fourth outing for University of Hamburg. While they haven’t yet grabbed a major award, they’ve steadily improved and could be poised for a breakthrough. This year, they’re driving Intel KNL processors, which makes them unique in this year’s competition.
One of the students is sporting quite the sunburn, which looks incredibly painful, but he’s taking it in stride in true student cluster competition tradition. They also have a guy with one of the coolest hair architectures in the competition.
Team Nanyang (Singapore): When we catch up to Team Nanyang, they’re feeling good about their HPCC, HPCG, and LINPACK benchmarks. We talk about the MiniDFT application and I ask if they’ve ‘stolen’ any of the coding tricks from the other students (which is perfectly legal and part of that application challenge).
The team leader feels that their code for MiniDFT is perfectly fine, thank you very much, so he didn’t take any of the free IP available to them. Two of their students had internships with ClusterVision and gained some valuable experience in TensorFlow, which might pay off handsomely.
Team NERSC (US): The NERSC-ies are back for a second year of competition at ISC. They’re the same team as last year, with one new addition from Texas State. This team is entirely composed of current or former NERSC interns, but from different geographies, which must make it challenging to collaborate.
In the video, we discuss the various application challenges, along with a tangent on the differences between HPL and HPCG. We even get into some tuning talk about HPCG, which was pretty interesting to me – a guy who really doesn’t know much about the nuts and bolts. Finally, we arrived at the proper pronunciation of the application name “FEniCS”. Good talk.
Team Purdue/NEU (US): This is a combination team, with the highly experienced Purdue team coupled with the somewhat experienced Northeastern University. In our conversation, we talked about how everyone was required to post their MiniDFT code publicly, and how the team found some code passages that they’re going to borrow/steal from other teams (which is totally legal).
This team is sporting 16 NVIDIA P100 GPUs, which is a huge power load. At 250 peak watts each, just the GPU portion of their cluster could draw up to 4,000 watts, which is 1,000 watts over the 3,000 watt power cap. They even blew their power supplies twice in warm-ups... wow. We also talk about FEniCS and how it has something like four million dependencies.
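For readers who want to check the power math themselves, here is a rough back-of-the-envelope sketch. The GPU count, the 250 watt peak figure, and the 3,000 watt cap come from the text above; the function name is made up for illustration, and the calculation assumes every GPU hits peak draw at once while ignoring CPUs, memory, fans, and everything else in the rack.

```python
# Figures quoted in the article: 16 GPUs, 250 W peak each, 3,000 W cap.
GPU_COUNT = 16
GPU_PEAK_WATTS = 250
POWER_CAP_WATTS = 3_000


def gpu_power_budget(gpu_count, peak_watts, cap_watts):
    """Return (total peak GPU draw, watts over the cap).

    A negative second value would mean headroom under the cap.
    Hypothetical helper: it models GPUs only, all at peak simultaneously.
    """
    total = gpu_count * peak_watts
    return total, total - cap_watts


total, overage = gpu_power_budget(GPU_COUNT, GPU_PEAK_WATTS, POWER_CAP_WATTS)

# How many GPUs could run flat out under the cap if nothing else drew power
# (a simplification; real clusters have plenty of other loads).
max_gpus_at_peak = POWER_CAP_WATTS // GPU_PEAK_WATTS

print(f"Peak GPU draw: {total} W, over the cap by {overage} W")
print(f"GPUs that fit under the cap at full peak: {max_gpus_at_peak}")
```

In practice teams get under the cap by throttling clocks or capping per-device power rather than pulling cards, so treat this as the worst case, not a prediction.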
Team Tsinghua (China): This is one of the most accomplished teams in this competition – both as an institution and as a team. Former teams have won all three major cluster competitions, and one team won all three competitions in the same year (2015), which was astounding. This particular team just won the ASC17 competition in glamorous Wuxi, China.
This has to be one of the favorites to win the Ultimate Championship at this year’s ISC. They have the experience necessary to deal with the benchmarks and applications, and the value of that shouldn’t be underestimated.
Team U Mass (US): This is another combination team, this time a combination of U Mass and U Mass Dartmouth. The team is a bit short-handed, since two of their members didn’t make the trip to Frankfurt. But this hasn’t seemed to throw them even a little bit.
The team is driving an unconventional cluster. It only has one ‘real’ node, but is backed up by a brace of NVIDIA Jetson processors on mini motherboards. They’re highly confident about TensorFlow, believing that the judges will be impressed with their approach. Teams from Boston seem to be somewhat cursed. They’ve always had problems with getting their systems delivered, with their codes, and so on. Can this Boston team break the curse and bring home a trophy to the land of chowder? Time will tell.
Team UPC (Spain): We talk to Team Spain at a tense moment. They were just reaching the power consumption peak of their LINPACK run. While they’re talking to me, they’re just barely under the 3,000 watt power cap.
The team is once again riding an ARM-based cluster. This one is a 768-core beast that is also water cooled, courtesy of Cool IT. In the video, we discuss the various applications and the challenges they present. I bring out my own thoughts about ‘stealing’ codes for MiniDFT – my opinion is that hack artists borrow from others while great artists steal. (In fact, I’ve stolen that quote.)
Towards the end of the video we find out whether the UPC team system stayed under the power cap and got in a successful LINPACK run – or if they broke through the cap. The suspense is palpable.
Next up we’ll be talking to the teams again – our “last chance” interview to see how they’re bearing up under the strain of the competition. We might even catch a couple of coaches for some “Coach Chat” segments. Stay tuned for more of our comprehensive coverage….