Could you build a data-nomming HPC beast in a day? These kids can
Everything you need to know about the undergrad cluster sport
Analysis Student cluster-building competitions are chock full of technical challenges, both “book learning” and practical learning, and quite a bit of fun too. I mean, who wouldn't want to construct an HPC rig against the clock and kick an opponent in the benchmarks? Here's what's involved in the contests.
Whether you’ve been following the competitions obsessively (sponsoring office betting pools, making action figures, etc.) or this is the first time you’ve ever heard of them, you probably have some questions.
There are rules and traditions in these competitions, just as there are in cricket, football and the Air Guitar World Championships. Understanding the rules and traditions will make these events more interesting.
Student clustering triple crown
In 2013, there will be three major worldwide student cluster events. The United States-based Supercomputing Conference (SC) held the first Student Cluster Competition (SCC) in 2007. The contest has been included at every November SC conference since, usually featuring eight university teams from the US, Europe, and Asia. As the first organisation to hold a cluster competition, SC has pretty much established the template on which the other competitions are based.
The other large HPC conference, the imaginatively named ISC (International Supercomputing Conference), held its first SCC at the 2012 June conference in Hamburg. The competition, jointly sponsored by the HPC Advisory Council, attracted teams from the US and home country Germany, plus two teams from China. It was a big hit with conference organisers and attendees – big enough to justify expanding the field to include nine teams for 2013.
The third entry is the newly formed Asia Student Supercomputer Challenge (ASC). The finals of its inaugural SCC will be held this April in Shanghai, China (yes, the Shanghai in China). At least 30 teams from China, India, South Korea, Russia and other countries have submitted applications seeking to compete in the finals. This competition is sponsored by Inspur and Intel. More details in upcoming articles.
How do these things work?
All three organisations use roughly the same process. The first step is to form a team of six undergraduate students (from any discipline) and at least one faculty advisor. Each team submits an application to the event managers, answering questions about why they want to participate, their university’s HPC and computer science curriculum, team skills, etc. A few weeks later, the selection committee decides which teams make the cut and which need to wait another year.
The groups who get the nod have several months of work ahead. They’ll need to find a sponsor (usually a hardware vendor) and make sure they have their budgetary bases covered. Sponsors usually provide the latest and greatest gear, along with a bit of financial support for travel and other logistical costs. Incidentally, getting a sponsor isn’t all that difficult. Conference organisers (and other well-wishers like me) can help teams and vendors connect.
The rest of the time prior to the competition is spent designing, configuring, testing, and tuning the clusters. Then the teams take these systems to the event and compete against each other in a live benchmark face-off. The competition takes place in cramped booths right on the trade show floor.
All three 2013 events require competitors to run the HPCC benchmark and an independent HPL (LINPACK) run, plus a set of real-world scientific applications. Teams receive points for system performance (usually "getting the most done" on the scientific programs) and, in some cases, the quality and precision of the results. In addition to the benchmarks and app runs, teams are usually interviewed by experts to gauge how well they understand their systems and the scientific tasks they’ve been running.
The team that amasses the most points is dubbed the overall winner. There is usually an additional award for the highest LINPACK score and sometimes a “fan favourite” gong for the team that does the best job of displaying and explaining its work.
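None of the organisers' actual scoring weights are given here, so purely as an illustration, the tallying described above might be sketched in Python like this. The category names, the 0-100 scale and the weights are all invented for the example and are not taken from any competition's rules.

```python
# Hypothetical scoring sketch: the categories, the 0-100 scale and the
# weights below are invented for illustration, not official rules.
WEIGHTS = {"hpcc": 0.1, "linpack": 0.1, "apps": 0.6, "interview": 0.2}


def overall_score(scores):
    """Combine per-category scores (each 0-100) into a weighted total."""
    return sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)


def rank_teams(results):
    """Return team names ordered from highest to lowest overall score."""
    return sorted(results, key=lambda team: overall_score(results[team]), reverse=True)


def linpack_winner(results):
    """The separate LINPACK award goes to the best raw LINPACK score."""
    return max(results, key=lambda team: results[team].get("linpack", 0.0))
```

Under these made-up weights, a team can take the LINPACK award and still lose the overall title if a rival gets more done on the applications.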
While many of the rules and procedures are common between competition hosts, there are some differences:
SC competitions are gruelling 46-hour marathons. The students begin their HPCC and separate LINPACK runs on Monday morning, and the results are due about 5pm that day. This usually isn’t very stressful; most teams have run these benchmarks many times and could do it in their sleep. The action really picks up Monday evening when the datasets for the scientific applications are released.
The apps and accompanying datasets are complex enough that it’s pretty much impossible for a team to complete every task. So from Monday evening until their final results are due on Wednesday afternoon, the students are pushing to get as much done as possible. Teams that can efficiently allocate processing resources have a big advantage.
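To give a feel for that allocation problem, here is a minimal Python sketch of one simple heuristic: run the tasks with the best expected points per hour first, and skip anything that no longer fits in the time left. The task tuples, the estimates and the function name `plan_runs` are hypothetical; this is not a description of how any team actually schedules its work.

```python
def plan_runs(tasks, hours_available):
    """Greedy plan: best estimated points-per-hour first.

    tasks is a list of (name, est_hours, est_points) tuples; every number
    is a team's own guess, so the plan is only as good as the estimates.
    """
    ordered = sorted(tasks, key=lambda t: t[2] / t[1], reverse=True)
    plan, used = [], 0.0
    for name, hours, _points in ordered:
        # Skip any task that would overrun the remaining time budget.
        if used + hours <= hours_available:
            plan.append(name)
            used += hours
    return plan, used
```

A greedy pick like this is not guaranteed to find the best possible plan, but it captures the basic trade-off: a long job with a modest payoff can crowd out several quick wins.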
ISC competitions are a set of three day-long sprints. Students run HPCC and LINPACK on the afternoon of day one but don’t receive their application datasets until the next morning. On days two and three, they’ll run through the list of workloads for that day and hand in the results later that afternoon.
The datasets usually aren’t so large that they’ll take a huge amount of time to run, meaning that students will have plenty of time to optimise the app to achieve maximum performance. However, there’s another wrinkle: the organisers spring a daily surprise application on the students. The teams don’t know what the app will be, so they can’t prepare for it; this puts a premium on teamwork and general HPC and science knowledge.