
Gung-ho Guangzhou college kids smash LINPACK cluster record

Oh snap! Don't TELL me you brought K20s to a K40s party

HPC blog It was the plucky home team from the capital of Guangdong which grabbed the crown and cash at this year's Asian Supercomputer Conference (ASC) Student Challenge.

The students from Guangzhou’s Sun Yat-sen University screamed happily when they saw the results from the LINPACK portion of the ASC’14 Student Supercomputer Challenge.

In addition to besting their peers, Team Sun Yat-sen also set a new LINPACK cluster competition world record with their 9,272 GFLOP/s score. This not only gives the Yat-sen’ers bragging rights and a nice trophy, it also scores the team a 10,000 Chinese yuan cash award (about £955 or $1,600). Not too shabby, eh?

As you can see from the chart, the biggest fight was between HUST and NTHU (Team Taiwan) for second place, with Nanyang Tech right behind.

Nanyang Tech grabbed fourth place, comfortably ahead of NUDT, a team known for their LINPACK prowess.

Sun Yat-sen’s score tops the previous 8,455 GFLOP/s high-water mark set by HUST at ISC’13 last summer in Leipzig, Germany.

The systems sported by the top five finishers for the LINPACK crown (there isn’t a real crown, pity) are very similar, as you can see in the table below.

All of the top finishers ran either eight or nine nodes, with either 216 or 192 CPU cores, and all used similar amounts of memory. They also jammed their luggage full of NVIDIA Tesla cards to slide into their Inspur-provided clusters at the Guangzhou tourney.

The top teams were flogging NVIDIA K40s, with Sun Yat-sen and HUST sporting eight and nine respectively. NTHU was right there, hardware-wise, but lagged just a little behind second place HUST.

Nanyang pulled solid results out of their mix of actively and passively cooled K40s. While the passively cooled cards consume less power, they also provide slightly less performance. If Nanyang had gone with all actively cooled GPUs, would it have given them a large enough score to land them in third place? Hard to say. The performance difference between the two parts isn’t all that much, but Nanyang only needed to improve their score by 3.51 per cent to pass NTHU.

It looks like NUDT was simply outgunned in the fight for highest LINPACK. Like wearing brown shoes with a black tuxedo, NUDT brought K20s to a party where everyone else was wearing K40s. The K40s have more memory, more CUDA cores and greater memory bandwidth, all of which combine to give the K40s at least a 10 per cent performance advantage over the K20s they replace.

But there isn’t much difference at all between Sun Yat-sen and HUST. In fact, looking at the configurations, many would conclude that HUST should have won thanks to their one-GPU edge. So how did Sun Yat-sen pull off the victory? And how did they manage to put so much distance between themselves and second-place HUST?

My guess would be that Sun Yat-sen simply ran closer to the 3,000 watt power line than the other teams.

I’m not sure what mechanism the organisers used to monitor and enforce the power cap. But it could be possible for any one of the teams to run above the power cap for very short periods of time and then drop back down before the monitoring system catches the violation – although of course one could not say that this happened.

Historical context

Once again, we see the top student LINPACK line taking a significant leap upward. It’s amazing to see how far the students have come since the SC08 – SC10 days, when the "big news" in 2010 was that three teams finally broke through the 1 TFLOP/s barrier.

How long until we see 10 TFLOP/s? We’re close, very close, right now.

Could better power management or a slightly more optimised system give a team what they need to get to 10 TFLOP/s? It’s achingly close, needing only about an additional 8 per cent performance increase over Sun Yat-sen’s current record.
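For anyone who wants to check the sums, here is a quick back-of-the-envelope sketch in Python. It uses only the scores quoted in this article (8,455 GFLOP/s for the old record, 9,272 GFLOP/s for the new one) and treats 10 TFLOP/s as 10,000 GFLOP/s. The function and constant names are made up for illustration.

```python
def percent_gain(old: float, new: float) -> float:
    """Percentage increase needed to get from `old` to `new`."""
    return (new - old) / old * 100.0


# Scores quoted in the article, in GFLOP/s.
PREVIOUS_RECORD = 8455.0   # HUST at ISC'13
CURRENT_RECORD = 9272.0    # Sun Yat-sen at ASC'14
TARGET = 10000.0           # 10 TFLOP/s expressed in GFLOP/s

# How much the new record improved on the old one.
print(f"Jump over old record: {percent_gain(PREVIOUS_RECORD, CURRENT_RECORD):.2f} per cent")

# How much more is needed to reach 10 TFLOP/s.
print(f"Gap to 10 TFLOP/s:    {percent_gain(CURRENT_RECORD, TARGET):.2f} per cent")
```

On those numbers the new record is a bit under 10 per cent better than the old one, and the remaining gap to 10 TFLOP/s is a little smaller than the jump the students just made.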

We could see this record fall sooner rather than later. The next major student competition, the ISC Student Cluster Challenge, takes place in Leipzig, Germany, later this month. We’ll be there covering the event from cluster assembly to final teardown, as usual.

First up, we’ll have interviews with the organisers and a look at the field (including a wagering pool, of course). So stay tuned… ®
