Original URL: http://www.theregister.co.uk/2013/11/21/gpu_accelerators_overrun_green500_list_of_energyefficient_hpc_systems/

Green supercomputer benchmarks make boffins see red, check blueprints

Top 10 use GPU brawn, but has anyone any bright ideas on better juicy tests?

By Rik Myslewski

Posted in HPC, 21st November 2013 12:14 GMT

SC13 The twice-yearly Green500 list of the most energy-efficient supercomputers has broken new ground in two important ways: for the first time an HPC system broke the 4 gigaflops per watt barrier, and for the first time all of the top 10 systems benefited from GPU acceleration. Then there's a third point of note: the benchmarks are rubbish.

Green500 energy-efficient supercomputer list - top 10, November 2013

Ten wins for Intel Xeon, ten wins for Nvidia Tesla

The list, which ranks systems by floating-point operations per watt while running the Linpack benchmark, was announced Wednesday at the SC13 supercomputing conference in Denver, Colorado. The top finisher, the 2,720-core TSUBAME-KFC from the GSIC Center at the Tokyo Institute of Technology, may have topped the Green500 energy-miser list, but it ranked at number 311 on the "performance is all we care about" Top500 list announced Monday.

TSUBAME-KFC is a prototype purpose-built at GSIC to study advanced cooling and low-power supercomputing, and its designers did their jobs well, with the system scoring a cool 4,503.17 megaflops per watt (Mflops/W).

To put that achievement in perspective, the number-one ranking in the previous Green500 list, published in June 2013, was held by the Eurora system from Italy's Cineca, which produced 3,208.83 Mflops/W to win the crown. Not too shabby, not too shabby at all – but TSUBAME-KFC bested Eurora by a full 40 per cent.
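That 40 per cent figure falls straight out of the two published scores – a quick back-of-the-envelope check in Python, for the arithmetic-minded:

    # Back-of-the-envelope check of the improvement, using the two
    # published Green500 scores (in Mflops/W)
    tsubame_kfc = 4503.17  # November 2013 list leader
    eurora = 3208.83       # June 2013 list leader
    print(f"{(tsubame_kfc / eurora - 1) * 100:.1f} per cent")  # ~40.3 per cent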

For fans of truly big iron, perhaps the most interesting system on the list is Piz Daint from the Swiss National Supercomputing Centre (CSCS), which ranked number four with a megafloppage per watt of 3,185.91 (more on that score later). Piz Daint is the highest-ranking petaflop-capable super on the Green500 list, and also scored an impressive sixth place on the Top500 list.
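To put a score like that in wall-socket terms: at 3,185.91 Mflops/W, each sustained petaflops of Linpack costs a little over 300kW. A quick sketch – the one-petaflops figure below is purely illustrative, not Piz Daint's actual Linpack result:

    # What a Green500 score means in power terms: watts per sustained
    # petaflops of Linpack. The one-petaflops figure is illustrative only.
    score = 3185.91     # Piz Daint's Green500 score, in Mflops/W
    mflops = 1e9        # one petaflops, expressed in Mflops
    watts = mflops / score
    print(f"~{watts / 1000:.0f}kW per sustained petaflops")  # ~314kW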

By way of comparison, the two systems immediately ahead of Piz Daint in the Green500 were ranked in the triple digits on the Top500: Cambridge University's Wilkes was number two in the Green500 but number 166 in the Top500, and the HA-PACS TCA at the University of Tsukuba's Center for Computational Sciences, in Japan, was number three in the Green500 and number 134 in the Top500.

Piz Daint's performance in the Top500 was notable in one other way: in the June 2013 list it ranked number 42; in the November list it rocketed up to number six. The reason? The addition of Nvidia Tesla K20X GPU accelerators to its 28-rack Cray XC30 system.

Which leads us to the second notable thing about the new Green500 list: all ten of the top performers were boosted by Nvidia Tesla GPU accelerators – seven were equipped with Tesla K20X cards, two with K20s, and one with a K20M, which is essentially a K20.

In June, by comparison, only three of the top 10 systems had GPU accelerators: two with Nvidia Tesla K20s, and one with AMD FirePro S10000s. Another had Intel Xeon Phi coprocessor cards; no Xeon Phi–equipped system made the top 10 of the most recent Green500 list.

All well and good, but...

During a Wednesday session at SC13 discussing the Green500, all participants – including representatives from the Green500 itself – agreed that testing, scoring, and ranking systems based on their power consumption is an inexact science in need of repair.

Although Green500 does publish rules governing the running of energy-measurement tests for the submission of scores for ranking, and has collaborated with the Energy Efficient High Performance Computing Working Group (EE HPC WG) on a three-level methodology [PDF] for testing, all the HPCers involved freely admit that much more work needs to be done to clarify the testing and reporting procedures.

Of those three levels of testing, Level 1 is the simplest, and is always required for a Green500 entry submission. Calling it simple is almost an understatement: in Level 1 testing, the only subsystem under test is the compute subsystem – forget about storage and networking. Those don't appear until Level 2, in which "all subsystems participating in the workload must be measured or estimated."
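To see why the scope of the measurement matters, consider a toy example – this is not the EE HPC WG methodology itself, and the subsystem wattages are invented purely for illustration:

    # Toy illustration of Level 1 versus Level 2 measurement scope:
    # the same Linpack run scored against a narrower and a wider power
    # measurement. All wattages below are invented for the example.
    linpack_mflops = 1e9      # a one-petaflops Linpack run, in Mflops

    compute_watts = 300_000   # compute subsystem only (Level 1 scope)
    network_watts = 20_000    # interconnect (counted from Level 2 up)
    storage_watts = 15_000    # storage (counted from Level 2 up)

    level1 = linpack_mflops / compute_watts
    level2 = linpack_mflops / (compute_watts + network_watts + storage_watts)
    print(f"Level 1-style score: {level1:.0f} Mflops/W")  # ~3333
    print(f"Level 2-style score: {level2:.0f} Mflops/W")  # ~2985

Same machine, same run – only the denominator grows.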

Leaving aside the question of how difficult it is to accurately and discretely measure power for any subsystem at any level, the temptation to keep the testing as simple as reasonably possible lies in the fact that the higher the testing level, the lower the Mflops/W rating. Usually. Mostly. Probably. Often. Sometimes. Maybe.

During the session, Thomas Schulthess of CSCS, Piz Daint's home, was adamant that Level 3 – which takes the Level 2 rules and makes them more stringent – is the only truly legitimate way of measuring a system's power consumption, although Level 2 is acceptable, as well. Level 1 is his bête noire.

"I am a physicist by training," he said, "a professor of physics at ETH, and I have to measure that one number – that's the true number. There are no two different numbers or two different efficiencies in systems."

When Schulthess ran Level 1 testing on Piz Daint, he obtained one number; when he ran Level 3 testing, he obtained another – more accurate, but less impressive. The difference was more than enough to make the "true number"-seeking physicist in him uncomfortable, seeing as how the Green500 accepts Level 1–tested submissions.

Running Linpack on Piz Daint and using Level 3–class analysis, Schulthess and his team came up with a score of 3,186 Mflops/W. Running under Level 1 rules, that score jumped to 3,864 Mflops/W.
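Because both numbers come from the same Linpack run, the performance figure cancels out, and the two scores alone tell you how much extra power the stricter measurement captured:

    # The Linpack figure cancels: power = mflops / score, so the ratio
    # of measured powers is the inverse ratio of the two scores
    level1_score = 3864.0  # Mflops/W under Level 1 rules
    level3_score = 3186.0  # Mflops/W under Level 3 rules
    extra = (level1_score / level3_score - 1) * 100
    print(f"Level 3 captured ~{extra:.0f} per cent more power")  # ~21 per cent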

"People said we should have submitted this" higher Level 1 number to Green500, he said, "but it is wrong. It is the wrong number." So Schulthess submitted a Level 3 score even though his competition was allowed to submit Level 1 scores. Piz Daint ended up at number four instead of number two.

"Every center, every system owner and system operator, is responsible to publish the right number," he said. "It's just the way that things are done in science, and I hope the supercomputing community adheres to the same rules. I'm not sure, from the numbers I've seen, whether this is always the case."

It must be quickly noted that your Reg reporter could detect no rancor among the participants in the SC13 discussion on how best to honestly measure the energy efficiency of HPC systems. Instead, the participants showed a sincere desire to get it right.

Green500 and interested members of the HPC community are going to be hammering away at this problem in the coming months. If you're interested in joining the discussion, you can contact them on their website.

Nobody – nobody in their right mind, that is – ever said that benchmark development was easy. ®