Cell supers rule the Green 500 HPC rankings
Just don't ask what the price would be for a petaflops
Back in November, at the annual Supercomputing event that was held in Austin, Texas this time around, a bunch of supercomputing researchers released the semi-annual Top 500 rankings of the fastest supercomputers in the world. Now, another set of nerds has added power metrics to the list and done a sort on how efficiently the supers deliver floating-point performance. What is immediately clear from the Green 500 ranking is that performance and power efficiency do not bear much relation in supercomputing - at least not yet.
The Top 500 list of supercomputing sites ranks the sustained number-crunching performance of a supercomputer (regardless of architecture, or despite it, if you want to be more accurate) on a set of Fortran benchmarks called Linpack. The list is compiled by Erich Strohmaier and Horst Simon, computer scientists at Lawrence Berkeley National Laboratory, Jack Dongarra of the University of Tennessee, and Hans Meuer of the University of Mannheim.
The list is useful in gauging the bleeding edge of supercomputing technology, the prevalence of different operating systems, server nodes, architectures, and interconnection schemes, among lots of other data. (You can read our coverage of the November 2008 Top 500 supers list here.)
The Green 500 resorting of the Top 500 list is put together by Wu-chun Feng and Kirk Cameron of Virginia Tech. This is the fourth Green 500 ranking, but only the second one that has been made public. (We covered the first public one a few months ago here.)
According to Feng and Cameron, about half the machines in the Top 500 provided measured electric usage in conjunction with their Linpack results, allowing for a simple calculation of megaflops per watt. The other half have been given an estimated power usage by Feng and Cameron, and then the list (available as an Excel spreadsheet here) is sorted for power efficiency instead of raw performance. Power on the list is measured in kilowatts, and power efficiency is measured in megaflops per watt. (This is not obvious in the sheet.)
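The efficiency metric itself is just arithmetic, but the unit conversions are where sorted spreadsheets like this one go wrong. A minimal sketch of the calculation, using the figures for the Lawrence Livermore Itanium machine cited later in this article:

```python
def mflops_per_watt(rmax_teraflops: float, power_kilowatts: float) -> float:
    """Green 500 efficiency: sustained Linpack (Rmax) divided by measured power.

    1 teraflops = 1e6 megaflops; 1 kilowatt = 1e3 watts.
    """
    return (rmax_teraflops * 1e6) / (power_kilowatts * 1e3)

# LLNL Itanium box from the article: 19.9 teraflops, 4.9 megawatts (4,900 kW)
print(round(mflops_per_watt(19.9, 4900), 1))  # → 4.1, "a little more than 4"
```

The same function applied to Roadrunner-class numbers (hundreds of teraflops against a couple of megawatts) lands in the hundreds of megaflops per watt, which is the gap the list makes visible.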
The first thing that jumps out in this fourth Green 500 list is that machines have broken through the 500 megaflops per watt barrier. These breakers are all relatively small blade boxes made by IBM, based on its QS22 PowerXCell 8i (the latest iteration of its nine-core Cell PowerPC chip) using InfiniBand interconnect between the blades. The biggest box on the Top 500 list, the "Roadrunner" hybrid Opteron-Cell blade box that has broken the 1 petaflops performance barrier, is notable because it is delivering 444.9 megaflops per watt of computing efficiency; the machine consumes 2.48 megawatts of juice, however.
There is one caveat, though. A base QS22 blade with two Cell chips and 8 GB of memory costs $9,995, which is about four times the cost of an x64 blade. So you can pay big electricity bills, pay higher server costs, or try to mix x64 and Cell blades and find yourself somewhere in the middle.
It wasn't all that long ago that IBM was a joke in supercomputing, and the best gear that the company could put into the field to compete against then-dominant Cray and its vector machines, and a slew of smaller, clever super makers, was a 3090 mainframe equipped with vector co-processors. That began to change in 1995 with the "Deep Blue" chess-playing box, which was commercialized two years later as the RS/6000 PowerParallel.
IBM also built clusters of its Power-based AIX boxes, using high-speed, proprietary interconnect, and started taking its wares into the government labs of the world. These days, IBM dominates the Top 500 list with several different architectures - including BlueGene massively parallel machines with hundreds of thousands of cores, hybrid Opteron-Cell machines, clustered AIX-Power boxes, and clustered x64 machines using a mix of processors from either Intel or Advanced Micro Devices.
IBM has taken high performance computing very seriously in the past decade, and it has also been in the vanguard of energy efficiency - and boy does it show on the Green 500 list. The top 20 machines on the list are made by IBM, and they average 402.4 megaflops per watt in computing efficiency. The 186 machines on the Top 500 and Green 500 lists that bear the IBM label average 134.5 megaflops per watt.
Across the whole Green 500 list, the average machine consumes 400.9 kilowatts of power and delivers 98.6 megaflops per watt. This is terrible by comparison, and it reflects some very heavy-hitting but long-in-the-tooth machines that are still on the list. The least energy-efficient machine is an Itanium machine installed at Lawrence Livermore National Laboratory that is rated at 19.9 teraflops, but which burns 4.9 megawatts of juice and therefore yields only a little more than 4 megaflops per watt.
Number 499 on the Green 500 list is the Earth Simulator, a massively parallel vector machine built by the Japanese government that topped the performance list for many years. While Earth Simulator was ground-breaking when it delivered 35.9 teraflops of sustained performance, those vector processors sure do eat juice. Earth Simulator consumes 3.2 megawatts and yields a paltry 11.2 megaflops per watt.
In terms of other vendors and their rankings on the Green 500 list, Silicon Graphics comes in with a pair of Altix ICE x64 blade clusters at numbers 21 and 22 on the list that deliver 240 and 233 megaflops per watt, respectively. The interesting thing here is that the slightly more efficient box, installed at oil giant Total, is rated at 106 teraflops and comes in at number 17 on the Top 500 list, while the less efficient box (but only moderately so) is the new "Pleiades" Altix ICE cluster installed at NASA's Ames Research Center.
That machine is delivering 487 teraflops of sustained performance on the Linpack test, and is delivering 233 megaflops per watt of power efficiency. While that is roughly half that of the Cell or Opteron-Cell hybrid machines, the Altix architecture would seem to scale performance and power consumption linearly - something SGI is sure to emphasize. Moreover, with the advent of "Nehalem" Xeon boxes early next year, it is not hard to envision the Altix ICE machines being as power efficient as Cell-based boxes.
The most power efficient Cray box on the Green 500 list is the new "Franklin" XT4 massively parallel Opteron cluster running at Lawrence Berkeley National Laboratory, which is number seven on the Top 500 list at 266 teraflops. However, Franklin burns 1.15 megawatts of juice, yielding 231.6 megaflops per watt of computing efficiency.
And while Sun Microsystems is justifiably happy to have the "Ranger" Opteron-InfiniBand cluster installed at the University of Texas, that cluster, ranked number six on the Top 500 list with 433.2 teraflops of sustained Linpack performance, needs 2 megawatts to run and drops to number 30 on the Green 500 list, at 216.6 megaflops per watt. Once again, with Intel's Nehalem chips, Sun should be able to get its power efficiency up there near the 500 megaflops per watt barrier, but it seems unlikely that UT is going to do a box swap.
And for all its talk about the density and power efficiency of the new iDataPlex blade servers, these custom-built IBM boxes are nothing to write home about in terms of power efficiency on HPC workloads. The two iDataPlex boxes installed at NASA's Goddard Space Flight Center are rated at 192.1 megaflops per watt using 2.5 GHz quad-core Xeon processors. Even if Nehalem were to double the bang per watt, iDataPlex is not going to be as power efficient as the Cell-based machines.
The most power efficient box from Hewlett-Packard on the Green 500 list is a cluster of BL460c and BL2x220 blade servers (the latter puts two server nodes on a single blade) that delivers 218 megaflops per watt while burning 327 kilowatts to run. This box is located at the Joint Supercomputer Center in Russia, and is rated at 71.3 teraflops; it is number 35 on the Top 500 list and number 27 on the Green 500 list. HP is clearly banking on two-node Nehalem blades to get it back into the power efficiency game in the HPC racket in 2009. ®