Top 500 Supers: Détente in East, West petaflops race
An arsenal of big iron deploying soon
Number two son
Number two on the list is the Tianhe-1A hybrid super at the National Supercomputing Center in Tianjin, China. The Tianhe-1A mixes six-core Intel Xeon processors, Nvidia Tesla GPUs, and a smattering of homegrown Sparc processors. (China is making its own Sparc, MIPS, and Alpha processors – it's like the late 1990s meets the early 2010s.) The resulting machine hit 2.56 petaflops on the Linpack test and hasn't changed a bit since it entered the list a year ago. The Tianhe-1A ceepie-geepie has 14,336 Xeon processors and 7,168 of Nvidia's Tesla M2050 fanless GPU coprocessors. It uses a homegrown tray server design and a proprietary interconnect called Arch, which glues together the 186,368 cores in those CPUs and GPUs. The Tianhe-1A machine has a peak theoretical performance of 4.7 petaflops, so it is delivering only 54.6 per cent efficiency on the Linpack test. That is pretty bad compared to the Fujitsu K box, which is delivering 93.2 per cent efficiency, but Tianhe-1A is not doing any worse than other monster ceepie-geepie machines.
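For anyone who wants to check the arithmetic, Linpack efficiency is simply sustained performance divided by peak theoretical performance. The Python sketch below uses the rounded Tianhe-1A figures quoted above; the function name is our own invention, and because the inputs are rounded, the result comes out at roughly 54.5 per cent rather than exactly the 54.6 per cent figure, which presumably rests on unrounded numbers.

```python
def linpack_efficiency(sustained_pflops: float, peak_pflops: float) -> float:
    """Return sustained performance as a percentage of theoretical peak."""
    if peak_pflops <= 0:
        raise ValueError("peak performance must be positive")
    return 100.0 * sustained_pflops / peak_pflops


# Rounded figures from the article: 2.56 petaflops sustained, 4.7 petaflops peak.
tianhe_1a = linpack_efficiency(2.56, 4.7)
print(f"Tianhe-1A: {tianhe_1a:.1f} per cent of peak")
```

The same function works for any machine on the list, provided you have both its sustained and peak ratings to hand.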
The top machine in the US is still the "Jaguar" Cray XT5 system at Oak Ridge National Laboratory, which has 224,162 Opteron cores and which is rated at 1.76 petaflops of sustained performance on the Linpack test. This machine ranks third on the list, and is in the process of being transformed into a 20 petaflops hybrid CPU-GPU XK6 super, a deal that Cray took down last month. The US Department of Energy is shelling out $97m for that upgrade, which will combine the Opteron 6200 processors from AMD with the future "Kepler" GPU coprocessors from Nvidia. Oak Ridge is hoping to hit an exaflops – one thousand petaflops or one million teraflops – sometime in the 2018 timeframe.
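To put the exaflops target in perspective, here is a rough Python sketch of the unit conversions, using only the figures quoted above. The constant and variable names are ours, chosen for illustration.

```python
# Each step up the scale is a factor of one thousand.
TERAFLOPS_PER_PETAFLOPS = 1_000
PETAFLOPS_PER_EXAFLOPS = 1_000

# One exaflops expressed in the smaller units.
exaflops_in_petaflops = 1 * PETAFLOPS_PER_EXAFLOPS
exaflops_in_teraflops = exaflops_in_petaflops * TERAFLOPS_PER_PETAFLOPS

# Figures from the article: Jaguar today and the planned XK6 upgrade.
jaguar_now_pflops = 1.76
jaguar_upgraded_pflops = 20.0

# How many times faster an exaflops machine would be than each.
gap_now = exaflops_in_petaflops / jaguar_now_pflops
gap_after_upgrade = exaflops_in_petaflops / jaguar_upgraded_pflops

print(f"1 exaflops = {exaflops_in_petaflops:,} petaflops = {exaflops_in_teraflops:,} teraflops")
print(f"Exaflops vs Jaguar today: about {gap_now:.0f}x")
print(f"Exaflops vs the 20 petaflops upgrade: {gap_after_upgrade:.0f}x")
```

In other words, even the upgraded machine would leave Oak Ridge a factor of 50 short of its exaflops goal.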
China's "Nebulae" ceepie-geepie, built from Xeon processors and Nvidia Teslas, is installed at the National Supercomputing Center in Shenzhen and ranks fourth on the Top 500 list. Nebulae has a total of 120,640 cores across its CPUs and GPUs, which are housed in a blade server chassis crafted by Chinese server maker Dawning. It has 1.27 petaflops of sustained performance on the Linpack test.
Number five is the Tsubame 2.0 super, built from Hewlett-Packard's ProLiant SL390s G7 tray servers and sporting Xeon CPUs and Nvidia Tesla GPUs, with some help from prime contractor NEC – a political necessity to get a machine installed at the Tokyo Institute of Technology (and why Sun Microsystems, which built Tsubame 1.0, needed NEC as a prime contractor, too). Tsubame 2.0 has 73,278 cores and a Linpack sustained performance of 1.19 petaflops.
Cray's "Cielo" XE6 super at Los Alamos National Laboratory, based on eight-core Opteron 6136 processors and using the "Gemini" XE interconnect, ranks number six, as it did in June. It has 142,272 cores and delivered 1.11 petaflops of sustained Linpack.
Silicon Graphics, which just last week said it had scored a deal with NASA to upgrade the "Pleiades" Xeon cluster to 10 petaflops over the next several years, has spot number seven with the current implementation of the Pleiades machine. It has 111,104 Xeon cores of various vintages and is installed at NASA's Ames Research Center; it delivers 1.09 petaflops of number-crunching oomph.
Cray holds the number eight position on the November 2011 list with an XE6 machine called "Hopper" based on the twelve-core Opteron 6100 processors. This machine is installed at the DOE's Lawrence Berkeley National Laboratory and has 153,408 cores; it delivers 1.05 petaflops.
Number nine on the list is the Bullx parallel cluster built by Bull for the Commissariat a l'Energie Atomique (CEA) in France, which is based on the Xeon 7500 processors from Intel and uses Quad Data Rate (QDR) InfiniBand to link nodes. It delivers 1.05 petaflops as well on the Linpack test.
That leaves number 10, the machine that first broke the petaflops barrier and which is the poster child for ceepie-geepies: IBM's "Roadrunner" hybrid Opteron-Cell blade super. This machine is rated at 1.04 petaflops.
Those 10 machines are the only ones to break the petaflops barrier as gauged by the Linpack test. (No one here at El Reg is claiming that Linpack is a perfect test, by the way.) There are a whole bunch of machines that come close.
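For the spreadsheet-averse, the figures quoted above for ranks two through ten can be tallied in a few lines of Python. This is only a sketch built on the rounded numbers in this article: core counts are left as None where none was given here, the tuple layout and helper name are our own, and the per-core numbers are a rough guide at best, since the core counts on the hybrid machines lump CPU and GPU cores together.

```python
# (rank, name, sustained petaflops, total cores or None if not stated here)
MACHINES = [
    (2, "Tianhe-1A", 2.56, 186_368),
    (3, "Jaguar", 1.76, 224_162),
    (4, "Nebulae", 1.27, 120_640),
    (5, "Tsubame 2.0", 1.19, 73_278),
    (6, "Cielo", 1.11, 142_272),
    (7, "Pleiades", 1.09, 111_104),
    (8, "Hopper", 1.05, 153_408),
    (9, "Bullx (CEA)", 1.05, None),
    (10, "Roadrunner", 1.04, None),
]


def gigaflops_per_core(pflops: float, cores: int) -> float:
    """Convert a petaflops rating into gigaflops per core."""
    return pflops * 1_000_000 / cores


# Combined sustained Linpack performance of ranks two through ten.
total_pflops = sum(pflops for _, _, pflops, _ in MACHINES)

for rank, name, pflops, cores in MACHINES:
    if cores is None:
        per_core = "cores not stated"
    else:
        per_core = f"{gigaflops_per_core(pflops, cores):.1f} gigaflops/core"
    print(f"#{rank:<2} {name:<12} {pflops:.2f} petaflops  {per_core}")

print(f"Ranks 2-10 combined: {total_pflops:.2f} petaflops")
```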
The University of Stuttgart has an XE6 machine using the new Opteron 6200s that rates 831 teraflops. Meanwhile, the Sunway BlueLight parallel machine, which was announced two weeks ago at the National Supercomputing Center in Jinan, China – and which uses what is rumored to be a modified Alpha processor – ranked number 14 on the list with 795 teraflops of sustained performance. And Appro International has a new Xtreme-X machine at Lawrence Livermore National Laboratory that uses Intel's Xeon E5 chips and hit 773.7 teraflops on the test with its 46,208 cores.
In fact, there are another 10 systems on the list that have more than 500 teraflops of sustained performance.