Top 500 supers - rise of the Linux quad-cores
Jaguar munches Roadrunner
The Jaguar system at Oak Ridge has been upgraded to the six-core Istanbuls in recent months. It now has 224,162 cores running at 2.6 GHz, and it has 1.76 petaflops of aggregate sustained number-crunching performance as gauged by the Linpack Fortran benchmark test. The Jaguar box doesn't have any fancy-schmancy co-processors, but Oak Ridge just announced in October that it has received funding to build a supercomputer that uses Nvidia's future Fermi graphics processing units and CUDA programming environment.
Oak Ridge did not specify how it would make use of the GPUs. But it is possible that they will be added to the Jaguar box in a hybrid architecture akin to the one embodied in the Roadrunner machine built by IBM for Los Alamos.
Roadrunner was reconfigured last month and began its classified computing work for the US Department of Energy. In that reconfiguration, the machine had a few nodes busted out and is now rated at just a hair over 1 petaflops. It uses a mix of 1.8 GHz dual-core Opterons and 3.2 GHz PowerXCell 8i co-processors (for a total of 122,400 cores), with InfiniBand linking the nodes and PCI-Express linking the Cell chips to the Opteron sockets. The machine is nonetheless still ranked number two on the Top 500 list, although it may not be for long.
Number three on the super ranking is the new "Kraken" XT5 system built by Cray for the University of Tennessee, which like the Jaguar machine is based on the six-core Istanbul chips running at 2.6 GHz. Kraken has 98,928 cores and is rated at 831.7 teraflops of floating point oomph.
The "Jugene" BlueGene/P parallel super installed at the Forschungszentrum Juelich in Germany comes in at number four on the Top 500 list, rated at 825.5 teraflops using 850 MHz PowerPC 450 cores. This machine came online for the June 2009 list and has not changed this year.
Perhaps the most interesting new machine in the stratosphere of the Top 500 list is the Tianhe-1 hybrid supercomputer installed at the National Supercomputer Center in Tianjin, China. Tianhe, which means "River in the Sky" in Chinese, will be used to do aircraft design and oil exploration and will be the anchor of a national supercomputing grid for the northeast region of the country.
Tianhe-1 is composed of Xeon server nodes using a mix of E5540 and E5450 processors, with each node configured with two of AMD's Radeon HD 4870 graphics cards to be used as co-processors. The machine has 71,680 cores and is rated at 563.1 teraflops of sustained performance against 1.2 petaflops of peak theoretical performance. That might be awful in terms of efficiency, but the machine is important because it puts China in the top five and it shows that you can build a powerful cluster from a mix of off-the-shelf CPUs and GPUs, even if it is inefficient.
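To put a number on that inefficiency, here is a minimal sketch in Python that divides the sustained Linpack rating by the peak theoretical rating, using only the two Tianhe-1 figures quoted above. The function name `linpack_efficiency` is made up for illustration and is not part of any Top 500 tooling. With these inputs the ratio works out to roughly 47 per cent.

```python
def linpack_efficiency(sustained_tflops: float, peak_tflops: float) -> float:
    """Return sustained Linpack performance as a fraction of theoretical peak.

    Hypothetical helper for illustration; both arguments are in teraflops.
    """
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return sustained_tflops / peak_tflops


# Figures quoted in the article for Tianhe-1.
tianhe_sustained_tflops = 563.1    # sustained Linpack rating
tianhe_peak_tflops = 1.2 * 1000    # 1.2 petaflops peak, converted to teraflops

efficiency = linpack_efficiency(tianhe_sustained_tflops, tianhe_peak_tflops)
print(f"Tianhe-1 Linpack efficiency: {efficiency:.1%}")
```

The same helper can be pointed at any other machine on the list, provided you have both its sustained and its peak rating to hand.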
Rounding out the top ten systems on the list are machines that used to be a lot closer to the top. The "Pleiades" Altix cluster built by Silicon Graphics for NASA Ames, rated at 544.3 teraflops and using quad-core Nehalem processors on its blades, is number six, followed by the 478.2 teraflops BlueGene/L massively parallel machine at Lawrence Livermore National Laboratory (the top-ranked super on the November 2007 list) at number seven. Argonne National Laboratory's 458.6 teraflops BlueGene/P is number eight, and Sun Microsystems' "Ranger" Opteron-based blade cluster at the University of Texas, rated at 433.2 teraflops, is number nine.
While Sun has been quiet about most things server related since Oracle announced its $7.4bn deal to acquire the company back in April, Sandia National Laboratories has tapped Sun to build the "Red Sky" blade cluster. This machine uses Intel's quad-core Xeon 5570 processors and Sun's x6275 blades and InfiniBand switches and is rated at 423.9 teraflops.