Japan takes the Top 500 lead with K super
The mother of all Sparc systems
Two other Cray systems made the top ten portion of the list, and they were proper XE6 boxes. The new "Cielo" super at Los Alamos National Laboratory, based on eight-core Opteron 6100 processors and using the Gemini interconnect, enters the list at number six. With 142,272 cores, it comes in at 1.11 petaflops of sustained performance at an efficiency of 81.3 per cent, which is not too shabby. Number eight on the list is another XE6 machine, called "Hopper" and using twelve-core Opteron 6100s, at the DOE's Lawrence Berkeley National Laboratory; this machine has 153,408 cores and has a sustained performance of 1.05 petaflops (81.8 per cent efficiency).
Number seven on the list is the "Pleiades" Xeon cluster using InfiniBand interconnect at NASA Ames Research Center built by Silicon Graphics. This system has 111,104 cores and delivers 1.09 petaflops of number-crunching oomph (82.7 per cent efficiency).
Number nine on the list is the Tera-100 parallel cluster built by Bull for the Commissariat a l'Energie Atomique (CEA) in France. Tera-100 is based on Intel's Xeon 7500 high-end processors and Bull's bullx supercomputer blades; it uses QDR InfiniBand to lash the nodes together and is rated at 1.05 petaflops, unchanged from the November 2010 list when it entered the Top 500 rankings.
IBM's largest machine on the June 2011 list, the "Roadrunner" hybrid Opteron-Cell blade super, was a contender for the Top 500 roost a few years back, but it has sat still since then and other machines have blown by it. (Beep, beep!) Roadrunner, which was the first machine to break the petaflops barrier and which fell behind Jaguar eighteen months ago, has a combined 122,400 cores across its Opteron and Cell processors and delivers 1.04 petaflops of performance on Linpack (at an efficiency of 75.7 per cent).
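The efficiency figures quoted for these machines are sustained Linpack performance divided by theoretical peak performance. As a rough sketch, the quoted numbers can be turned back into an implied peak rating and a sustained rate per core. The table layout and function names below are illustrative only, and the results inherit the rounding of the figures quoted above.

```python
# Sustained petaflops, efficiency (per cent), and core counts as quoted above.
# The dictionary layout and helper names are illustrative assumptions.
machines = {
    "Cielo": (1.11, 81.3, 142_272),
    "Pleiades": (1.09, 82.7, 111_104),
    "Hopper": (1.05, 81.8, 153_408),
    "Roadrunner": (1.04, 75.7, 122_400),
}


def implied_peak(sustained_pf, efficiency_pct):
    """Peak petaflops implied by efficiency = sustained / peak."""
    return sustained_pf / (efficiency_pct / 100.0)


def gigaflops_per_core(sustained_pf, cores):
    """Sustained gigaflops per core (1 petaflops = 1,000,000 gigaflops)."""
    return sustained_pf * 1_000_000 / cores


for name, (sustained, eff, cores) in machines.items():
    peak = implied_peak(sustained, eff)
    per_core = gigaflops_per_core(sustained, cores)
    print(f"{name}: ~{peak:.2f} petaflops peak, ~{per_core:.2f} gigaflops per core")
```

Because the inputs are rounded to two or three significant figures, the implied peaks are approximations rather than the official ratings.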
The age of petaflops is upon us
The June 2011 list marks the first time since the Top 500 rankings began in 1993 that all of the top ten machines were rated at a petaflops or more. And it won't be too long before 10 petaflops will be the ante to get into the upper echelons of the list. Here's how the current projections look:
Exascale: easy to build, hard to power and cool
While CPU clusters are dominating the top part of the Top 500 list at this moment, don't draw the wrong conclusion from this. There are specific cases where larger numbers of scalar or vector processors with proprietary interconnects are going to be necessary for a particular set of code. But in many cases, the cheap and low-powered flops of GPUs or other kinds of coprocessors – Intel's x64-based parallel Knights processors, FPGAs, or other gadgets – are going to be the only way a lot of organizations will be able to afford to do their supercomputing. Thus far, there are two GPU-accelerated machines on the Top 10 and a total of 19 machines using accelerators on the entire Top 500 list. Of these accelerated machines, a dozen use Nvidia GPUs, five use IBM Cells, and two use Advanced Micro Devices Radeon graphics cards.
It is early days for accelerated, hybrid supercomputing. But there is a general consensus that you can't just keep scaling up with x64, Power, or Sparc processors indefinitely without having to put in a few nuclear power plants alongside an exascale-class machine to juice it up. Optimistic vendors think we can get to exascale machines by 2018, maybe a little later, if we can solve some pretty hefty engineering problems. The problems always look insurmountable at the time, as they did when breaking the gigaflops, teraflops, and petaflops barriers. This time, though, the thermodynamics issues are truly staggering.
If you add up all the number-crunching power of the machines on the Top 500 list, you get 58.88 petaflops, which is up 34.7 per cent from the November 2010 list and up 81.7 per cent from the 32.4 aggregate petaflops on the June 2010 list. To get onto the Top 100 part of the list this time around, you needed a machine with 88.92 teraflops. The smallest machines on the list (in terms of performance) are a pair of BladeCenter blade servers at an unnamed manufacturer in China using IBM's HS22 blades and quad-core Xeons rated at 40.2 teraflops.
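The growth figures above are simple percentage changes, and the arithmetic can be checked with a short sketch. The function names are illustrative, and the November 2010 aggregate is not quoted in the text, so the value computed here is only what the 34.7 per cent figure implies.

```python
def pct_growth(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


def baseline_from_growth(new, growth_pct):
    """Earlier value implied by a later value and its percentage growth."""
    return new / (1.0 + growth_pct / 100.0)


aggregate_jun_2011 = 58.88  # petaflops, as quoted above
aggregate_jun_2010 = 32.4   # petaflops, as quoted above

# Year-on-year growth, which should agree with the 81.7 per cent quoted.
yearly = pct_growth(aggregate_jun_2010, aggregate_jun_2011)

# Aggregate implied for November 2010 (an inference, not a quoted figure).
implied_nov_2010 = baseline_from_growth(aggregate_jun_2011, 34.7)

print(f"June 2010 to June 2011: up {yearly:.1f} per cent")
print(f"Implied November 2010 aggregate: about {implied_nov_2010:.1f} petaflops")
```

On these inputs the year-on-year figure matches the quoted 81.7 per cent, and the implied November 2010 total comes out at a little under 44 petaflops.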
IBM may not have a lot of machines near the top of the list – expect to see some 10 petafloppers from Big Blue either later this year or early next year – but in terms of total computing oomph on the list, Big Blue still has a big slice of the pie:
The Top 500 aggregate flops pie (by system capacity not count)