Cray beats out SGI at German HPC consortium
Climbing to 2.6 petaflops with a "Cascade" super
The Höchstleistungsrechnen is going to have a different brand name and architecture on it at the Norddeutscher Verbund für Hoch- und Höchstleistungsrechnen (HLRN) supercomputing alliance in Northern Germany, now that Cray has beaten out SGI for a big bad box that will have a peak theoretical performance in excess of 2 petaflops.
The HLRN consortium includes Berlin, Bremen, Hamburg, Mecklenburg-Vorpommern, Niedersachsen, and Schleswig-Holstein, and now Brandenburg, which has just joined up. As 2012 came to a close, HLRN said that it was shelling out $39m to get a shiny new "Cascade" XC30 supercomputer from Cray to replace a mix of Altix Xeon clusters from Silicon Graphics that it uses to run various simulations from two data centers, one at the Zuse Institute in Berlin and the other at the High Performance Computing Center (RRZN) at Leibniz University in Hannover.
The win by Cray is no doubt a disappointment for SGI, which has had its share of troubles in Europe in the past year, but will help bolster Cray's financials in the next two years.
The main computers in use by HLRN are a pair of Altix ICE 8200EX machines with 10,240 Xeon cores running at 2.93GHz or 3GHz. The machines each have a peak theoretical performance of 120.7 teraflops and deliver 107.1 teraflops on the Linpack Fortran benchmark, giving them the ranks of numbers 274 and 275 on the Top500 supercomputer rankings from last November. If you want to be generous and consider the two Altix clusters as a single but geographically distributed system, at 241.4 teraflops the machine would rank as number 100 on the list. Not exactly a big HPC monster by modern standards.
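For anyone who wants to check the sums on the HLRN-II machines, here is a minimal sketch in Python. It uses only the peak and Linpack figures quoted above; the function names are made up for illustration.

```python
# Figures as quoted in the article for each of the two Altix ICE 8200EX machines.
PEAK_TFLOPS = 120.7      # peak theoretical performance per machine
LINPACK_TFLOPS = 107.1   # sustained Linpack performance per machine
MACHINES = 2


def linpack_efficiency(sustained, peak):
    """Fraction of theoretical peak actually delivered on Linpack."""
    return sustained / peak


def combined_peak(peak, machines):
    """Peak of the two halves treated as one geographically distributed system."""
    return peak * machines


print(f"Linpack efficiency: {linpack_efficiency(LINPACK_TFLOPS, PEAK_TFLOPS):.1%}")
print(f"Combined peak: {combined_peak(PEAK_TFLOPS, MACHINES):.1f} teraflops")
```

Each machine delivers a bit under 89 per cent of its peak on Linpack, and the pair adds up to the 241.4 teraflops cited above.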
But 2.6 petaflops is still a pretty hefty machine, even if it will not rocket to the upper stratosphere of the Top500 list, where the biggest box, Oak Ridge National Laboratory's "Titan" Cray XK7 CPU-GPU hybrid, weighs in at 27.1 peak petaflops.
According to the specifications for the HLRN-III supercomputer, the system will once again be split in two with one half in Berlin and one half in Hannover.
The HLRN supercomputing system in North Germany
During the first phase of system construction in the autumn of 2013, the initial XC30 system will go in with 1,488 dual-socket processor nodes sporting the next-generation "Ivy Bridge" Xeon E5 processors from Intel. The assumption is that the top-end Xeon E5 2600 v2 processors will sport ten cores compared to the "Sandy Bridge" v1 chips and their eight cores. So this machine should have a total of 29,760 cores in the initial stage, all linked together using the "Aries" dragonfly interconnect.
The first stage of the HLRN-III machine will have 93TB of main memory across those nodes, which works out to a fairly modest 32GB per socket. This will link into a 2.8PB Lustre file system that runs on an FDR (56Gb/sec) InfiniBand network. The Cascade box will also link to another 1PB file system running NFS over 10 Gigabit Ethernet.
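The phase-one arithmetic can be sanity-checked with a few lines of Python. This is a rough sketch, not anything from HLRN or Cray. It assumes ten cores per socket (the guess made above for the top-bin Ivy Bridge part) and binary units, with 1TB equal to 1,024GB.

```python
# Phase-one figures as quoted in the article.
NODES = 1_488
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 10   # assumption: ten-core Ivy Bridge parts, not confirmed
MEMORY_TB = 93


def total_cores(nodes, sockets_per_node, cores_per_socket):
    """Total core count across all nodes."""
    return nodes * sockets_per_node * cores_per_socket


def memory_per_socket_gb(memory_tb, nodes, sockets_per_node):
    """Main memory per socket in GB, assuming 1TB = 1,024GB."""
    return memory_tb * 1024 / (nodes * sockets_per_node)


print(total_cores(NODES, SOCKETS_PER_NODE, CORES_PER_SOCKET))    # 29760
print(memory_per_socket_gb(MEMORY_TB, NODES, SOCKETS_PER_NODE))  # 32.0
```

If the chips turn out to have a different core count, only `CORES_PER_SOCKET` needs to change. The memory-per-socket figure does not depend on it.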
This first phase will also see a 32-node Xeon E5 2600 v2 cluster with 256GB per node and a dual-rail FDR InfiniBand network, and it looks like it will be running ScaleMP's vSMP, which makes a Linux cluster look and act like a single symmetric multiprocessing system. This cluster will be used for pre- and post-processing of data coming into and out of the Cascade box.
A year later, in the fall of 2014, HLRN-III will be extended to a total of 3,552 two-socket nodes on the XC30, all chatting away over the Aries interconnect. Each node will have 64GB of main memory, and use what the Germans simply call "next generation" Xeon processors, so that probably means "Haswell" Xeons.
The machine will have 222TB of main memory in total, which works out to the same 32GB per socket, although the Haswell core count could be as high as a dozen per socket. That Lustre file system will be extended to 7.2PB and that NFS file system will stay the same at 1PB. That SMP cluster will double up to 64 nodes, with main memory doubled to 512GB on some of the nodes and kept at 256GB on others.
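The same back-of-the-envelope check works for the full phase-two machine. The sketch below again assumes binary units (1TB = 1,024GB). The dozen cores per socket is only the speculative upper bound mentioned above, so the core total it prints is hypothetical.

```python
# Phase-two figures as quoted in the article.
NODES = 3_552
SOCKETS_PER_NODE = 2
MEMORY_TB = 222
CORES_PER_SOCKET_GUESS = 12   # assumption: speculative Haswell core count


def memory_per_socket_gb(memory_tb, nodes, sockets_per_node):
    """Main memory per socket in GB, assuming 1TB = 1,024GB."""
    return memory_tb * 1024 / (nodes * sockets_per_node)


def total_cores(nodes, sockets_per_node, cores_per_socket):
    """Total core count across all nodes for a given cores-per-socket guess."""
    return nodes * sockets_per_node * cores_per_socket


print(memory_per_socket_gb(MEMORY_TB, NODES, SOCKETS_PER_NODE))        # 32.0
print(total_cores(NODES, SOCKETS_PER_NODE, CORES_PER_SOCKET_GUESS))    # 85248
```

At a dozen cores per socket, the extended machine would top out at 85,248 cores, but that number stands or falls with the core-count guess.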
The whole shebang will run Linux, of course, and specifically, the Cray Linux Environment, which is a variant of SUSE Linux Enterprise Server, on the XC30 and SLES proper on the adjunct cluster.
No word on what will happen to the two halves of the HLRN-II machine, which is obviously going to be put to use over the next year. As the state of New Mexico found out recently, it is hard to find a buyer for a supercomputer that is many years old, because the energy it consumes and the heat it dissipates make it economically unattractive. ®
Re: Split supercomputer?
The split supercomputer is especially interesting, because it makes twice as many constituencies happy. I don't think a big search for hidden performance / cost advantages would be useful.
hmmm.... a couple of those racks would sit nicely next to my Onyx, Octane, O2 and Indigo............
I find the idea of the split supercomputer interesting. Although this seems silly at first glance, I think it is actually sensible. The much lower bandwidth between the two halves* would absolutely cripple certain workloads (in a weather model, for instance, each node would traditionally end up walking through a pretty large portion of system memory as it calculated, meaning a shared memory supercomputer is a must). But the reality is that this type of system doesn't tend to run one giant workload anyway. The tendency is to have a number of jobs running on it at any given time, and a lot of jobs would not walk through system memory the way a weather model would. Either way, this lets each university have a local resource with (presumably) automatic use of excess capacity on the other university's system, which is great.
*I don't know what the speed will be, but certainly below the 56Gb/sec InfiniBand local to each half!