Finland beefs up HPC oomph with Cray 'Cascade' super

Stuffs old paper warehouse with gobs o' flops and bushels o' bytes

Finland's main academic supercomputing center, the IT Center for Science (CSC), has been embiggening its number-crunching and data storage capacity throughout 2012, and is at it again this week with the acquisition of a future "Cascade" supercomputer from Cray.

CSC is managed by the Finnish Ministry of Education and Culture, and supports a mix of academic, research, and corporate supercomputing across the country.

The CSC runs the largest supercomputer in Finland, the "Louhi" system, a mix of Cray XT4 and XT5 nodes using quad-core Opteron processors and the "SeaStar" XT interconnect that predates the "Gemini" XE router that is used to lash CPUs and GPUs together in the current XE6 and XK6 machines.

This machine came online in the spring of 2007 and was upgraded two years later to its current configuration, which has 10,864 cores and delivers a sustained performance of 76.5 teraflops against a peak theoretical performance of 102 teraflops.
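
For the curious, those two figures work out to a sustained-to-peak ratio of exactly 75 per cent, which is respectable Linpack efficiency for a SeaStar-era machine. A quick sketch, using only the figures reported above:

```python
# Louhi's Linpack efficiency: sustained flops over peak theoretical flops.
# Figures as reported by CSC: 76.5 TF sustained, 102 TF peak, 10,864 cores.
sustained_tf = 76.5
peak_tf = 102.0
cores = 10_864

efficiency = sustained_tf / peak_tf           # 0.75, i.e. 75 per cent
gflops_per_core = peak_tf * 1000 / cores      # roughly 9.4 GF/core at peak

print(f"Linpack efficiency: {efficiency:.0%}")
print(f"Peak per core: {gflops_per_core:.1f} gigaflops")
```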

CSC has used a mix of different platforms over the years, including vector and scalar machines from Cray and Power-based clusters from IBM. In the past decade, the Finnish center has also built clusters out of HP Opteron-based blade servers and InfiniBand interconnects. The most powerful recent machine of this type was the "Vouri" cluster, with 3,264 Opteron cores and DDR InfiniBand switching, installed in the summer of 2010; this cluster delivered 22.8 teraflops on the Linpack test.

The Louhi and Vouri machines will be replaced with new systems as CSC moves into a new modular data center located in an old paper warehouse in Kajaani later this year. The HP system will be replaced first this year, and then the Cascade machine comes in next year as Cray rolls out this new design commercially.

The five-year deal with HP is worth €4.5m and is based on the SL6500 dense-packed modular server design. CSC has picked the ProLiant SL230s Gen8 server nodes, which have two Intel Xeon E5-2600 processors per node and can be packed eight nodes to a 4U chassis. (HP has not yet formally announced these machines, oddly enough.)

The HP cluster will have 576 compute nodes with a total of 9,216 cores using eight-core Xeon E5-2600s, and a total of 40.5TB of main memory across those nodes.

The machines will be linked to each other through a 56Gb/sec (FDR) InfiniBand fabric from Mellanox Technologies, and CSC anticipates that it will have around 190 teraflops of peak theoretical floppage – about six times that of the current Vouri machine. And that's without resorting to using GPU coprocessors. This is, by the way, only the first phase of the new HP systems' build out.
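
That 190 teraflops figure squares with simple peak-flops arithmetic for eight-core Xeon E5-2600s, assuming 8 double-precision flops per clock per core (AVX) and a clock in the neighborhood of 2.6GHz – the clock speed and flops-per-clock figures are our assumptions, not CSC's:

```python
# Back-of-envelope peak flops for the HP cluster described above.
# Node and core counts are from CSC; the flops-per-clock and clock
# speed are assumptions about the Xeon E5-2600 parts used.
nodes = 576
cores_per_node = 16          # two eight-core Xeon E5-2600s per node
flops_per_clock = 8          # double-precision AVX, Sandy Bridge era
clock_hz = 2.6e9             # assumed clock speed

cores = nodes * cores_per_node                      # 9,216 cores total
peak_tf = cores * flops_per_clock * clock_hz / 1e12

print(f"{cores} cores, roughly {peak_tf:.0f} teraflops peak")
```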

Speaking of coprocessors, CSC's Kajaani data center will also be home to an experimental hybrid machine made up of Intel Xeon processors and Xeon Phi coprocessors (formerly known as MIC or "Knights Corner"), as well as Tesla GPU coprocessors from Nvidia.

This machine, built by Russian supercomputer maker T-Platforms, will be owned by the Swiss National Supercomputing Centre (CSCS) and the Amsterdam Foundation for Academic Computing, and is based on the company's "T-Rex" hybrid supercomputer. It will be installed in stages, starting with the current V5000 blade servers, which have room for ten two-socket x86 server nodes in a 5U chassis; this machine will go into the CSC data center in the third quarter of this year.

The hybrid T-Platforms machine will eventually consist of 256 T-Rex nodes with a mix of Tesla and Xeon Phi coprocessors; not much is known of the T-Rex design at this point, but the company says that the machine will use hot-water cooling in the racks, will be based on Intel processors as well as the Tesla and Xeon Phi coprocessors, and will have an aggregate peak performance of 400 teraflops. This machine is being funded by the Partnership for Advanced Computing in Europe (PRACE).

Finland CSC's Kajaani modular data center under construction

The Finns will shell out €10m for their Cascade super, which is based on Intel's Xeon processors (unlike the AMD Opterons in the current XE6 and XK6 supers) and the "Aries" high-speed router interconnect. The Gemini interconnect was a geared-down version of Aries that the US nuke labs asked Cray to make because they didn't want to wait for Aries to come along in 2013.

Not much else is known about Aries, except of course that Intel now owns it. CSC is not providing the feeds and speeds of the Cascade machine it is acquiring from Cray, but does say that the deal covers both products and services, and that the vast majority of the system will be supplied in 2014. An initial installation begins in late 2012, when the US Defense Advanced Research Projects Agency, which has footed the development bill for Cascade over the past couple of years, gets the first machine off the Cascade production line.

Presumably, the Cascade machine at CSC will be based on future "Ivy Bridge" Xeon E5 processors from Intel, and it could even have coprocessors from Intel and/or Nvidia, as well.

Cray has said that it will support the Xeon Phi coprocessors – baby x86 engines that operate in parallel, stuffed onto what looks like a graphics card (because that is basically what it was supposed to be) – but has not confirmed that it will support the future "Kepler" and "Maxwell" Tesla GPU coprocessors. Of course Cray will, since it needs to give HPC shops an upgrade path from the XK6, which marries Opterons to Tesla M2090 GPUs.

Cray is not being tapped to build the storage behind these supers. Back in March, the Finnish super center inked a €2.5m contract with systems maker Fujitsu, which plays heavily in Europe as well as in Japan. In this case, CSC is working with Fujitsu to actually get an undisclosed amount of storage made by sometimes-rival, sometimes-partner Hitachi Data Systems.

CSC also inked a €2.5m contract with DataDirect Networks to build and support an SFA 10K clustered storage array, a multi-petabyte tape library, and data migration software to move data from disk to tape and back again if necessary as jobs run.

El Reg contacted Cray for some more precise configuration details on the Cascade machine going into CSC, and the company spokesperson said no more details were being divulged.

Assuming, however, that half of the cost of the $40m Cascade machine going into the US Department of Energy's Berkeley Lab, announced two weeks ago, was for storage and half for compute – and that it included multiple years of support for both the 2 petaflops Cascade machine and the 6 petabytes of Sonexion storage – then the CSC Cascade machine, which is all compute, should weigh in at around 1 petaflops of peak performance.
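
That 1 petaflops guess falls out of simple price scaling, with a euro-to-dollar exchange rate of about 1.25 thrown in – the rate is our assumption for illustration, as is the 50/50 compute-storage split:

```python
# Scale the Berkeley Lab Cascade deal down to CSC's budget.
# The 50/50 compute-storage split and the exchange rate are
# assumptions made for this back-of-envelope estimate.
berkeley_total_usd = 40e6
berkeley_compute_usd = berkeley_total_usd / 2      # assume half was compute
berkeley_peak_pf = 2.0                             # 2 petaflops Cascade

usd_per_pf = berkeley_compute_usd / berkeley_peak_pf   # $10m per petaflops

csc_budget_eur = 10e6
eur_to_usd = 1.25                                  # assumed mid-2012 rate
csc_budget_usd = csc_budget_eur * eur_to_usd

csc_peak_pf = csc_budget_usd / usd_per_pf
print(f"Estimated CSC Cascade peak: roughly {csc_peak_pf:.1f} petaflops")
```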

The other interesting bit about the Kajaani facility being built by CSC is that it has contracted with Silicon Graphics not for its servers, but for its ICE Cube Air containerized data centers. The ICE Cube Air can deliver a power usage effectiveness (PUE) of 1.08 or less, which is considerably lower than the 1.2 PUE that CSC is hoping to achieve at the Kajaani site.
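
PUE is simply total facility power divided by the power delivered to the IT gear, so a drop from 1.2 to 1.08 is real money at supercomputer scale. A quick sketch, using a hypothetical 1MW IT load of our own choosing:

```python
# Power usage effectiveness: total facility power / IT equipment power.
# The 1MW IT load is a hypothetical figure for illustration only.
it_load_kw = 1000.0

facility_at_target_kw = it_load_kw * 1.20    # CSC's 1.2 PUE goal
facility_at_icecube_kw = it_load_kw * 1.08   # SGI ICE Cube Air claim

overhead_saved_kw = facility_at_target_kw - facility_at_icecube_kw
print(f"Overhead saved: {overhead_saved_kw:.0f} kW per MW of IT load")
```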

As you might expect, the Kajaani site was chosen because it is cold for much of the year, but the problem in this case is that it can hit -45 degrees Celsius, and that is a bit too cold for computers. SGI had to create modified ICE Cube Air containers to house the servers and storage; CSC inked a €2.6m contract with SGI for these modified containers three weeks ago. ®
