Italian 'Eurora' supercomputer pushes the green envelope

Besting Cray and IBM in the energy efficiency game

The "Eurora" supercomputer that was just fired up in Italy may not be large, but it has taken the lead in energy efficiency over designs from big HPC vendors like Cray and IBM.

The new machine was built by Eurotech, a server maker with HPC expertise that is based in Amaro, Italy, in conjunction with graphics chip and GPU coprocessor maker Nvidia. Based on initial Linpack Fortran benchmark tests, it would be more energy efficient than either IBM's massively parallel Power-based BlueGene/Q machine or the hybrid ceepie-geepie XK7 machine from Cray.

Eurotech sells Intel Xeon and Advanced Micro Devices Opteron servers, and in the case of Eurora, the company is matching its Aurora Tigon – as in part tiger, part lion – water-cooled servers with a special variant of Nvidia's "Kepler2" Tesla K20X GPU coprocessors, which were announced last November. The machine is being installed at the Cineca supercomputer center in Bologna, which is a member of the Partnership for Advanced Computing in Europe (PRACE) effort in the European Union to push toward exascale computing.

The Aurora Tigon servers are a blade design based on Intel's Xeon E5-2600 processors. Instead of heat sinks, the processors have flat metal plates where water blocks can be attached to take heat away with water that is at tap temperature. Eurotech calls this hot water cooling, but it is really warm water cooling.

Nvidia is shipping Eurotech a special version of the Tesla K20X GPU coprocessor that likewise has a metal plate instead of a heat sink, and Eurotech has designed its own system board so it can have two processor sockets, main memory, and two GPU coprocessors all on the same thin board.

This is a similar approach to what Russian supercomputer maker T-Platforms did back in the summer of 2011 when it was building a blade server for Moscow State University, although in that case the blade had two low-voltage Xeon L5630 processors and two X2070 embedded GPU coprocessors. Water blocks go on all four computing elements to take the heat away rapidly and efficiently.

Sumit Gupta, general manager of the Tesla Accelerated Computing business unit at Nvidia, says that the custom K20X part does not have a name and not just anybody can get these parts. The trick for Eurora that has driven up efficiency, he tells El Reg, is that Cineca has figured out how to take one eight-core Xeon E5 processor on the Eurora blade and have it drive both GPU coprocessors, thus leaving the other CPU on each blade free to do other calculations. On the Linpack run done by Cineca, only one of the two CPUs on each blade was used to drive the GPUs.

This raises the question of why there are two sockets on the server at all if one CPU can drive two GPUs. It comes down to legacy software support. Even if you have new-fangled apps that can run in ceepie-geepie mode, that doesn't mean all of your applications have been ported, and you still need to run the rest in CPU-only mode.

The Eurora supercomputer built by Eurotech and Nvidia

This very modest yet highly efficient machine has a total of 64 compute nodes, each with two Xeon E5 processors and two of the custom K20X GPUs. The machine had a bottle of bubbly broken on it earlier in the week at Cineca, and has been tested running Linpack (on which the Top500 and Green500 supercomputer rankings are based) at a sustained 110 teraflops of number-crunching performance. This was accomplished while consuming 34.7 kilowatts, which yields a very impressive 3,150 megaflops per watt.

If you look at the November 2012 Green500 supercomputer rankings, you will see that a hybrid Xeon E5-Xeon Phi cluster called "Beacon" based on Cray/Appro's GreenBlade delivers 2,499 megaflops per watt and currently is the most energy-efficient supercomputer in the world.

The top-end "Titan" XK7 machine at Oak Ridge National Laboratory, which burns 8.21 megawatts and delivers 17.59 petaflops sustained performance on Linpack, yields 2,143 megaflops per watt. This is the most powerful (in terms of oomph) machine in the world.

The former champ of energy efficiency, IBM's BlueGene/Q, is coming in at 2,102 megaflops per watt on the "JuQueen" super at Forschungszentrum Juelich and a little less than that on the four times as large (and more powerful) "Sequoia" machine at Lawrence Livermore National Laboratory.

It doesn't take a supercomputer to see that 3,150 megaflops per watt is a big leap – about 47 per cent more power efficiency than the Titan machine.
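The sums behind these figures are easy to check. Here is a minimal Python sketch that uses only the rounded numbers quoted above; the function name is ours, purely for illustration. Note that the rounded Eurora inputs work out to about 3,170 megaflops per watt, a shade above the reported 3,150, presumably because the published teraflops and kilowatt figures are themselves rounded.

```python
def megaflops_per_watt(teraflops: float, kilowatts: float) -> float:
    """Convert sustained teraflops and kilowatts drawn into megaflops per watt."""
    megaflops = teraflops * 1e6   # 1 teraflops = 1,000,000 megaflops
    watts = kilowatts * 1e3       # 1 kilowatt = 1,000 watts
    return megaflops / watts


# Rounded figures as quoted in the article
eurora = megaflops_per_watt(110, 34.7)      # 110 teraflops at 34.7 kilowatts
titan = megaflops_per_watt(17590, 8210)     # 17.59 petaflops at 8.21 megawatts

# Advantage over Titan, using the reported figures of 3,150 and 2,143
reported_gain_pct = (3150 / 2143 - 1) * 100

print(f"Eurora: {eurora:.0f} MF/W, Titan: {titan:.0f} MF/W")
print(f"Reported advantage over Titan: {reported_gain_pct:.0f} per cent")
```

The Titan figure comes straight out of its quoted performance and power draw, and the 47 per cent advantage follows from the two reported efficiency ratings.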

Of course, with only 110 teraflops sustained, Eurora is not a full-blown supercomputer. This is a perfectly respectable performance for a midrange box, though, and a fully loaded Aurora Tigon rack can hold 256 Tesla K20X GPU coprocessors and 256 Xeon E5 processors for a combined 350 teraflops with one of those CPUs deactivated.

And, say Eurotech and Nvidia, you could build a 3.1 petaflops system with just nine racks. It used to take several hundred racks and several hundred million dollars (at least) to do that. Now, you can do it on the cheap. How cheap, Cineca and Eurotech are not saying.
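The nine-rack claim is a straight multiplication of the per-rack figures quoted above. A quick Python sketch, with variable names that are ours rather than Eurotech's:

```python
TERAFLOPS_PER_RACK = 350   # fully loaded Aurora Tigon rack, as quoted above
GPUS_PER_RACK = 256        # Tesla K20X coprocessors per rack, as quoted above
RACKS = 9

total_petaflops = TERAFLOPS_PER_RACK * RACKS / 1000   # 1 petaflops = 1,000 teraflops
total_gpus = GPUS_PER_RACK * RACKS

print(f"{RACKS} racks: {total_petaflops:.2f} petaflops from {total_gpus} GPUs")
```

That comes to 3.15 petaflops, which the vendors round down to 3.1.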

What they will say is that this hybrid approach with warm water cooling can cut electric bills by around 50 per cent compared to using chillers to cool the air in a standard data center, and that TCO compared to plain vanilla x86 clusters is anywhere from 30 to 50 per cent better.

This latter comparison pits an 1,800-node cluster using cold air against a similarly sized cluster using water blocks and warm water cooling; it does not take into account a shift of number crunching from CPUs to GPUs. ®
