US weather boffins fire up 'Yellowstone' 1.5 petaflopper

Don't expect the short-term forecast to improve


The US National Center for Atmospheric Research admitted a year ago that it had fallen behind in the flops race versus the weather boffinry in other countries, and shelled out tens of millions of dollars to build a new massively parallel Xeon E5-based cluster. The machine, dubbed "Yellowstone" because it is installed in a shiny new data center in Cheyenne, Wyoming, near the famous national park, was built by IBM and has been fired up this week.

IBM and NCAR announced the Yellowstone system contract back in November 2011, ahead of Intel's Xeon E5 processor launch, and the machine that NCAR actually installed has a little less oomph than expected and a slightly different configuration after some rejiggering in February of this year.

The Yellowstone machine is based on IBM's iDataPlex hybrid rack-blade system design. The machine has 4,518 of IBM's dual-socket dx360 M4 server nodes, and is using eight-core Xeon E5-2600 processors running at 2.6GHz.

These processors can do eight floating point operations per clock cycle thanks to the Advanced Vector Extensions (AVX) instructions, making the Xeon E5s very suitable for supercomputing applications. Each node is configured with 32GB of DDR3 memory running at 1.6GHz, for a total of just under 145TB of main memory across the cluster.
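For anyone who wants to check the memory sum, here is a minimal sketch using only the node count and per-node memory quoted above. The variable names are ours, and the choice between decimal and binary terabytes is an assumption, since the article does not say which it means.

```python
# Figures quoted in the article: 4,518 nodes with 32GB of DDR3 apiece.
NODES = 4_518
MEM_PER_NODE_GB = 32

total_gb = NODES * MEM_PER_NODE_GB

# Assumption: the article does not say whether "TB" is decimal or binary,
# so both conversions are shown.
total_tb_decimal = total_gb / 1_000
total_tb_binary = total_gb / 1_024

print(f"Total memory: {total_gb:,} GB")
print(f"  as decimal TB: {total_tb_decimal:.1f}")
print(f"  as binary TB:  {total_tb_binary:.1f}")
```

Either way, the total lands in the mid-140s of terabytes.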

The cluster has 72,288 cores in total, which are lashed together through an InfiniBand network from Mellanox Technologies that runs at FDR (56Gb/sec) speeds. It is set up in a full fat tree configuration in a single plane that delivers bidirectional bandwidth of 13.6GB/sec with a 2.5 microsecond latency; the peak bisection bandwidth of the network is 31.7TB/sec.

IBM and NCAR say that Yellowstone has a peak aggregate theoretical performance of 1.5 petaflops and expect the machine to deliver around 1.2 petaflops running the Linpack parallel Fortran benchmark, which would yield a computational efficiency of 81.3 per cent – not too shabby, and one reason HPC customers are upgrading their processor nodes and networks. That's just under 30 times the performance of the current water-cooled "Bluefire" Power 575 cluster built by IBM for NCAR.
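The peak figure falls straight out of the node count, clock speed and AVX throughput quoted above, as this rough sketch shows. The constant names are ours. Note one assumption: the 1.2 petaflops Linpack estimate is a rounded number, so the ratio computed here comes out at about 80 per cent rather than the quoted 81.3 per cent, which presumably rests on an unrounded estimate that the article does not give.

```python
# Configuration as quoted in the article.
NODES = 4_518
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
CLOCK_HZ = 2.6e9          # 2.6GHz
FLOPS_PER_CYCLE = 8       # per core, with AVX

cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
peak_flops = cores * CLOCK_HZ * FLOPS_PER_CYCLE
peak_pflops = peak_flops / 1e15

# Rounded Linpack expectation from the article (assumption: the exact
# figure behind the quoted 81.3 per cent is not published here).
EXPECTED_LINPACK_PFLOPS = 1.2
efficiency = EXPECTED_LINPACK_PFLOPS / peak_pflops

print(f"Cores: {cores:,}")
print(f"Theoretical peak: {peak_pflops:.3f} petaflops")
print(f"Efficiency with rounded Linpack figure: {efficiency:.1%}")
```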

The original machine was slated to have 74,592 cores across 4,662 nodes, for a total of 1.6 petaflops. The machine was supposed to be installed and running by summer, but took a bit longer. The good news for NCAR is that it took less iron to get to the performance level it was after.
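To see how much iron was trimmed, the sketch below compares the planned and installed configurations. It assumes the original plan used the same 16 cores per node (which matches 74,592 cores across 4,662 nodes) and the same 2.6GHz clock and eight flops per cycle as the delivered machine; the article does not confirm the planned clock speed, and the function name is ours.

```python
CORES_PER_NODE = 16                 # 74,592 / 4,662 and 72,288 / 4,518
GFLOPS_PER_CORE = 2.6 * 8           # assumed: 2.6GHz x 8 flops per cycle

def peak_pflops(nodes: int) -> float:
    """Theoretical peak in petaflops for a given node count."""
    return nodes * CORES_PER_NODE * GFLOPS_PER_CORE / 1e6

PLANNED_NODES = 4_662
INSTALLED_NODES = 4_518

nodes_dropped = PLANNED_NODES - INSTALLED_NODES
cores_dropped = nodes_dropped * CORES_PER_NODE

print(f"Planned:   {peak_pflops(PLANNED_NODES):.2f} petaflops")
print(f"Installed: {peak_pflops(INSTALLED_NODES):.2f} petaflops")
print(f"Dropped:   {nodes_dropped} nodes, {cores_dropped:,} cores")
```

Under those assumptions, the planned box rounds to the 1.6 petaflops in the original announcement and the installed one to 1.5 petaflops.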

The Yellowstone super built by IBM for NCAR


The Yellowstone deal also includes a new file system and storage array plus data analysis and visualization clusters. The storage system is called "Glade" and comprises 76 of IBM's DCS3700 arrays with a total of 4,560 3TB disk drives, delivering a total of 10.7PB of usable capacity and 90GB/sec of aggregate I/O bandwidth into and out of the clustered file system.

It runs IBM's General Parallel File System (GPFS), of course, which is just about the only practical alternative to the open source Lustre clustered file system for large HPC machines. The DCS3700 has 60 drives per subsystem and in the first quarter of 2014, NCAR will be adding another 30 3TB drives per cabinet, boosting capacity to 16.4PB.
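The Glade numbers can be sanity-checked with a few lines of arithmetic. This sketch treats "per cabinet" as meaning per DCS3700 array, which is our assumption. It compares raw capacity with the usable figures quoted in the article; the gap is presumably RAID parity, spares and file system overhead, though the article does not break that down.

```python
# Figures quoted in the article.
ARRAYS = 76
DRIVES_PER_ARRAY = 60
EXTRA_DRIVES_PER_ARRAY = 30   # planned 2014 expansion (assumed per array)
DRIVE_TB = 3
USABLE_PB_NOW = 10.7
USABLE_PB_2014 = 16.4

drives_now = ARRAYS * DRIVES_PER_ARRAY
drives_2014 = ARRAYS * (DRIVES_PER_ARRAY + EXTRA_DRIVES_PER_ARRAY)

raw_pb_now = drives_now * DRIVE_TB / 1_000
raw_pb_2014 = drives_2014 * DRIVE_TB / 1_000

print(f"Now:  {drives_now:,} drives, {raw_pb_now:.2f}PB raw, "
      f"{USABLE_PB_NOW / raw_pb_now:.0%} usable")
print(f"2014: {drives_2014:,} drives, {raw_pb_2014:.2f}PB raw, "
      f"{USABLE_PB_2014 / raw_pb_2014:.0%} usable")
```

The drive count matches the 4,560 quoted above, and the usable share comes out at roughly four-fifths of raw capacity both before and after the expansion.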

The data analysis cluster is called "Geyser" and it has sixteen of IBM's quad-socket System x3850 machines using ten-core Xeon E7 processors running at 2.4GHz with 1TB of main memory in each node. The visualization system also has sixteen nodes, in this case based on iDataPlex dx360 M4 nodes just like Yellowstone itself, each with 64GB of memory and two Nvidia Tesla GPUs. Presumably, these are the future K20 GPU coprocessors – NCAR did not say.

It did say that next month, it plans to add a sixteen-node system using Intel's Xeon Phi x86-based parallel coprocessors. All four systems are linked by FDR InfiniBand on that full fat tree.

There are 74 racks for compute, 20 racks for storage, three racks for data analysis and viz, and three racks for test systems.

NCAR Wyoming Supercomputing Center


Last year, NCAR told El Reg that depending on the features that it chose, the Yellowstone deal could cost somewhere between $25m and $35m to build. Neither IBM nor NCAR divulged the final price of the system on Tuesday.

They did say that the design, construction, and commissioning of the Yellowstone system, its add-ons, and the data center had a budget of approximately $70m. The National Science Foundation coughed up $57.6m of that, and the University of Wyoming is committing $20m over two decades to get 20 per cent of the capacity of the system over that term. The state of Wyoming coughed up another $20m to help with the construction of the facility.

The NCAR-Wyoming Supercomputing Center is, as El Reg guessed a year ago, using outside air cooling to keep the Yellowstone system from melting; no word on whether it is tapping geothermal power from the nearby "Old Faithful" geyser in Yellowstone National Park to power this puppy.

The data center weighs in at 153,000 square feet, but only 12,000 of that is for the 100 racks for computers. The people, power and cooling systems eat up the rest of the space. The NCAR-Wyoming data center can deliver somewhere between 4 and 5 megawatts of juice, and with all of the systems humming and the few dozen staff on site, the center burns between 1.8 and 2.1 megawatts. ®
