Cray bags $21m Cascade super deal down under
Xeon, Xeon Phi hybrid to do radio astronomy
Supercomputer maker Cray has bagged a $21m contract to supply a system to the Pawsey Centre for supercomputing in Perth, Australia. The machine will be used to run simulations for geology, life sciences, and nanotechnology research, as well as to support the radio-astronomy workloads that are the organization's main work.
The contract was awarded to Cray by the Commonwealth Scientific and Industrial Research Organisation (CSIRO), the Australian national science agency, which opened up the bidding process for a petaflops-scale supercomputer system last November. CSIRO was seeking to add more capacity to the current supercomputers run by iVEC, a consortium of the Australian Resources Research Centre (ARRC) in Technology Park, Kensington, as well as the University of Western Australia, Curtin University, Edith Cowan University, and Murdoch University.
iVEC wants to boost research in high-energy physics, medical imaging, mining and petroleum engineering, architecture and construction, and urban planning, as well as in the areas cited above. To do so in the modern era requires substantial computing resources, and CSIRO helps come up with the money to pay for the petaflops.
The "Epic" system already built by iVEC is an HP ProLiant blade cluster that has 9,600 cores across its 800 nodes, which are linked by QDR InfiniBand switches. The machine is installed at Murdoch University and went into production last June, and is rated at 87 teraflops.
The new Cray system chosen by CSIRO for the iVEC consortium is a double win for the company, with CSIRO picking a Cray "Cascade" system for number crunching and its Sonexion Lustre-based clustered disk arrays for data storage. The multi-year, multiphase deal has a value of $21m over its term.
iVEC began construction of the Pawsey Centre in Perth at the beginning of this year, and the facility has 2.3 megawatts of juice allocated for compute, 425 kilowatts for disk storage, and 25 kilowatts for tape backup libraries. The plan is for the facility to be ready to accept hardware by November.
iVEC's petaflops-housing Pawsey Centre data center
Cray is being commissioned by the US Defense Advanced Research Projects Agency to build the Cascade system, which uses a new generation of high-radix router interconnect code-named "Aries" that offers a substantial performance and scalability boost over the current "Gemini" XE interconnect used in Cray XE6 and XK6 supercomputers.
The Aries interconnect lashes together supercomputer nodes through PCI-Express 3.0 buses instead of the point-to-point interconnects used for SMP and NUMA clustering, such as Intel's QuickPath Interconnect or Advanced Micro Devices' HyperTransport, and is therefore, in theory at least, processor agnostic.
Current XE6 systems use HyperTransport links to reach out to the XE interconnect, and in the XK6 systems, the AMD SR56XX chipsets are used to link Nvidia Tesla GPU coprocessors to the processors, which in turn are linked through the XE interconnect.
Thus far, Cray has only confirmed that the Cascade machines will use a future Xeon processor from Intel as their main engine, and said in June that they would also support Intel's x86-based Xeon Phi coprocessors, formerly known as "Knights Corner" or the Many Integrated Core (MIC) coprocessor.
There's no reason that the Cascade machine cannot support Opteron processors and Tesla coprocessors, but thus far Cray has not confirmed this will happen. AMD is no doubt in the penalty box after a number of processor delays in the past decade that caused Cray much financial angst.
Cray is buddying up to Intel these days, especially after Intel bought the supercomputer interconnect business from Cray back in April for $140m. Cray hasn't said diddly-squat about Teslas or Opterons since that moment.
You figure it out – and it don't take no supercomputer.
DARPA gets the first Cascade system by the end of this year. Cray is also building out the "Titan" 20-petaflopper, based on AMD's Opterons and Nvidia's "Kepler" GPU coprocessors, which is going into Oak Ridge National Laboratory, and the similarly sized "Blue Waters" super, which is going into the University of Illinois by the end of the year, too.
With Cray handing over the development of the next-generation "Shasta" interconnect to Intel through the sale of the interconnect business, Cray will focus on the system design, packaging, and software stack. As far as hardware is concerned, Cray has basically become Dell or HP. Cray gets to sell machines based on Gemini and Aries interconnects between now and 2017, when the Shasta interconnect and related systems are expected to go on sale, and it is hard to imagine Cray will be given much of a lead over other vendors who will also want to peddle Shasta machinery.
For now, Western Australia will be entering the petascale era over the next year or so, and the Cascade machine will at first be used to support the data-intensive radio astronomy simulations based on streams of bits coming off the Australian Square Kilometer Array Pathfinder (ASKAP) and Murchison Widefield Array (MWA) radio telescopes.
In a statement, iVEC said that the first phase of the Pawsey machine – named after Australian engineer and radio astronomer Joseph Pawsey – would be installed in 2013 and would have 300 teraflops of oomph. It is expected that the Cascade machines will support the "Ivy Bridge" Xeon processors, but Intel and Cray have not confirmed that.
In the second phase, the Pawsey machine will be expanded to 1.2 petaflops of raw floating-point performance (double precision) and, according to iVEC, will be based on a mix of Ivy Bridge and "Haswell" Xeon processors as well as Xeon Phi coprocessors.
The total AU$33m (US$33.7m) procurement includes Nexus 7000 and 5000 switches and routers from Cisco Systems, supplied by channel partner L7, and Palo Alto Networks firewalls, delivered by channel partner O2 Networks.
There is also a system from Silicon Graphics, named "Fornax" after the constellation in the southern hemisphere, that is part of the overall CSIRO deal. Fornax is a 100-node Xeon 5600-Tesla C2075 hybrid system that was acquired by iVEC last fall; it also includes a 900 TB Lustre clustered file setup from SGI, and a two-rail InfiniBand network for server clustering and storage access. The Fornax machine was installed at the University of Western Australia last September.
The 'Fornax' SGI ceepie-geepie supercomputer
The feeds and speeds of the Cascade machine destined for the Pawsey Centre were not divulged, but it looks like it is about half the size of the Cascade-Sonexion setup that Lawrence Berkeley National Laboratory, one of the US Department of Energy's big supercomputing labs, said it was acquiring a month ago.
That machine, called NERSC-7, weighs in at two petaflops of Cascade compute and six petabytes of Sonexion storage (based on a future generation of the Sonexion arrays), and costs $40m over its multiple-year contract, including software and support. If you assume about half the contract is for compute and half for storage, and that Pawsey pays the same price per petabyte as Berkeley, then Pawsey is getting around 3.1 petabytes of Sonexion arrays.
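For anyone who wants to check the back-of-the-envelope storage estimate, here is a minimal Python sketch of the arithmetic. The dollar and petabyte figures come from the article; the 50/50 compute-storage split and the idea that Pawsey pays the same price per petabyte as Berkeley are assumptions, and the function and constant names are made up for illustration.

```python
# Figures quoted in the article (US dollars and petabytes).
NERSC7_CONTRACT_USD = 40e6
NERSC7_STORAGE_PB = 6.0
PAWSEY_CONTRACT_USD = 21e6

# Assumption: half of each contract pays for storage, half for compute.
STORAGE_SHARE = 0.5


def estimate_storage_pb(contract_usd, ref_contract_usd, ref_storage_pb,
                        storage_share=STORAGE_SHARE):
    """Scale a reference deal's storage capacity to another contract.

    Assumes both deals spend the same share on storage and pay the
    same price per petabyte, which is a guess and not a disclosed fact.
    """
    ref_usd_per_pb = ref_contract_usd * storage_share / ref_storage_pb
    return contract_usd * storage_share / ref_usd_per_pb


pawsey_storage_pb = estimate_storage_pb(
    PAWSEY_CONTRACT_USD, NERSC7_CONTRACT_USD, NERSC7_STORAGE_PB
)
print(f"Estimated Pawsey storage: {pawsey_storage_pb:.2f} PB")
```

Note that the storage share cancels out when both deals use the same split, so the estimate really only depends on the ratio of the two contract values.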
It is truly astounding how much less expensive supercomputing performance is today than it was a decade ago. The "Red Storm" supercomputer, Cray's first massively parallel machine based on Opterons and the "SeaStar" interconnect that is the granddaddy of the Aries interconnect, cost $90m and delivered 43.5 teraflops of peak performance in its initial configuration. That works out to a little over $2m per teraflops.
The Berkeley and Pawsey Cascade machines are somewhere in the neighborhood of $8,000 to $10,000 per teraflops, depending on how you want to estimate the cost of the compute power on the machines. That's a factor of over 200 improvement in bang for the buck from 2004 to 2013. ®
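The price-per-teraflops comparison can be reproduced with a few lines of Python. This is a rough sketch, not an official costing: the contract values and performance figures are the ones quoted in the article, while the assumption that half of each Cascade contract pays for compute is the same guess used for the storage estimate, and all names are illustrative.

```python
def usd_per_teraflops(compute_cost_usd, teraflops):
    """Return the cost in US dollars of one teraflops of peak performance."""
    return compute_cost_usd / teraflops


# Assumption: half of each Cascade contract is spent on compute.
COMPUTE_SHARE = 0.5

# Red Storm: $90m for 43.5 teraflops in its initial configuration.
red_storm = usd_per_teraflops(90e6, 43.5)

# NERSC-7: $40m contract, two petaflops (2,000 teraflops) of compute.
nersc7 = usd_per_teraflops(40e6 * COMPUTE_SHARE, 2000)

# Pawsey: $21m contract, 1.2 petaflops (1,200 teraflops) in phase two.
pawsey = usd_per_teraflops(21e6 * COMPUTE_SHARE, 1200)

# Improvement in bang for the buck relative to Red Storm.
improvement_vs_nersc7 = red_storm / nersc7
improvement_vs_pawsey = red_storm / pawsey

print(f"Red Storm: ${red_storm:,.0f} per teraflops")
print(f"NERSC-7:   ${nersc7:,.0f} per teraflops")
print(f"Pawsey:    ${pawsey:,.0f} per teraflops")
print(f"Improvement: {improvement_vs_nersc7:.0f}x to {improvement_vs_pawsey:.0f}x")
```

Under these assumptions the Berkeley machine lands at the top of the quoted range and Pawsey toward the bottom. Charging the full contract value to compute instead would double both Cascade figures and halve the improvement factor.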
Because Intel got there first. They've bought the Cray interconnect technology, which is pretty much all the interesting IP that Cray had left.
And Crays nowadays are pretty much "just a bunch of x86 nodes clustered" anyway.
I wonder why...
IBM doesn't acquire Cray and consolidate the supercomputer market, outside of those supercomputers which are just a bunch of x86 nodes clustered.
21mil doesn't sound like much for kit if Cray are involved.