Cray descends to midrange HPC shops with baby XC30 supers
An air-cooled chip off the relatively new 'Cascade' block
Supercomputer maker Cray had been hinting that it would deliver a new cut-down version of its "Cascade" XC30 system, and the machine is being unveiled on Tuesday at the Cray User Group meeting in Napa Valley, California.
The XC30-AC machines go into more standard cabinets like those used with the XE5 and XE6 predecessors to the Cascade boxes, but are based on the same Intel Xeon E5 processors and "Aries" Dragonfly interconnect as the larger XC30-LC machines that were announced last November. The LC is short for liquid-cooled, and you will recall that the big bad Cascade boxes had liquid cooling in the racks and an interesting "transverse cooling" system.
The Cascade machine and the Aries interconnect were developed in conjunction with the US Defense Advanced Research Projects Agency, which footed the entire bill for their development. The interconnect is currently owned by chip maker Intel, which acquired it in April 2012 for $140m. The DARPA funds were booked as an offset against research and development, not revenue, which was pretty sweet for Cray's books.
Cray got $43.1m in phase one of the Cascade project in 2003, and then got a contract for another $250m for phase two in 2006 to design the Cascade system (which was meant to mix and match x86, FPGA, vector, and ThreadStorm processors), the Aries Dragonfly interconnect, and the Chapel programming environment for the machine. In January 2010, DARPA cut back on the Cascade project, ditching the custom processor that was originally part of Cascade (but which no one talked about at the time) and chopping $60m from the contract.
Still, think of how sweet this is. You get a deal from Uncle Sam for a combined $233.1m (that's $43.1m plus $250m, less the $60m that was chopped) to develop two generations of machines and interconnects, and then you sell the technology off for another $140m to Intel and you get to build machines for maybe six or seven years using this technology, raking in on the order of maybe $3.5bn in revenues on that technology.
America, what a country! This is precisely how the phone company charges us for the internet that our tax dollars paid to create in the first place. It's good work if you can get it.
The changes in phase two of the DARPA contract also created a hybrid interconnect half-way between the SeaStar+ used in the XT systems and the Aries interconnect used in the XC systems, which was launched as the "Gemini" XE interconnect used in the XE5, XE6, and XK7 machines.
The SeaStar and Gemini interconnects plugged into the HyperTransport ports of the Opteron processors from AMD, while the Aries interconnect plugs into PCI-Express 3.0 ports and is therefore restricted to Xeon E5 processors, which support the most current PCI bus; Opterons are still stuck at PCI-Express 2.0. (Whoops. That was not very forward thinking, was it?)
DARPA wanted PCI-Express as the means to link processors and coprocessors to the interconnect because it is more generic than HyperTransport, and you can bet that this is precisely what Cray wanted after tying itself too tightly to the Opteron chip and having its revenue stream messed up more than once by Opteron delays.
The XC30-AC is less dense, air-cooled, and less expensive than its liquid-cooled XC30-LC older sibling
The new XC30-AC machines are aimed at the same kind of midrange HPC shops who need a modest super rather than a big bad box. As Barry Bolding, vice president of marketing at Cray, explains it to El Reg, not every organization can build and operate a supercomputing center that costs $100m over its five-year life. But similarly, not every application runs best on a generic x86 cluster using Ethernet or InfiniBand interconnects, no matter how much Cray wants to peddle its CS300 (formerly Appro XtremeX blade supers).
Bolding says that for jobs that are running on 512 cores and above, the XC30 machine, whether you are talking about the midrange or high-end box, really shines. "If the machine is doing a ton of 500-core or smaller jobs, then InfiniBand is better," Bolding says. "If you are running a bunch of 500-core jobs and then you need to run a 5,000-core job every now and then, the Aries interconnect is better."
With the XC30-AC, you don't need to deal with liquid coolants or any data center infrastructure relating to liquid cooling, and you don't need a raised-floor data center environment, either. The Cascade compute blades are oriented vertically in the chassis and racks, and up to sixteen compute blades fit into the rack. The big Cascade machine puts the blades in horizontally and packs them in roughly twice as densely. The XC30-LC cabinets are 50 per cent larger, so you can get three times as much number-crunching power into the LC cabinets (66 teraflops) compared to the AC cabinets (at 22 teraflops).
The AC machines only use the in-chassis backplane and the electrical cables of the Aries Dragonfly interconnect and only scale up to eight cabinets and 1,024 sockets. The full Aries router is put into these machines, and you could, in theory, link up the optical interconnects that the larger machine makes use of to scale up to 482 cabinets in a single system with just over 185,000 sockets.
But in practice, Bolding says that an upgrade between the AC and LC machines is not available, and if you want to grow beyond 1,024 sockets and about 176 teraflops, you would buy the new cabinets and move the blades and Aries chips over to the new racks and build up an LC machine.
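For anyone who wants to check the cabinet math, here is a rough Python sketch that uses only the figures quoted in this story: 22 teraflops per AC cabinet, 66 teraflops per LC cabinet, 1,024 sockets across eight AC cabinets, and 482 LC cabinets at full scale. The per-cabinet socket counts are derived by simple division and multiplication, so treat them as assumptions rather than Cray specifications. The function and variable names are made up for illustration.

```python
# Figures quoted in the article; everything else below is derived from them.
AC_TFLOPS_PER_CABINET = 22
LC_TFLOPS_PER_CABINET = 66
AC_MAX_CABINETS = 8
AC_MAX_SOCKETS = 1024
LC_MAX_CABINETS = 482


def ac_peak_tflops(cabinets: int) -> int:
    """Peak teraflops for an air-cooled system with the given cabinet count."""
    if not 1 <= cabinets <= AC_MAX_CABINETS:
        raise ValueError("XC30-AC scales from one to eight cabinets")
    return cabinets * AC_TFLOPS_PER_CABINET


# Assumption: sockets are spread evenly across the AC cabinets.
ac_sockets_per_cabinet = AC_MAX_SOCKETS // AC_MAX_CABINETS

# Assumption: an LC cabinet holds as many more sockets as it has more
# teraflops (66 / 22 = 3 times the AC cabinet).
density_ratio = LC_TFLOPS_PER_CABINET // AC_TFLOPS_PER_CABINET
lc_sockets_per_cabinet = ac_sockets_per_cabinet * density_ratio
lc_max_sockets = lc_sockets_per_cabinet * LC_MAX_CABINETS

print(ac_peak_tflops(AC_MAX_CABINETS))  # 176
print(ac_sockets_per_cabinet)           # 128
print(lc_max_sockets)                   # 185088
```

Under those assumptions the numbers line up with the ones quoted here: eight AC cabinets top out at 176 teraflops, and 482 LC cabinets come to just over 185,000 sockets.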
The bottom of the XC30-AC rack has a single fan that sucks cold air off the floor of the data center and blows it up through the blades, which heat it up; the hot air exhausts out of the top of the rack to be rechilled. With the full-on Cascade, Cray came up with a transverse cooling system that puts water chillers in the racks and blows chilled air from one end of the row to the other, and the exiting air at the end is at the same temperature as the intake air at the other end.
This transverse cooling ensures that all blades (and therefore their processors) are running at the same temperature, which is important for a data center with hundreds of racks and a need to not create hot and cold aisles. But it is way overkill for a midrange customer that might have from one to eight racks to run much more modest workloads.
Another big difference between the midrange and big-iron versions of the Cascade machines is that the air-cooled version has 208-volt or 480-volt options on rack power, while the big box only supports 480-volt juice.
With the prior XE5m and XE6m midrange supers, Cray backstepped from a 3D torus to a 2D torus topology using the Gemini interconnect, and there was a definite performance hit from doing so. But with the Dragonfly approach, all processors are linked to all other processors (not directly of course, but with no more than five hops between any two processors) and there is no performance hit moving down to an AC model compared to an LC model. The beauty is that you use the same Cray Linux Environment on both machines and the same compilers and math libraries, too.
The XC30-AC machines are available now and cost from $500,000 to $3m, which works out to around $22,000 per teraflops for a single-racker and around $17,000 per teraflops for an eight-racker.
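The price-per-teraflops figures can be reproduced with a short Python sketch. It assumes that the $500,000 entry price buys a single 22-teraflops cabinet and that the $3m top price buys a full eight-cabinet system; the story implies that mapping but Cray has not published a price list here. The helper name is hypothetical.

```python
def dollars_per_teraflops(price_usd: float, cabinets: int,
                          tflops_per_cabinet: float = 22.0) -> float:
    """Price divided by peak teraflops for an air-cooled configuration."""
    return price_usd / (cabinets * tflops_per_cabinet)


# Assumption: the low price is one cabinet, the high price is eight cabinets.
single_rack = dollars_per_teraflops(500_000, 1)
eight_rack = dollars_per_teraflops(3_000_000, 8)

print(f"${single_rack:,.0f}")  # $22,727
print(f"${eight_rack:,.0f}")   # $17,045
```

That is a discount of about a quarter per teraflops for going all the way up to eight racks.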
One interesting note about processor choices. For very large supercomputing centers, having Opterons for the past decade was fine, excepting the occasional delay or bug. But when trying to push down into smaller organizations, Bolding says many have a "buy Intel" philosophy.
Because of this, one big change with the XC30-AC over the XE5m and XE6m midrange supers is the switch from AMD to Intel processors, which will automatically increase the total addressable market for the machines by a factor of four or so. The XE5m and XE6m machines were a "moderate success," according to Bolding, and if Cray does two or three times the sales of these boxes with the XC30-AC machines, this will constitute a "whopping success." ®