Cray kicks out midrange XE6m super
Gonna wake up in a Gemini dream
SC10 At the SC10 supercomputing extravaganza in New Orleans on Monday, Cray rolled out its next iteration of midrange boxes, the XE6m.
Not everyone needs a petaflops supercomputer, or more precisely, not everyone can afford one. And so, as in prior midrange machines, the Opteron-based server nodes in Cray's blades are designed to scale only so far and therefore offer their flops at a cheaper price.
For many workloads, midrange supers using proprietary interconnects are a better fit than commodity x64 servers clustered with either InfiniBand or Ethernet networking. That's the thinking behind Cray's XT5m, XT6m, and now XE6m midrange supers.
The XE6m supers employ the same basic technology as their larger siblings, the XE6 supers that were launched gradually over the past year. The XE6 machines, formerly known by the code name "Baker", are composed of four-node blades based on two-socket Opteron 6100 servers.
The XE6 blades (which are identical to the XT6 blades) support either 32GB or 64GB of main memory per server node. The back-end of the XT6 blade supports a replaceable interconnect unit that can take either a SeaStar2+ module, as used in the XT6 machines, or the new "Gemini" interconnect module that was the last-delivered component of the XE6 system.
The Gemini interconnect has one third or less of the latency of the SeaStar2+ interconnect: it takes a little more than one microsecond to jump between compute nodes hooked to different Gemini chips, and less than one microsecond to jump between any of the four processors talking to the same Gemini interconnect inside the Baker blade module.
The Gemini interconnect includes a high-radix router and can deliver about 100 times the message throughput of the SeaStar2+ interconnect, something on the order of 2 million packets per core per second, according to Cray. And that extra throughput means the full-blown XE6 systems can scale a lot further than the XT6 machines with their SeaStar2+ interconnect. The XE6 machines use a 3D torus interconnect scheme and scale from 100 teraflops up to multiple sustained petaflops; an entry machine costs around $2m.
Cray's XE6m midrange supers: a 2D chip off the new 3D block
Like prior m-class supers from Cray, the XE6m midrange box is based on a much less scalable 2D interconnect scheme that tops out right about where the full-blown XE6 machine starts. The reason to buy the XE6m rather than the XT6m, which uses the same twelve-core Opteron 6100 processors in the blades, is therefore not to get more scalability but to get lower latencies between the nodes inside the blades and across the blades than is possible with the SeaStar2+ interconnect used in the XT6m.
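To see why a 2D scheme runs out of road sooner than a 3D one, here is a rough Python sketch comparing the worst-case hop count (the diameter) of wraparound torus networks. The grid dimensions are hypothetical round numbers picked for illustration, not actual Cray configurations, and the sketch ignores everything about Gemini except the topology.

```python
def torus_diameter(dims):
    """Worst-case hop count between two routers in a wraparound torus.

    Along a ring of d routers the farthest point is d // 2 hops away,
    and the dimensions of a torus add up independently.
    """
    return sum(d // 2 for d in dims)


if __name__ == "__main__":
    # Hypothetical layouts with the same router count (4,096 each).
    for dims in [(64, 64), (16, 16, 16)]:
        routers = 1
        for d in dims:
            routers *= d
        print(f"{dims}: {routers} routers, diameter {torus_diameter(dims)} hops")
```

For the same number of routers, the 2D layout in this example has a worst-case path of 64 hops against 24 for the 3D layout, and the gap widens as the machine grows.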
The XE6m is designed to scale from 700 to 13,000 cores; each XE6 and XE6m blade holds 96 cores and each rack holds 2,304 cores. Typical configurations range from about 10 teraflops to over 100 teraflops in performance with prices that scale from $500,000 to $3m.
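The core counts above can be cross-checked with a little arithmetic. The Python sketch below uses only figures quoted in this article (four nodes per blade, two sockets per node, twelve cores per socket, 2,304 cores per rack). Pairing the low price with the low teraflops figure and the high price with the high one is an assumption made here, since Cray only quotes ranges.

```python
import math

# Figures quoted in the article.
NODES_PER_BLADE = 4
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 12      # twelve-core Opteron 6100
CORES_PER_RACK = 2304

CORES_PER_BLADE = NODES_PER_BLADE * SOCKETS_PER_NODE * CORES_PER_SOCKET
BLADES_PER_RACK = CORES_PER_RACK // CORES_PER_BLADE


def blades_needed(cores):
    """Whole blades required to supply at least `cores` cores."""
    return math.ceil(cores / CORES_PER_BLADE)


def racks_needed(cores):
    """Whole racks required to supply at least `cores` cores."""
    return math.ceil(cores / CORES_PER_RACK)


def dollars_per_teraflop(price_usd, teraflops):
    """Simple ratio; assumes the price and performance figures go together."""
    return price_usd / teraflops


if __name__ == "__main__":
    print("cores per blade:", CORES_PER_BLADE)
    print("blades per rack:", BLADES_PER_RACK)
    print("blades for 700 cores:", blades_needed(700))
    print("racks for 13,000 cores:", racks_needed(13000))
    print("entry $/teraflop:", dollars_per_teraflop(500_000, 10))
    print("top-end $/teraflop:", dollars_per_teraflop(3_000_000, 100))
```

On these numbers a blade does indeed hold 96 cores, a rack holds 24 blades, and the XE6m tops out at six racks.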
The XE6 machine can scale to well over 1 million cores using the current generation of Opteron processors and has a theoretical limit that is closer to 3 million cores, assuming AMD gets its sixteen-core "Interlagos" Opteron 6200 chips out the door next year. Such a machine would require about 1,000 server racks and would kiss the 10 petaflops performance level.
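The 3 million figure squares with the rack count, as this rough Python sketch shows. It assumes the blade and rack layout stays as it is today (24 blades per rack, four two-socket nodes per blade) and that the sixteen-core chips drop into the same sockets. Both are assumptions made for this sketch, not something Cray has confirmed here.

```python
# Layout figures from the article; assumed unchanged for the Interlagos case.
RACKS = 1000
BLADES_PER_RACK = 24       # 2,304 cores per rack / 96 cores per blade
NODES_PER_BLADE = 4
SOCKETS_PER_NODE = 2


def total_cores(cores_per_socket, racks=RACKS):
    """Total cores in a machine of `racks` racks at a given socket core count."""
    sockets = racks * BLADES_PER_RACK * NODES_PER_BLADE * SOCKETS_PER_NODE
    return sockets * cores_per_socket


if __name__ == "__main__":
    print("12-core Opteron 6100:", total_cores(12))
    print("16-core Interlagos:  ", total_cores(16))
```

With twelve-core chips, 1,000 racks come to about 2.3 million cores; with sixteen-core chips the same racks hold just over 3 million.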
Both the XE6 and XE6m Cray boxes run the Cray Linux Environment 3.0 operating system, which is a modified version of Novell's SUSE Linux Enterprise Server 11 that is tweaked to run on the Opteron blades and their SeaStar and Gemini interconnects.
This Linux has two modes. Extreme Scalability Mode runs the Cray interconnect natively and offers the best performance. The clever Cluster Compatibility Mode, which launched with CLE 3.0, emulates an Ethernet link over the Gemini interconnect, which means customers can pay a little performance penalty and not have to recompile their code to run specifically on Cray iron. ®