Feeds

Cray mimics Ethernet atop SeaStar interconnect

Linux shortcut cooks with SLES

7 Elements of Radically Simple OS Migration

Preview of things to come

Bolding warns that at the moment this is a "feature release," which means the Cluster Compatibility Mode is really a technology preview. He adds that the clone TCP/IP stack riding atop the SeaStar interconnect "can provide reasonable performance for a relatively small number of nodes," but cautions that on very large XT implementations, customers are going to want to fall back on what Cray is now calling Extreme Scalability Mode - recompiling the Linux applications to have their nodes talk directly through the SeaStar interconnect. CCM can scale to 2,048 cores on the TCP/IP stack now, which means somewhere between 85 and 170 nodes, depending on the Opteron processors customers choose.

Next year, Cluster Compatibility Mode will get a whole lot more interesting, when Cray supports the OpenFabrics Enterprise Distribution (OFED) drivers for InfiniBand much as it is doing for TCP/IP drivers today. One of the key features of InfiniBand that Ethernet still does not have (but soon will) is called Remote Direct Memory Access, which allows server nodes to talk to each other directly, using InfiniBand controllers to link memory controllers, bypassing the network stack entirely and offering much lower latency than even 10 Gigabit Ethernet. In essence, with support of the OFED drivers, Cray's Cluster Compatibility Mode will allow the SeaStar interconnect to emulate InfiniBand and yield much better performance than the emulated TCP/IP stack being offered initially with CLE 3.0.

"Once we have the OFED drivers, we think we can come very close to our native communications speed," says Bolding. No word yet on how far the emulated InfiniBand will scale in terms of processor nodes, but it has to be pretty far to bother to go to the trouble.

Cray has been working on Cluster Compatibility Mode for the past two years, and Bolding admits that this clever network emulation would have been useful for Cray to expand its addressable market. But at the time, Cray was more concerned with breaking the petaflops barrier at the big supercomputing centers like Oak Ridge National Laboratory that are paying the current bills.

Cray has high hopes for Cluster Compatibility Mode. "We think this will take away the fear of getting a Cray system," explains Bolding. "We have removed cost as a concern over the past few years, and when we did, some customers feared that they would end up getting something that was not compatible with other Linux machines."

Well, of course, the customers were right in this regard. But if the OFED drivers running atop SeaStar and emulating InfiniBand work as well as Bolding says they can, this would indeed be another barrier down. Provided the SeaStar interconnect has enough oomph that emulated InfiniBand performs as well or better than the real thing, of course. By the way, the emulated Ethernet and InfiniBand drivers will support multiple MPI stacks, so you are not locked in.

CLE 3.0 will initially only ship on the new XT6 and XT6m parallel supers, which use blade servers based on the brand-new twelve-core "Magny-Cours" Opteron 6100 processors from Advanced Micro Devices. The XT6 nodes were previewed last fall at the SC09 supercomputing trade show; Cray has not said that the XT6 nodes are actually shipping yet in volume.

Later this year, Cray will support CLE 3.0 on the XT5 supers, which are based on an earlier six-core Opteron generation but which are based on the same SeaStar2+ generation of interconnect that was held over for the XT6 nodes. In early 2011, Cray will support CLE 3.0 on XT4 generations of supers, but has no plans to support it on XT3 machines. It is a matter of testing and qualification, which Cray is not going to spend money on with so few of these XT3 machines still in the field. If you want to run an emulated Ethernet-MPI stack on top of an XT3 machine, you have to move up to an XT4 or higher.

Presumably the combination of the upcoming "Gemini" interconnect and the XT6 nodes, which comprise the "Baker" family of Opteron supers machines slated for later this year, will have some sort of hardware assistance for helping speed up the emulated Ethernet or InfiniBand that the Cluster Compatibility Mode offers inside CLE 3.0. Bolding did not say.

CLE 3.0 has a number of other enhancements. First, it includes Oracle's open source Lustre 1.8 clustered file system, and also supports IBM's Global Parallel File System (GFPS) and Panasas clustered file systems. GPFS and Panasas are new; the Cray XTs have been running Lustre since their inception. CLE 3.0 is also designed to scale across 500,000 cores in a parallel cluster, up from a 200,000-core ceiling with CLE 2.0. CLE 3.0 also includes a diagnostic tool called NodeKARE, short for Node Knowledge and Reconfiguration, which makes sure jobs are scheduled to run only on nodes that are behaving themselves and not acting all wobbly.

What Cray has not said is whether or not it will be offering a Cluster Compatibility Mode in conjunction with Microsoft for its XT line of supers. This would be clearly very useful. Although Cray supports Windows HPC Server 2008 on its baby and midrange lineup, this Windows variant is not supported on the massively scalable XT line. But over the long haul, that will have to be a goal for the company, since the point of having an entry and midrange super line is to get he customers and grow them up to full-scale, massively parallel machines as their workloads expand. ®

Best practices for enterprise data

More from The Register

next story
Microsoft's Euro cloud darkens: US FEDS can dig into foreign servers
They're not emails, they're business records, says court
Sysadmin Day 2014: Quick, there's still time to get the beers in
He walked over the broken glass, killed the thugs... and er... reconnected the cables*
VMware builds product executables on 50 Mac Minis
And goes to the Genius Bar for support
Multipath TCP speeds up the internet so much that security breaks
Black Hat research says proposed protocol will bork network probes, flummox firewalls
Auntie remains MYSTIFIED by that weekend BBC iPlayer and website outage
Still doing 'forensics' on the caching layer – Beeb digi wonk
Microsoft says 'weird things' can happen during Windows Server 2003 migrations
Fix coming for bug that makes Kerberos croak when you run two domain controllers
Cisco says network virtualisation won't pay off everywhere
Another sign of strain in the Borg/VMware relationship?
prev story

Whitepapers

7 Elements of Radically Simple OS Migration
Avoid the typical headaches of OS migration during your next project by learning about 7 elements of radically simple OS migration.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
Solving today's distributed Big Data backup challenges
Enable IT efficiency and allow a firm to access and reuse corporate information for competitive advantage, ultimately changing business outcomes.
A new approach to endpoint data protection
What is the best way to ensure comprehensive visibility, management, and control of information on both company-owned and employee-owned devices?