
Sun buffs InfiniBand for Constellation supers

Over two petaflops sold

And on the left coast...

The University of California at San Diego is also using 32-core X4600 machines as the basis of a cluster that has 512 GB of main memory per node, something Sun can't do on its Xeon or Opteron blades. There are some Sparc-based clusters here and there too, particularly in financial services, which are used to run economic simulations as part of trading systems.

The official name of the Project M2 QDR switch is the Datacenter InfiniBand Switch 648, and it has a starting list price of $70,495. The switch fits in an 11U rack chassis and uses 12x consolidation cables to plug into the 4x InfiniBand ports so you only need 216 cables. The chassis can be equipped with up to nine 72-port line cards and up to nine vertical fabric card slots, for a total of 41 Tb/sec of aggregate bi-directional bandwidth.
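The port, cable, and bandwidth figures for the 648-port switch can be checked with a little arithmetic. The sketch below uses the numbers quoted above; the per-port data rate is an assumption not stated in the article (QDR 4x InfiniBand signals at 40 Gb/sec, and 8b/10b encoding leaves 32 Gb/sec of usable data per direction), and the function names are purely illustrative.

```python
# Figures quoted in the article
LINE_CARDS = 9
PORTS_PER_LINE_CARD = 72
# A 12x cable bundles three 4x links
LINKS_PER_12X_CABLE = 12 // 4

# Assumption (not from the article): QDR 4x signals at 40 Gb/sec and
# 8b/10b encoding leaves 32 Gb/sec of data in each direction.
QDR_4X_DATA_GBPS = 40 * 8 // 10


def switch_ports(line_cards=LINE_CARDS, ports_per_card=PORTS_PER_LINE_CARD):
    """Total 4x ports in a fully populated chassis."""
    return line_cards * ports_per_card


def cables_needed(ports, links_per_cable=LINKS_PER_12X_CABLE):
    """Number of 12x consolidation cables needed to reach every 4x port."""
    return ports // links_per_cable


def bidirectional_tbps(ports, data_gbps=QDR_4X_DATA_GBPS):
    """Aggregate bandwidth in Tb/sec, counting both directions on every port."""
    return ports * data_gbps * 2 / 1000


if __name__ == "__main__":
    ports = switch_ports()
    print(ports, cables_needed(ports), bidirectional_tbps(ports))
```

Under that encoding assumption, the result lands on the 41 Tb/sec that Sun quotes, which suggests the figure counts data bandwidth rather than raw signaling rate.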

Up to eight of these switches can be linked together to create an InfiniBand fabric that can span 5,184 ports. With each server node presumably having one port and two sockets with either four or six cores, we're talking about an HPC cluster that can span either 41,472 or 62,208 cores. This is a very large system, on the order of 400 to 500 teraflops, depending on the processor clock speeds.
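The fabric sizing is straightforward multiplication, sketched below. The one-node-per-port layout is the article's own presumption. The peak-flops estimate rests on two assumptions that are not in the article: four double-precision floating point operations per core per clock cycle, and illustrative clock speeds of 2.5 GHz and 2.0 GHz.

```python
# Figures quoted in the article
SWITCHES = 8
PORTS_PER_SWITCH = 648
SOCKETS_PER_NODE = 2


def fabric_ports(switches=SWITCHES, ports_per_switch=PORTS_PER_SWITCH):
    """Ports in a fabric built from several 648-port switches."""
    return switches * ports_per_switch


def cluster_cores(nodes, cores_per_socket, sockets_per_node=SOCKETS_PER_NODE):
    """Total cores, presuming one server node per InfiniBand port."""
    return nodes * sockets_per_node * cores_per_socket


def peak_teraflops(cores, clock_ghz, flops_per_cycle=4):
    """Rough theoretical peak in teraflops.

    flops_per_cycle=4 and the clock speeds passed in are assumptions
    made for illustration, not figures from Sun.
    """
    return cores * clock_ghz * flops_per_cycle / 1000


if __name__ == "__main__":
    nodes = fabric_ports()
    quad = cluster_cores(nodes, 4)
    hexa = cluster_cores(nodes, 6)
    print(nodes, quad, hexa)
    # Hypothetical clock speeds, chosen only to illustrate the range
    print(peak_teraflops(quad, 2.5), peak_teraflops(hexa, 2.0))
```

With those assumed clock speeds the two configurations come out at roughly 415 and 498 teraflops, consistent with the 400 to 500 teraflops range cited above.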

The top-end 648-port InfiniBand switch is designed and manufactured by Sun, according to Brown, as are the InfiniBand Switch 72 and InfiniBand Switch 36 fixed-port switches that Sun is also showing off at ISC '09 today. These are based on the latest Mellanox chips and feature QDR InfiniBand speeds as well. (The exact specs and prices for these two switches were not available at press time.)

At ISC '09, Sun is previewing a new flash disk array with 2 TB of capacity that comes in a 1U chassis. Sun says it delivers enough I/O operations per second to replace around 3,000 disk drives while burning only around 300 watts. Sun is also previewing to HPC customers a new Storage 7000 array that will span up to 1.5 PB in capacity and will have full redundancy - multiple head nodes, multiple interconnects, and such - built in. No word on when these two will ship.

On the HPC software front, Sun is rolling out Lustre 1.8.0, which has been tweaked so it understands the flash memory Sun has sprinkled into its open storage arrays. The new Lustre clustered file system also has a number of nips and tucks to boost performance and improve usability, including a new adaptive timeout feature and version-based recovery of data stored on the file system.

Sun is also announcing its HPC Software Linux Edition 2.0 software stack, which runs on Red Hat, CentOS, or SUSE Linux. Exactly how this bundle of tools differs from the 1.2 release of the HPC stack from Sun is not clear, since the feeds and speeds are not up yet for it. (You can see all the details about the 1.2 release here.)

Sun is also pushing its Grid Engine grid software to Release 6.2 Update 3, which adds the ability to bring compute capacity from Amazon's EC2 compute cloud, as well as from internal clouds that are compatible with EC2, into a Grid Engine cluster. Sun's own Studio 12 development tools have been given an Update 1, which has lots of performance tweaks for parallel programming on the latest x64 and Sparc processors. The HPC ClusterTools 8.2 release includes MPI libraries and runtimes that are based on the open source Open MPI implementation, tested and supported by Sun for both Solaris and Linux.

The HPC ClusterTools have also been tweaked to support QDR InfiniBand and IB multi-rail, which is a multipathing technology for InfiniBand that allows a server with two ports to send traffic over both at the same time. The HPC ClusterTools now also offer support for PathScale and Intel compilers as well as Sun's Studio compiler and the open source GNU compilers. Finally, Sun has packaged up some HPC tools and its latest OpenSolaris release into a little something it calls HPC Software Developer Edition 1.0, which gives developers a single CD from which they can get all the tools they need to start coding parallel applications. ®
