Mellanox pushes InfiniBand to 120Gb/s

Offloading IB processing from CPUs to ConnectX-2s

SC09 If 40 Gb/sec InfiniBand is not enough for you, then you'll be happy to hear that InfiniBand switch maker Mellanox Technologies is going to crank its switches past 11 to deliver 120 Gb/sec ports in its MTS and IS switch families.

As it turns out, the InfiniScale IV chips that Mellanox created for its 40 Gb/sec IS5000 switches, which were announced in June, are able to support 120 Gb/sec ports. The company said at the SC09 supercomputing trade show in Portland, Oregon, that in the first quarter of 2010 it will ship a variant of the MTS3600 fixed-port switch that employs CXP connections. These connections, which are already used with 40 Gb/sec InfiniBand by a number of vendors, gang up 12 lanes of 10 Gb/sec InfiniBand traffic into a single port.

The MTS3600 comes in a 1U box that supports 36 ports running at 40 Gb/sec, and a variant of this box supporting CXP links will come out with 12 ports running at 120 Gb/sec. The leaf modules in the IS5000 series of modular switches, which came out in June as well, have their port counts cut to a third and their bandwidth per port tripled.

Mellanox was demonstrating the new 120 Gb/sec switches at the show, and the high-speed InfiniBand switches were the backbone of a 400 Gb/sec network supporting the show. SCInet, the network provider for the SC trade shows, slapped that network together, and it would be worth $20m if you had to buy it. (The SC event does not have to buy its network, since vendors are thrilled to donate equipment and experts to be part of the high-speed backbone.)

Eventually, the 120 Gb/sec product line will include fixed-port switches with 12, 36, and 72 ports and hybrid 120 Gb/sec and 40 Gb/sec switches that have six of the fast ports and 18 of the slower ones.
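The port arithmetic behind these configurations is easy to check. The sketch below uses only the figures quoted above (12 lanes of 10 Gb/sec per CXP port, 36 ports at 40 Gb/sec, 12 ports at 120 Gb/sec, and the six-plus-18 hybrid); the function and variable names are made up for illustration, and the totals are simple sums of port speeds, not a statement about how Mellanox rates switch capacity.

```python
# Figures quoted in the article; all names here are illustrative only.
LANE_GBPS = 10   # bandwidth of one InfiniBand lane, as quoted
CXP_LANES = 12   # lanes ganged into a single CXP port, as quoted


def aggregate_gbps(ports):
    """Sum bandwidth over a {port_speed_gbps: port_count} mapping."""
    return sum(speed * count for speed, count in ports.items())


cxp_port_gbps = LANE_GBPS * CXP_LANES  # 12 lanes x 10 Gb/sec

configs = {
    "MTS3600 (36 x 40)": {40: 36},
    "CXP variant (12 x 120)": {cxp_port_gbps: 12},
    "hybrid (6 x 120 + 18 x 40)": {cxp_port_gbps: 6, 40: 18},
}

totals = {name: aggregate_gbps(ports) for name, ports in configs.items()}
for name, total in totals.items():
    print(f"{name}: {total} Gb/sec")
```

All three configurations sum to the same figure, which fits the idea that the same silicon is simply carving its lanes into fewer, fatter ports.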

According to John Monson, vice president of marketing at Mellanox, the price per bit on the 120 Gb/sec variants of the switches will be the same as on the 40 Gb/sec, but the port costs will obviously triple. The reason why HPC shops would want to move to 120 Gb/sec is simple: by ganging up the InfiniBand pipes, users can cut the InfiniBand transport overhead by a factor of three and squeeze more performance out of their clusters. The switches will also allow for more bandwidth to be allocated in 3D torus systems between the cubes that make up the torus.

Enabling the upgrade to 120 Gb/sec is a "golden screwdriver" patch to the firmware, since the InfiniScale IV chips already supported the higher bandwidth, but obviously the CXP ports and their optical cables are different and you have to buy a new box to get them.

Mellanox also said at the SC09 show that the ConnectX-2 host channel adapters it has been shipping had another golden screwdriver upgrade. The prior generation of ConnectX IB cards allowed for some of the processing related to the InfiniBand protocol to be offloaded from CPUs in the server nodes to the IB cards. (Much as TCP/IP has had offloading for Ethernet cards for a number of years now.)

With the ConnectX-2 cards, Monson says that Mellanox has gone one step further and put electronics in the HCA that can take over some of the Message Passing Interface (MPI) collectives operations - those that broadcast data, gather data, or otherwise synchronize the nodes in a cluster.
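For readers who have not met MPI collectives, the sketch below models what three common ones compute, with a plain Python list standing in for the per-node values. It is not MPI code and says nothing about how the ConnectX-2 offload is implemented; the function names mirror the usual MPI terms, but everything here is a hypothetical single-process illustration of the semantics only.

```python
from functools import reduce
import operator


def broadcast(values, root):
    """Every rank ends up holding the root rank's value."""
    return [values[root]] * len(values)


def gather(values, root):
    """The root rank receives every rank's value; other ranks get None."""
    return [list(values) if rank == root else None
            for rank in range(len(values))]


def allreduce(values, op=operator.add):
    """Every rank receives the reduction (default: sum) of all values."""
    total = reduce(op, values)
    return [total] * len(values)


# One entry per simulated rank (node) in the cluster.
ranks = [3, 1, 4, 1]
print(broadcast(ranks, root=0))
print(gather(ranks, root=0))
print(allreduce(ranks))
```

In a real cluster each of these operations involves every node exchanging messages, which is why no node can move on until the slowest one has taken part.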

Mellanox has done benchmark tests that show that clusters lose 20 to 30 per cent of their scalability to these MPI collectives communications, and that means cycles on the CPUs that could be doing other work are stuck doing this MPI work. Hence the offload, which is turned on with a firmware change on the existing cards and which cuts down on jitter and noise in the cluster and lets it get more work done.

Interestingly, the ConnectX-2 IB cards also have an embedded floating point co-processor, which can take over some of the calculation jobs that the MPI stack sends to a server node in the cluster, provided it has a Mellanox IB card.

Mellanox has been working with Oak Ridge and Los Alamos, two of the national laboratories funded by the US Department of Energy, to tweak the MPI stack to see this embedded floating point unit. Monson is mum about when this capability will be activated. ®
