InfiniBand to outpace Ethernet's unstoppable force
Run faster or be crushed
Comment Every good idea in networking eventually seems to be borged into the Ethernet protocol. Even so, there's still a place in the market for its main rival in the data center, InfiniBand, which has consistently offered more bandwidth, lower latency, and often lower power consumption and cost-per-port than Ethernet.
But can InfiniBand keep outrunning the tank that is Ethernet? The members of the InfiniBand Trade Association, the consortium that manages the InfiniBand specification, think so.
InfiniBand, which is the result of the merger in 1999 of the Future I/O spec espoused by Compaq, IBM, and Hewlett-Packard and the Next Generation I/O competing spec from Intel, Microsoft, and Sun Microsystems, represents one of those rare moments when key players came together to create a new technology — then kept moving it forward. Sure, InfiniBand was relegated to a role in high-performance computing clusters, lashing nodes together, rather than becoming a universal fabric for server, storage, and peripheral connectivity. Roadmaps don't always pan out.
But since the first 10Gb/sec InfiniBand products hit the market in 2001, it's InfiniBand, more than Ethernet, that has kept pace with the exploding core counts in servers and massive storage arrays to feed them, which demand massive amounts of I/O bandwidth in the switches that link them. Which is why InfiniBand has persisted despite the onslaught of Ethernet, which jumped to Gigabit and then 10 Gigabit speeds while InfiniBand evolved to 40Gb/sec.
Now the race between InfiniBand and Ethernet begins anew. As El Reg previously reported, the IEEE has just ratified the 802.3ba 40Gb/sec and 100Gb/sec Ethernet standards, and network equipment vendors are already monkeying around with non-standard 100Gb/sec devices. At the SC09 supercomputing conference last fall, Mellanox was ganging up three quad data rate (QDR, at 40Gb/sec) InfiniBand pipes to make a twelve-port 120Gb/sec switch. This latter box is interesting, but it does not adhere to the current InfiniBand roadmap.
InfiniBand is a multi-lane protocol. Generally speaking, says Brian Sparks, co-chair of the IBTA's marketing working group and the senior director of marketing at Mellanox, the four-lane (4x) products are used to link servers to switches, the eight-lane (8x) products are used for switch uplinks, and the twelve-lane (12x) products are used for switch-to-switch links. The single-lane (1x) products are intended to run the InfiniBand protocol over wide area networks.
As each new generation of InfiniBand comes out, the lanes get faster. The original InfiniBand ran each lane at 2.5Gb/sec, double data rate (DDR) pushed it up to 5Gb/sec, and the current QDR products push it up to 10Gb/sec per lane.
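To make the lane arithmetic concrete, here is a minimal Python sketch that multiplies the per-lane speeds quoted above by the four lane counts. The label "SDR" for the original 2.5Gb/sec generation and the function name are our own choices for illustration, not something taken from the IBTA material.

```python
# Per-lane signalling rates in Gb/sec, as quoted in the article.
# "SDR" is our label for the original 2.5Gb/sec generation.
LANE_RATE_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}

# Link widths described in the article: 1x, 4x, 8x and 12x.
LANE_COUNTS = (1, 4, 8, 12)


def link_rate_gbps(lanes, generation):
    """Aggregate signalling rate of a link: lane count times per-lane rate."""
    return lanes * LANE_RATE_GBPS[generation]


if __name__ == "__main__":
    for generation in LANE_RATE_GBPS:
        row = ", ".join(
            f"{lanes}x = {link_rate_gbps(lanes, generation):g}Gb/sec"
            for lanes in LANE_COUNTS
        )
        print(f"{generation}: {row}")
```

A 4x QDR link works out to the 40Gb/sec figure mentioned earlier, and three of them ganged together give the 120Gb/sec of the Mellanox box.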
Life in the faster lane
Looking ahead, InfiniBand will be moving to 64/66 encoding, which the InfiniBand Trade Association says is a lot more efficient than the 8b/10b encoding used today. In the new method, you send 66 bits for every 64 bits of data, yielding only slightly more than a three per cent overhead for the protocol, compared to a 25 per cent overhead with the prior encoding, which sends 10 bits for every 8 bits of data. (The lane speeds above include the encoding overhead.) The upshot is that a much larger portion of the bandwidth in an InfiniBand device using the 64/66 encoding will be available for sending data.
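The overhead figures can be checked with a few lines of Python. This is only a sketch: it measures overhead relative to the data bits, which is how the percentages above are stated (measured against the bits on the wire, 8b/10b would come out at 20 per cent instead). The function names are ours.

```python
from fractions import Fraction


def encoding_overhead(data_bits, line_bits):
    """Extra bits sent on the wire, as a fraction of the data bits."""
    return Fraction(line_bits - data_bits, data_bits)


def usable_rate_gbps(signalling_gbps, data_bits, line_bits):
    """Data rate left over once the encoding bits are stripped out."""
    return float(Fraction(signalling_gbps) * Fraction(data_bits, line_bits))


if __name__ == "__main__":
    print(f"8b/10b overhead: {float(encoding_overhead(8, 10)):.2%}")
    print(f"64/66 overhead:  {float(encoding_overhead(64, 66)):.3%}")
    # A 4x QDR link signals at 40Gb/sec but uses 8b/10b encoding.
    print(f"4x QDR usable:   {usable_rate_gbps(40, 8, 10):g}Gb/sec")
```

On these assumptions a 40Gb/sec 4x QDR link carries 32Gb/sec of actual data, which is the gap the new encoding is meant to close.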
Starting in 2011, there will be two new lane speeds for InfiniBand. The first is FDR, which is not short for the 32nd president of the United States, but for Fourteen Data Rate, referring to its 14Gb/sec speed. The second is called EDR, short for Enhanced Data Rate, which actually runs the InfiniBand lanes at 25Gb/sec.
The FDR line speed was added to allow for the creation of lower-power and less-expensive InfiniBand gear that nonetheless has more oomph than the current QDR products. EDR obviously targets the high end of the bandwidth range, with switch-to-switch 12x products being able to hit 312Gb/sec bandwidth compared to 168Gb/sec for FDR products.
At the server switch level, the 4x ports will deliver 104Gb/sec using the EDR lanes, significantly faster than the 80Gb/sec that was projected a few years back for 2011 on the InfiniBand roadmap for EDR. Sparks says that FDR is aimed mostly at blade servers, where compact packaging and low heat are priorities.
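As a sanity check on the aggregate figures quoted above, the following Python sketch divides each one by its lane count to recover the per-lane speed it implies. The dictionary of quoted values comes straight from the text; the function name is ours.

```python
# Aggregate bandwidths quoted in the article, keyed by (generation, lanes).
QUOTED_GBPS = {
    ("FDR", 12): 168,
    ("EDR", 12): 312,
    ("EDR", 4): 104,
}


def implied_lane_rate_gbps(aggregate_gbps, lanes):
    """Per-lane speed implied by an aggregate link bandwidth."""
    return aggregate_gbps / lanes


if __name__ == "__main__":
    for (generation, lanes), aggregate in QUOTED_GBPS.items():
        rate = implied_lane_rate_gbps(aggregate, lanes)
        print(f"{lanes}x {generation}: {aggregate}Gb/sec -> {rate:g}Gb/sec per lane")
```

The FDR number matches its 14Gb/sec lane speed exactly. Both EDR numbers work out to 26Gb/sec per lane, a shade over the 25Gb/sec quoted for EDR, so one of the two is presumably a rounded figure.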
No matter what speed InfiniBand hits, the PCI Express 3.0 bus is going to be the bottleneck. That spec was originally due in late 2009, but has run into delays. The multi-lane PCI Express I/O architecture will support 8GT/sec of bandwidth, up from 5GT/sec with PCI Express 2.0 slots. While PCI Express 3.0 will ditch the 8b/10b encoding, it is not clear if the peripheral standard will use the same 64/66 encoding used by the future InfiniBand protocol, a 128b/130b encoding scheme, or something else.
The InfiniBand roadmap already had a Hexadecimal Data Rate (HDR) plotted out. The InfiniBand cheerleaders did not offer the feeds and speeds for HDR two years ago, and they have not done so here in 2010 either, but now they are putting a stake in the ground and saying that HDR, whatever it is, will come out in 2014.
The updated InfiniBand roadmap also includes a pointer to something called NDR, short for Next Data Rate, which comes after that. No word on what NDR speeds might be or when they are expected to be available. The basic curve is to double bandwidth every three years, so that would put HDR at around 50Gb/sec per lane in 2014 and NDR at 100Gb/sec per lane in 2017, with possible slower offshoots, like the geared-down FDR relative to the EDR, in the next generation of HDR and NDR products.
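The doubling curve is easy to sketch in Python. This is our own back-of-the-envelope extrapolation, not an IBTA projection: it assumes the 25Gb/sec EDR lane speed in 2011 as the starting point and a clean doubling every three years.

```python
def projected_lane_rate_gbps(year, base_year=2011, base_gbps=25.0, doubling_years=3):
    """Per-lane speed if bandwidth doubles every `doubling_years` years.

    The defaults (EDR at 25Gb/sec in 2011, doubling every three years)
    are assumptions taken from the article's own rough curve.
    """
    return base_gbps * 2 ** ((year - base_year) / doubling_years)


if __name__ == "__main__":
    for name, year in (("EDR", 2011), ("HDR", 2014), ("NDR", 2017)):
        print(f"{name} ({year}): about {projected_lane_rate_gbps(year):g}Gb/sec per lane")
```

Changing `base_gbps` or `doubling_years` shows how sensitive the HDR and NDR guesses are to the starting assumptions.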
The good folks at the PCI-SIG had better get in gear and start getting PCI Express 4.0 and 5.0 on the boards. There's no point in worrying about InfiniBand bandwidth if it will simply choke the PCI bus. Then again, system makers could just do something interesting and go back to the original InfiniBand plan, which was to use InfiniBand for peripherals as well as for switched fabrics.
Wouldn't that just make Cisco CEO John Chambers' day? ®