Who's going to turn base Ethernet into gold?
Tackling drops and delays
Data Centre Ethernet (DCE) is the great white hope of convergence, the single über-network across which all other protocols will flow, simplifying network component acquisition and operating costs - but it's not that simple.
Ethernet is a fragile, unreliable base for such a role; it drops packets and message transfer time across the network is not predictable. Network links such as the Fibre Channel ones between servers and block-access storage devices will break if packets (data in frames) are lost and if message transfer takes too long.
How can the base metal that is Ethernet be transmuted into gold? The alchemists are to be found inside the IEEE standards organisation and are working on three 802.1 committees, known as the Qau, Qbb, and Qaz workgroups. David Law, a 3Com consultant engineer and chair of the 802.3 committee, explained the background.
First of all, he said, a simple point-to-point Ethernet link does not drop packets and message latency is predictable. When this pure Ethernet link is complicated by switches at either end combining other links and pumping their data packets along our original link, then it can get overwhelmed, packets can be dropped and messages take longer to traverse it. It's the switches where congestion forms and the switches will have to be involved in solving it.
Separate IEEE 802.something committees cover different aspects of the Ethernet stack. The 802.3 committee which Law chairs deals with the peer-to-peer stuff, "above the MAC". An 802.1 committee looks at switching and an 802.11 one covers Wi-Fi.
In order to have an area of an Ethernet network that is lossless (ie doesn't drop packets) and has deterministic latency (so messages don't take too long to cross the network from sender to receiver), then there will have to be bridges between the DCE domain and common, everyday Ethernet. Such data centre bridging is a project in the 802.1 world where the three committees mentioned above are located. Three committees are needed to deal with the two problems; two for packet loss and latency, and one for DCE-class device identification.
They are tasked with devising solutions to the aspect of the DCE problem they have been allocated that will command broad support in the networking industry and become a standard, permitting different suppliers' kit to interoperate in a DCE-cless network.
The three committees
The IEEE 802.1 Qau Congestion Notification Committee deals with the detection of imminent congestion. A DCE switch or Congestion Point (CP) monitors the queue of outgoing packets and samples packets. If the queue depth exceeds a set length then a congestion notification message (CNM) to a sender, the packet origination or reaction point (RP), in effect telling it to throttle back its packet transmission rate. A rate limiter in the RP reduces the frame rate by the desired amount in the CNM.
There is a problem here in that congestion detection and correction has its own latency. By the time congestion is detected packets are already in the queue. It is possible that a sudden burst of packets due to, say, a surge in server traffic, could overwhelm a switch before the congestion detection monitoring feedback loop has a chance to start working. This is where the Qbb Priority-Based Flow Control committee comes in.
It deals with priority-based flow control, the bus lanes or multi-vehicle occupancy on the motorway. A certain proportion of a link's bandwidth can be set aside for specific traffic. If there is congestion build-up its impact on important traffic can be limited in this way so that a sudden storm surge of packets is kept outside the guaranteed bandwidth section of the link the loss-less part. Thus the packet drop and latency problems are dealt with by a combination of the Qau and Qbb committees' work.
The third committee, the 802.1 Qaz Enhanced Transmission Selection project, deals with DCE-class device identification. How does an Ethernet switch which is DCE-capable know that any other switch it is in contact with is a DCE-capable one too? The existing Ethernet priority scheme, IEEE Standard 802.1Q, is inadequate, because no minimum bandwidth is provided for any traffic class. The Qaz project uses a DCB (Data Centre Bridge) Capability Exchange Protocol (DCBX) to accomplish this.
A Qaz project slide says: "Using priority-based processing and bandwidth allocations, different traffic classes within different traffic types such as LAN, SAN, IPC, and management can be configured to provide bandwidth allocation, low-latency, or best effort transmit characteristics."
This is based, Law says, on the IEEE STD 802.1AB Link Layer Discovery Protocol (LLDP).
If two Ethernet devices declare they are DCE-capable then the traffic between them can be lossless and of predictable latency, using the Qau and Qbb control mechanisms. They form a cluster of DCE-class Ethernet devices, a DCE cloud within the general Ethernet.
Sponsored: RAID: End of an era?