Data centers to cut LAN cord?
60GHz wireless links ease east-west traffic jams
People are using cell phones and killing their landlines. We have wireless networks in the home to connect our myriad devices. And maybe wireless is coming to the data center, as well.
In days of old, servers kept to themselves pretty much in the data center, and interacted with end users out there on the Internet and the internal networks of the world – what is called north-south traffic in network lingo. But with modern workloads, with virtual servers and distributed computing architectures and large amounts of replication, as much as 80 to 85 per cent of traffic on networks behind cloudy infrastructure can be east-west – that is, traveling between servers or racks of servers, and therefore clogging networks as they try to do their work.
According to a recent article in The New York Times, data center and networking techies are playing around with 60GHz wireless networking for short-haul links to give rack-to-rack communications some extra bandwidth for when the east-west traffic goes a bit wild.
The University of Washington and Microsoft Research published a paper (PDF) at the Association for Computing Machinery's SIGCOMM 2011 conference late last year about their tests of 60GHz wireless links in the data center. Their research used prototype links that bear some resemblance to the point-to-point, high-bandwidth technology known as WiGig, which among other things is being proposed as a means to support wireless links between Blu-ray and DVD players and TVs, replacing HDMI cables.
The 60GHz spectrum is especially interesting for data centers because it has over 80 times the aggregate bandwidth available compared to 2.4GHz 802.11b/g wireless networks like we have in our homes and offices.
The problem, explain the Microsoft and Husky techies, is that data center designers are cheapskates and they often oversubscribe their networks. A typical setup has 40 servers with Gigabit Ethernet links plugged into a top-of-rack switch that feeds up to a 10GE aggregation switch, so 40Gb/sec of server bandwidth shares a 10Gb/sec uplink, a ratio of 1:4. This works fine so long as there is a lot of north-south traffic from servers on up through the top of racker and on out to the aggregation switch.
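The oversubscription arithmetic can be checked in a few lines. This is a minimal sketch using the figures quoted in the article (40 servers, Gigabit Ethernet server links, one 10GE uplink); the function name and parameters are made up for illustration and do not come from the paper.

```python
def oversubscription(servers: int, server_link_gbps: float, uplink_gbps: float) -> float:
    """Return how many times the servers' combined bandwidth exceeds the uplink."""
    return (servers * server_link_gbps) / uplink_gbps


# Figures from the article: 40 servers at 1Gb/sec behind a 10Gb/sec uplink.
ratio = oversubscription(servers=40, server_link_gbps=1.0, uplink_gbps=10.0)
print(f"uplink to server bandwidth = 1:{ratio:g}")  # prints "uplink to server bandwidth = 1:4"
```

Put another way, if every server in the rack tried to push traffic out of the rack at line rate, each would get roughly a quarter of its nominal Gigabit link.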
But when servers within the rack need to chat with each other a lot – as is often the case with distributed workloads – the top-of-rack switch quickly gets saturated. The answer is not to cut all the cords in the data center and replace wired links with fatter wireless links, but rather to create what the researchers call flyways: point-to-point wireless links between racks that plug into the TOR switches and carry rack-to-rack traffic when the load from the nodes within the rack, or the traffic crossing the aggregation switch, is too heavy for the wired network.
The devices tested by UW and Microsoft employ the evolving IEEE 802.11ad/WiGig standard, and delivered data rates that ranged from 385Mb/sec to 6.76Gb/sec using prototype radios from HXI. The radios and a mix of fixed beam directional antennas were attached to the top of racks, and essentially constitute another aggregation layer for server-to-server communication. But the researchers predict that they will eventually want to test electronically steerable phased-array antennas so they can switch which racks are connected to others on the fly.
Perhaps the most interesting observation in the Microsoft data center where the wireless pipes were tested – it had sixteen rows of servers with ten racks each, arranged in a four by four grid of rows comprising about 196 square meters of floor space – is that the typical data center running these modern workloads has networking hotspots where the links get saturated. What this means is that the wireless backup bandwidth would only be necessary for a small number of rack-to-rack hops at any time.
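The hotspot idea can be illustrated with a toy selection routine. This is only a sketch under stated assumptions, not the algorithm from the paper: the rack names, demand figures, capacity threshold and the `pick_flyways` function are all hypothetical, and the greedy "worst overload first" rule is a stand-in for whatever placement logic the researchers actually use.

```python
from typing import Dict, List, Tuple

RackPair = Tuple[str, str]


def pick_flyways(
    demand_gbps: Dict[RackPair, float],
    wired_capacity_gbps: float,
    max_flyways: int,
) -> List[RackPair]:
    """Greedily choose the rack pairs whose demand most exceeds the wired capacity."""
    # Keep only the pairs the wired network cannot serve, with their excess demand.
    hot = [
        (pair, demand - wired_capacity_gbps)
        for pair, demand in demand_gbps.items()
        if demand > wired_capacity_gbps
    ]
    # Worst overload first; a limited number of radios means a limited number of flyways.
    hot.sort(key=lambda item: item[1], reverse=True)
    return [pair for pair, _ in hot[:max_flyways]]


# Hypothetical rack-to-rack demand in Gb/sec; only three pairs exceed the wired capacity.
demand = {
    ("rack01", "rack02"): 14.0,
    ("rack03", "rack07"): 11.5,
    ("rack04", "rack05"): 3.0,
    ("rack02", "rack09"): 10.5,
}
print(pick_flyways(demand, wired_capacity_gbps=10.0, max_flyways=2))
# prints [('rack01', 'rack02'), ('rack03', 'rack07')]
```

Because only a few pairs are overloaded at any moment, a small pool of wireless links can be pointed at whichever racks are hot right now, which is the argument the researchers make for flyways over a bigger wired network.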
The upshot, according to the paper, is that a cheap wireless point-to-point link can cope with heavy data center traffic jams in 95 per cent of the cases, reduce traffic workloads by 45 per cent, and still let companies oversubscribe their networks – and therefore not spend big bucks on more TOR and aggregation switches to avoid this oversubscription. ®