
It's all in the fabric for the data centre network

At the beginning of the SDN revolution, says Trevor Pott

IT infrastructure is worth exactly nothing if the network doesn't work. The network designs we have grown so comfortable with over the past 15 years or so are wholly inadequate if you are building a cloud or merely undergoing a refresh that will see more virtual machines packed into newer, more capable servers.

The traditional network was a hierarchical, oversubscribed affair. Servers would be linked to a top-of-rack switch through 1Gb ports. That switch would connect to the end-of-row switch via one or more 10Gb ports, and the end-of-row switches would talk to the core at 40Gb. Dial the numbers up or down depending on your budget.

Trunking allowed you to lash together multiple ports into a single logical link; 10Gb to a single server was considered for some time to be fast enough for converged networking – storage and userspace traffic on the same link.

Things have changed so drastically that the problem can't be solved merely by turning the knobs on port speeds.

All points of the compass

Get networking types talking about network design and you will inevitably hear them discuss "north-south traffic" versus "east-west traffic".

The nomenclature is derived from the standard hierarchical network design. The core switch was represented as the topmost item on a network diagram, thus it was north. Individual servers – or more accurately the virtual machines and applications they hosted – were at the bottom, or south.

In the early 90s, when the client-server revolution really took off and the World Wide Web was just coming into being, virtually all traffic in a data centre was from servers in the south to users on the other end of that core switch in the north.

A server on the east side of the map wanting data from a server on the west side would have to make several hops to get there: up to its rack switch, its row switch and the core switch, then down through another row switch and another rack switch to the destination server – and back again.
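To make that hop count concrete, here is a rough Python sketch of the round trip up and over the hierarchy. The switch labels and function names are purely illustrative, not from any particular vendor or tool:

```python
# Hypothetical three-tier hierarchy: core -> row switches -> rack switches -> servers.
# Counts the switch hops an east-west packet takes between two servers in
# different rows, versus a north-south packet heading out via the core.

def east_west_hops():
    """Server A to server B in another row: up the tree, across, back down."""
    return [
        "rack switch (A)",   # top-of-rack for the source server
        "row switch (A)",    # end-of-row aggregation
        "core switch",       # the only place the two rows meet
        "row switch (B)",
        "rack switch (B)",
    ]

def north_south_hops():
    """Server A to a user beyond the core: straight up the tree."""
    return ["rack switch (A)", "row switch (A)", "core switch"]

if __name__ == "__main__":
    ew = east_west_hops()
    ns = north_south_hops()
    print(f"east-west:   {len(ew)} switch hops -> {' -> '.join(ew)}")
    print(f"north-south: {len(ns)} switch hops -> {' -> '.join(ns)}")
```

Five switch hops each way for east-west traffic against three for north-south – and every one of those links is shared.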

Typically, each of these links was oversubscribed. You might have 30 servers on a rack each talking to the rack switch at 1Gb. That rack switch would then have only a 10Gb link to the end of the row.

Twenty racks in a row would share a single 40Gb link back to core, and so forth. So long as there wasn't a lot of east-west traffic – and so long as server densities didn't rise too high – this worked quite well.
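Treating those figures as a worked example – the 30-server and 20-rack numbers are the ones above, and the ratio arithmetic is just illustration – a quick sketch of the oversubscription at each tier:

```python
# Oversubscription in the example hierarchy:
# 30 servers x 1Gb into a rack switch with a single 10Gb uplink,
# 20 racks x 10Gb into a row sharing a single 40Gb link back to core.

def oversubscription(downlink_count, downlink_gbps, uplink_gbps):
    """Ratio of traffic a tier could receive to what it can pass upstream."""
    offered = downlink_count * downlink_gbps
    return offered / uplink_gbps

rack_ratio = oversubscription(30, 1, 10)    # 30Gb offered vs a 10Gb uplink
row_ratio = oversubscription(20, 10, 40)    # 200Gb offered vs 40Gb to core

print(f"rack tier:  {rack_ratio:.0f}:1 oversubscribed")
print(f"row tier:   {row_ratio:.0f}:1 oversubscribed")
print(f"end to end: {rack_ratio * row_ratio:.0f}:1 if every server bursts at once")
```

That works out to 3:1 at the rack, 5:1 at the row and 15:1 end to end if everything talks at once – which is exactly why the design only held up while east-west traffic stayed modest.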

Fast-moving machines

Things changed. Network traffic became predominantly east-west. Depending on who you talk to, you will get different reasons for this shift.

Brocade's Julian Starr believes that a shift towards new application architectures based on message buses, tokenised passing (via things like XML) and similar modern web approaches is responsible.

I argue that virtualisation was the enabler. Before virtualisation we ran all of those pieces on a single server. After virtualisation we started to break things out into one application per virtual machine – and virtual machines were mobile.

They wouldn't necessarily keep their inter-app conversations within a single host. Regardless of which came first – app model or infrastructure – it was at this point that the change from north-south to east-west really began.

Two major changes affecting networking occurred at the same time: a shift towards centralised storage; and a push to drive server utilisation as close to the red line as possible. All of a sudden servers were talking east-west to get storage traffic while also chattering among themselves for application traffic.


To make matters worse, blades and other high-density solutions became popular, converting 30 servers per rack into well over 100. The traditional hierarchical network wasn't dead – there was nothing to replace it with yet – but it was certainly on life support.

QoS and binding ever more ports together into ever wider links bought everyone some time. Storage was kept on its own network, but this was expensive and, more critically, inflexible.

Applications in a hierarchical network are still very siloed; they cannot stray far from their storage, and that storage shares the same hierarchical limitations as the rest of the network.

Workloads were becoming dynamic. Different virtual servers would need access to the same chunk of storage at different times: traditional north-south userspace stuff during the day, big-data number crunching at night and backups in the wee hours of the morning.

As data centre workload complexity grew it became increasingly difficult to keep all this traffic close enough to the data for the network not to represent an undue bottleneck.

