No more tiers for flatter networks
Solving the east-west traffic problem
There is a disconnect between data centre networks and modern distributed applications, and it is not a broken wire. It is a broken networking model.
The traditional three-tier, hierarchical data centre network, as defined and championed by Cisco Systems since the commercialisation of the internet protocol inside the glass house, no longer matches the systems and applications that are running in those data centres.
Designed for the dotcom era, the hierarchical model is not fast enough or cheap enough for the cloud. And that is why so many companies have been picking at Cisco's networking lunch and, in turn, been eaten by server makers who know they need to integrate networking in their systems to remain relevant.
Here's the Cisco view of the networking world, somewhat simplified:
Cisco's hierarchical network model
The three-tier network design has redundancy built into the core and aggregation layers, which are cross-connected for multi-pathing as well as for high availability.
It works well when end-users inside the firewall want to get at an application, and the networks can be over-subscribed because utilisation on them is generally low.
This low network utilisation goes hand in hand with low server utilisation, which was the norm for two decades. It was more important to isolate workloads on physical servers and give them a permanent home with their slice of the corporate network.
The kinds of data zinging around are fatter and more unpredictable than simple web and email traffic, too.
But what happens when you want to drive up utilisation on servers, usually through server virtualisation, while you keep adding more cores to the chip?
What happens is you saturate your network and the three-tier model starts breaking down – and you start looking at the supercomputing space for some inspiration.
"A lot of the applications coming out today – cloud, Web 2.0 or high-speed financial applications – have concepts from high-performance computing, which means doing things massively and in parallel," says Dan Tuchler, vice-president of product management for IBM's system networking division.
More than a decade after selling off its SNA networking business to Cisco and toeing the Cisco three-tier line, Big Blue spent a rumoured $400m to acquire Blade Network Technologies, which makes integrated switches for IBM, Hewlett-Packard, NEC and other blade server manufacturers, as well as top-of-rack switches for rack-based servers.
Tuchler doesn't think companies will start unplugging all of their core, distribution and access tier gear any time soon, because they have made large investments and the core switches allow them to plug in other features such as firewalls and security appliances.
Pick a leaf
But for companies that need network traffic to move more efficiently, at higher bandwidth and with lower latencies, a leaf-spine network with a flatter architecture, or perhaps a fat tree network inspired by supercomputers or a Clos network inspired by telecommunications, might be just the ticket.
Despite the extra devices used, they can be managed as one and still provide a lower latency than most core chassis devices.
IBM's conception of a leaf-spine network
The leaf-spine network architecture takes a top-of-rack switch that can reach down into server nodes directly and links it back to a set of non-blocking spine switches that have enough bandwidth to allow for clusters of servers to be linked to each other in the tens of thousands.
Generally speaking, a lot of leaf-spine networks don't do oversubscription like the hierarchical networks do, because of high-bandwidth, low-latency demands. By nature you can easily scale this network design. You can start very small and there is virtually no limit to the number of nodes you can connect.
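The sizing arithmetic behind a two-tier leaf-spine design can be sketched roughly as follows. This is only an illustration under stated assumptions, not any vendor's method: every leaf has exactly one uplink to every spine, all ports run at the same speed, and the function name `leaf_spine_capacity` and its parameters are hypothetical.

```python
def leaf_spine_capacity(leaf_ports, spine_ports, uplinks_per_leaf):
    """Rough sizing of a two-tier leaf-spine fabric.

    Assumptions (not from the article): one uplink from each leaf to
    each spine, and identical port speeds everywhere.
    """
    if not 0 < uplinks_per_leaf < leaf_ports:
        raise ValueError("uplinks_per_leaf must be between 1 and leaf_ports - 1")

    spines = uplinks_per_leaf               # one spine per leaf uplink
    max_leaves = spine_ports                # each spine port feeds a different leaf
    server_ports = leaf_ports - uplinks_per_leaf
    return {
        "spines": spines,
        "max_leaves": max_leaves,
        "max_servers": max_leaves * server_ports,
        # Downlink bandwidth divided by uplink bandwidth; 1.0 means non-blocking.
        "oversubscription": server_ports / uplinks_per_leaf,
    }


# Hypothetical 64-port switches in both roles.
non_blocking = leaf_spine_capacity(64, 64, uplinks_per_leaf=32)
oversubscribed = leaf_spine_capacity(64, 64, uplinks_per_leaf=16)
```

With these made-up port counts, the non-blocking layout reaches 2,048 server ports at a ratio of 1.0, while giving up half the uplinks raises the count to 3,072 at a ratio of 3.0. Under the same assumptions, growing further would mean higher-radix switches or an additional spine layer.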
"We see customers with three and sometimes even four tiers in their networks, and it is not very efficient," says John Monson, vice-president of marketing at Mellanox Technologies.
Mellanox is a networking ASIC, interface card and switch maker with expertise in the InfiniBand switch fabric (which by definition was supposed to be a flat network). It got into the Ethernet switching racket through its $218m buy last November of sometime rival and partner Voltaire.
"Scaling up the same old approach just doesn't work," explains Monson.
"I can't wait forever for a virtual machine to go through four layers of switching to get from one point in the network to another. What matters for these modern workloads is efficiency and bandwidth, and if you have networks oversubscribed and hanging off cores, it doesn't work."
Change of direction
The east-west traffic problem is what is really killing the three-tier network in the data centre.
Traditionally, traffic through data centres flowed up and down through the network in a north-south orientation – up through the access, distribution and core layers and back down again.
But no more. According to recent vendor surveys, as much as 80 to 85 per cent of the traffic in virtualised server infrastructure – what we now call clouds – moves from server node to server node.
This is more like supercomputing than serving in the traditional sense and, not surprisingly, the networks are flattening out just like they have in high-performance clusters.
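To make the east-west penalty concrete, here is a minimal sketch that counts the switches a packet crosses between two servers in each design. The tiny topologies and node names (`s1`, `access1`, `leaf1` and so on) are hypothetical, and the three-tier case assumes the two servers sit under different aggregation switches, so traffic has to climb to the core.

```python
from collections import deque


def switches_on_path(links, src, dst):
    """Return the number of switches on the shortest path from src to dst."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    hops = {src: 0}                 # links traversed to reach each node
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return hops[node] - 1   # intermediate nodes = links - 1
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    raise ValueError("no path between the two servers")


# Hypothetical three-tier slice: access -> aggregation -> core and back down.
three_tier = [
    ("s1", "access1"), ("access1", "agg1"), ("agg1", "core1"),
    ("core1", "agg2"), ("agg2", "access2"), ("access2", "s2"),
]

# Hypothetical leaf-spine slice: leaf -> spine -> leaf.
leaf_spine = [
    ("s1", "leaf1"), ("leaf1", "spine1"),
    ("spine1", "leaf2"), ("leaf2", "s2"),
]

three_tier_switches = switches_on_path(three_tier, "s1", "s2")
leaf_spine_switches = switches_on_path(leaf_spine, "s1", "s2")
```

In this toy layout the three-tier path crosses five switches and the leaf-spine path crosses three. In a real fabric the leaf-spine count is the same for any pair of racks, which is part of what makes latency predictable.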
Driving up server utilisation on such compute clusters – whether they run financial trading and risk analysis programs, virtual server clouds or parallel supercomputer applications – requires not just a flatter network, but a faster one.
If you stay with Gigabit Ethernet interconnects on a three-tier network, how can you drive up server utilisation if you move from one to two to four to eight cores per processor? You will add cores, but they will spend more and more of their time waiting for data to come back to them over the network.
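The arithmetic behind that question is simple enough to sketch. The assumptions here are illustrative only: a single-socket server with one Gigabit Ethernet port, the link shared evenly between cores, and no protocol overhead. The function name `per_core_share_mbps` is hypothetical.

```python
def per_core_share_mbps(link_gbps, cores):
    """Even share of one network link per core, in Mbit/s (idealised)."""
    return link_gbps * 1000 / cores


# One Gigabit Ethernet port as the core count doubles.
shares = {cores: per_core_share_mbps(1.0, cores) for cores in (1, 2, 4, 8)}
```

Under these assumptions each core's slice of the link falls from 1,000 Mbit/s with one core to 125 Mbit/s with eight, before any oversubscription higher up the network is taken into account.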
By hook or by crook
Mellanox is not seeing data centres tossing out their expensive three-tier networks. But for new applications – setting up a new trading system or a private cloud – they are rethinking the networks that lash the servers and storage together.
They are also, says Monson, podding servers together in pods of 500 or 1,000 machines, and then using bridges and gateways to hook these applications into the older networks.
"On a three-tier network, as you scale up the servers, those servers are spending all of their time waiting," says Monson.
"When you flatten the network, the CPU utilisation goes up, and throughput in the application goes up in big jumps, like a factor of two or three times."
That might mean you can support a given application workload with fewer servers on a new leaf-spine network than you would need on an old three-tier network.
And that is music to the ears of chief executives and financial directors. ®
Not a Standard
The Hierarchical Three Layer Model is just a conceptual framework; it is not a standard. Furthermore, the article describes the type of networking which is required within a data center, not within the enterprise from end to end as the Hierarchical Model does.
Makes No Sense
You level your criticism at the Hierarchical Model, which Cisco stopped promoting almost 6 years ago, and then outline solutions which are only relevant to the core, not the entire model from end nodes (access layer) to core.
I am no longer active in network design, but I remember the three-level Hierarchical Model from my CCNA about 14 years ago. The last time I looked at the Cisco literature, about 5 years ago, Cisco had moved to what they call the Enterprise Composite Network Model, which was a bit more functional and, I think, a bit similar to the flat model you defined.
Your top-of-rack switch scenario sounds like it would fit into what Cisco calls the Server Farm Block in its current enterprise model.
I am sure the Hierarchical Model was never meant to be the solution for every circumstance. I am also sure that if it was relevant to network design in most enterprises over the past few decades, it is still as relevant in all but a few today. Also, being just a conceptual framework, I am sure that it can be used in the scenario you described, since all you describe is what should happen at the core. In a sense you are saying that at the core we need to implement a specific design for today's unpredictable and data-hungry virtualization and other applications.
All from memory, because I have not lifted a finger to do any networking for almost five years, but answer these questions, because they were not answered by your model: where in your model will you aggregate access devices? Where will you implement ACLs, QoS and security? Where will you converge VLANs and broadcast domains?
Cisco moved to a more Functional enterprise model a while back. Your article would be more useful if it truly critiqued the Hierarchical Model in some coherent way rather than just being focused on networking within a Server Rack or data center.
Since the critique was leveled at Cisco generally, it would have been useful if it was leveled at the Enterprise Composite Network Model, which Cisco currently promotes (or was promoting FIVE YEARS AGO), rather than the older Three Layer Model.
Cisco has seen better times, but I am sure their misfortune of late has more to do with the fact that we got tired of buying their overpriced kit, which offers worse performance than their competitors', than with network design models.
Are “Flatter” Networks Simply Rearranging the Deck Chairs?
Timothy, great article. You really hone in on the fundamental problem in data center networking today. The entire model is broken and it’s all too often glossed over because of how much the large incumbents have riding on it. The problem has been masked with solutions that continue making the infrastructure more and more complex and less responsive to current application needs. More protocols and abstractions don’t fix the problem; they just put off the real pain for another day. Until we start with a clean slate and understand that we can't build sensible networks in an OSI stack vacuum, we're just re-arranging the proverbial deck chairs. Here’s my take: http://www.plexxi.com/index.php?option=com_content&view=article&id=42:flatter-networks-&catid=14:blog&Itemid=27