Convergence as a new new thing
The only way performance can go is up, says Dave Cartwright
Nearly 20 years ago I was technical editor of a weekly networking and telecoms newspaper. In those days the big word was “convergence” – at that time in the context of telephony and data coming together into a single network infrastructure and protocol set. Here we are in 2014, and that word is once again being bandied about – this time in the much larger context of the entire set of components of the technology infrastructure.
The changing face of convergence
When we were talking about CTI convergence in the 1990s it was primarily about bringing telephony into a world where it communicated using the same networks (primarily Ethernet) and protocols (generally IP) as data systems, thus eliminating the need for complex gateways and expensive software that translated between two separate worlds. These days IP is ubiquitous and so there's far less need for gateway devices and inter-protocol translation.
The problem is, though, that standards such as Ethernet and IP are merely a lowest-common-denominator means of making A communicate with B. If you're basing your applications and systems on a technology because it gives them the ability to communicate, the chances are that at least some of the applications will perform less well with that technology than they did with their own proprietary protocols. Technology is a land of compromise, and it's usually the case that compatibility is achieved at the expense of performance. Convergence, then, is no longer a case of standardising on a chosen set of protocols.
In our 1990s example we were concerned with connecting two devices together and making them communicate with each other. Those devices were, however, pretty much stand-alone entities. You had a phone system (a box with some lights and ports) communicating with a server (another box with some lights and ports). Today, however, both of these devices are considerably more complex – they're no longer single devices but potentially complex stacks of technology.
Let's confine ourselves for the moment to the hardware side of things (albeit both virtual and physical) – we'll get to applications later. Taking a typical server in a virtualised infrastructure we have:
- An operating system running on a virtual server.
- A virtual server, running in a hypervisor such as Hyper-V or ESX.
- A virtual network switch, running on the same hypervisor.
- A set of physical network adaptors to which the virtual switches connect.
- A physical server hosting the network adaptors.
- A Storage Area Network (SAN) connecting the server to external storage media – either iSCSI over the LAN or Fibre Channel via its own switching fabric.
- A SAN controller front-ending a set of disk arrays.
- The disk arrays providing the storage.
- A LAN connecting the server to other devices in the organisation.
The path from OS to disk is, therefore, considerably longer than it was in the days before virtualisation. Every inter-layer interface introduces a delay of some sort – no matter how minuscule these are on their own, they can add up to something significant and unacceptable. A message from the top layer to the bottom layer travels through the interfaces between each of the pairs of layers. The first obvious reaction is to contemplate the idea of compressing two or more of the layers back into their old model, but this is generally unpalatable since they wouldn't have been split had there not been a good reason.
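To make the accumulation concrete, here is a minimal sketch that counts the interfaces on the OS-to-disk path and adds up a fixed delay for each. The 50-microsecond figure and the three-layer "before virtualisation" path are assumptions chosen purely for illustration, not measurements.

```python
# Layers from the list above, top to bottom. The LAN is a side branch of the
# OS-to-disk path, so it is left out here.
LAYERS = [
    "operating system",
    "virtual server",
    "virtual switch",
    "physical network adaptor",
    "physical server",
    "SAN fabric",
    "SAN controller",
    "disk array",
]

# Hypothetical per-interface delay in microseconds; not a measured value.
ASSUMED_DELAY_US = 50


def interfaces(layers):
    """Return the adjacent (upper, lower) pairs a message crosses on the way down."""
    return list(zip(layers, layers[1:]))


def path_delay_us(layers, delay_us=ASSUMED_DELAY_US):
    """Total delay if every inter-layer interface costs the same fixed amount."""
    return len(interfaces(layers)) * delay_us


# Assumed pre-virtualisation path: OS on a physical server with local disk.
OLD_LAYERS = ["operating system", "physical server", "disk array"]

virtualised_total = path_delay_us(LAYERS)
old_total = path_delay_us(OLD_LAYERS)
print(virtualised_total, old_total)
```

Real interfaces do not all cost the same, of course; the point is only that the total grows with the number of boundaries crossed, so removing a boundary from a given transaction removes its cost.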
For example, SAN storage makes more sense than internal dedicated storage because it minimises unused local disk space and enables efficiencies through de-duplication and dynamic storage reallocation. Virtual servers bring similar efficiencies of resource sharing and power consumption when compared to physical servers.
Since we're not going to do away with this more-layers-than-before model, we have a problem – or perhaps we should call it an opportunity for efficiency. For the operating system on server A to talk to the operating system on server B it has to send messages down through the layers, across the LAN, and up through the layers to server B.
It's highly likely that we can't make the communication between the layers any more efficient, so perhaps we could make the process more efficient by short-cutting the communications somehow and cutting out one or more layers in some of our transactions.
Convergence – cutting out layers
Convergence in today's language is, therefore, enabling systems and applications to communicate without having to go through all of the layers in the system. There are two ways to do this: to move between two distant layers without passing through the ones in between; and to ignore big chunks of the model altogether. Let's illustrate these with a couple of examples.
First, imagine you want to back up one of your servers. With a physical server you'd install a backup agent on the OS of the candidate machine and your backup server would connect to it over the LAN and pull off the data. The agent software would do some funky stuff to deal with backing up files that were open/locked – the databases on a SQL server machine, for instance.
In a virtual world we can ignore the candidate server entirely – the backup server simply tells the underlying host to make a snapshot of the candidate server, then dips directly into the SAN and pulls off the files from the filestore. Assuming that the host has sufficient headroom on its CPU and RAM the candidate machine remains entirely unaffected, and the backup runs considerably faster than it otherwise would.
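The snapshot-then-read sequence can be sketched as follows. `Host`, `create_snapshot`, `delete_snapshot` and `back_up` are hypothetical names invented for this illustration, and a plain dictionary stands in for the SAN; none of this is a real hypervisor or storage API.

```python
from contextlib import contextmanager


class Host:
    """Hypothetical stand-in for a hypervisor host; not a real vendor API."""

    def __init__(self, san):
        self.san = san  # dict: volume name -> bytes, standing in for SAN storage
        self.snapshots = set()

    def create_snapshot(self, vm_name):
        snap = f"{vm_name}-snap"
        # Freeze a point-in-time copy of the VM's disk on the SAN.
        self.san[snap] = self.san[vm_name]
        self.snapshots.add(snap)
        return snap

    def delete_snapshot(self, snap):
        self.snapshots.discard(snap)
        self.san.pop(snap, None)


@contextmanager
def snapshot(host, vm_name):
    """Create a snapshot and guarantee it is removed afterwards."""
    snap = host.create_snapshot(vm_name)
    try:
        yield snap
    finally:
        host.delete_snapshot(snap)


def back_up(host, vm_name):
    """Read the frozen image straight from the SAN; the guest OS is never contacted."""
    with snapshot(host, vm_name) as snap:
        return host.san[snap]


san = {"sql01": b"database files"}
host = Host(san)
image = back_up(host, "sql01")
```

Wrapping the snapshot in a context manager is one way to make sure it is deleted even if the copy fails part-way, which matters because stale snapshots consume SAN space.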
Now look at how a typical virtualisation system such as Hyper-V or ESX works with regard to virtual networking. If you have a pair of virtual servers on a particular host and they need to communicate with each other, they do so via the hypervisor's on-board virtual switch: the traffic doesn't ever even hit the LAN switch underneath. By cutting out a number of layers, not only is communication made more efficient between servers, but the overall load on the physical LAN switch is lightened and freed up for the traffic that has to go through it.
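A rough model of that short cut, assuming a simple map of which virtual server lives on which host. The `path` function and the hop labels are made up for illustration and do not reflect any hypervisor's internals.

```python
def path(src_vm, dst_vm, placement):
    """Return the hops between two VMs, given a {vm: host} placement map."""
    src_host, dst_host = placement[src_vm], placement[dst_vm]
    if src_host == dst_host:
        # Same host: the virtual switch delivers the traffic itself.
        return [src_vm, f"vswitch@{src_host}", dst_vm]
    # Different hosts: traffic must leave via the physical adaptors and LAN switch.
    return [
        src_vm,
        f"vswitch@{src_host}",
        f"nic@{src_host}",
        "physical LAN switch",
        f"nic@{dst_host}",
        f"vswitch@{dst_host}",
        dst_vm,
    ]


placement = {"web01": "hostA", "db01": "hostA", "db02": "hostB"}
local_hops = path("web01", "db01", placement)
remote_hops = path("web01", "db02", placement)
```

In this model, placing two chatty servers on the same host shortens their path and keeps their traffic off the physical LAN switch altogether.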
Convergence – multiple inter-layer interfaces
Layered models work by abstracting lower layers through well-defined interfaces so that higher layers don't have to know the intricacies of how they work. If there were no OSI seven-layer model, for instance, every application would have to contain code that enabled it to interact directly with the LAN adaptor in the host machine (which could be one of hundreds of different adaptors). As it is, application writers simply need to know how to work with half a dozen “socket” commands on a standard Application Programming Interface (API) into the TCP layer.
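As an illustration of how few calls are involved, here is a small echo exchange over the loopback interface using Python's standard `socket` module. The helper names `echo_once` and `round_trip` are invented for this sketch, and a single `recv` is assumed to be enough for a message this short.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and send back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def round_trip(message):
    """Send a message to a local echo server and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(message)
    reply = client.recv(1024)
    client.close()

    worker.join()
    server.close()
    return reply


reply = round_trip(b"hello")
```

Nothing in the code mentions a network adaptor, a driver or a frame format: `socket`, `bind`, `listen`, `accept`, `connect`, `sendall`, `recv` and `close` are the whole interface the application sees.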
In a modern virtualised infrastructure the model remains: higher layers use lower layers' APIs in order to exploit their functionality. In addition the range of complexity being hidden by these APIs is considerably smaller than it used to be (the days of running IP, IPX, DECnet and AppleTalk in parallel over a smorgasbord of Ethernet, Token Ring and X.25 are long gone). It's therefore generally perfectly feasible to let layer A communicate with layer D using the latter's API directly, without touching layers B or C: the only prerequisite is that the two entities can physically communicate.
There's no reason, then, why (say) a phone system can't make direct iSCSI calls to a SAN controller. Or why our aforementioned backup server can't pull files natively off the SAN and send them to tape.
Adding in the applications
There's also no reason why applications can't then do exactly the same thing. Back in the day some of the big database software vendors wrote their own disk subsystem drivers for some of the platforms they ran on because it gave a distinct speed improvement over the server vendors' own (unsurprisingly – server vendors wrote code that gave the best average-case performance across numerous applications, whereas database vendors wrote for best performance in a single scenario). This approach was ridiculously difficult but worthwhile in that particular niche market. Today, though, it's trivial for application developers to interface with low-level components of the infrastructure.
The final step
We've talked about how to make distant layers in our virtualisation model talk directly to each other. The next step is to enable them to reconfigure each other. And if this sounds a little mad, it's perfectly possible right now using protocols that are in the public domain and widely supported.
Our virtual machine backup example relies on the fact that the backup server can use (say) VMware's API not just to access data but actually to change the system – in this case to create and then later delete a virtual server snapshot. We can, however, enable applications and the virtual infrastructure core to make far more spectacular changes – not least to the underlying network. Protocols such as OpenFlow are supported by all the big network equipment vendors; this allows the network core's configuration to be updated actively by external entities, and it can thus be used by any of the entities in the stack to make changes that will make the overall infrastructure work more efficiently.
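The idea of an external entity rewriting the network's forwarding behaviour can be sketched with a toy flow table. This is a simplified model of the match-and-action concept only; it is not the OpenFlow wire protocol or any controller library's API, and the `Switch` class, field names and the table-miss behaviour are assumptions made for illustration.

```python
class Switch:
    """Toy switch with a priority-ordered flow table."""

    def __init__(self):
        self.flows = []  # list of (priority, match dict, action)

    def install_flow(self, priority, match, action):
        """Called by an external entity to add a forwarding rule."""
        self.flows.append((priority, match, action))
        # Highest priority first, so the most specific rule can win.
        self.flows.sort(key=lambda flow: flow[0], reverse=True)

    def forward(self, packet):
        """Return the action of the first rule whose fields all match."""
        for _priority, match, action in self.flows:
            if all(packet.get(key) == value for key, value in match.items()):
                return action
        # Assumed table-miss behaviour for this toy model.
        return "send_to_controller"


switch = Switch()
switch.install_flow(priority=1, match={}, action="output:uplink")

# A backup application steers iSCSI traffic (TCP port 3260) onto its own port.
switch.install_flow(priority=10, match={"tcp_dst": 3260}, action="output:storage_port")

iscsi_action = switch.forward({"tcp_dst": 3260, "ip_dst": "10.0.0.5"})
web_action = switch.forward({"tcp_dst": 80, "ip_dst": "10.0.0.9"})
```

Here the entity installing the second rule sits well above the network layer, yet it changes how the switch treats its traffic without anyone logging in to the switch by hand.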
And this is the point of convergence in 2014. It's no longer a case of making disparate systems talk to each other: that was achieved when the world saw the light and standardised on IP networking. Today's convergence is all about taking a multi-layer infrastructure that exists for perfectly good reasons and:
- Acknowledging that the layers all have to exist, and that we're unlikely to make any of the individual inter-layer boundaries any more efficient.
- Maximising its efficiency by breaking down the barriers between the layers and allowing each to communicate with layers other than its immediate neighbours.
- Enabling all the layers, from the top level applications down to the physical network cables and disks, to influence each other and collaborate to reconfigure the overall infrastructure dynamically for improved performance.
By converging systems in this way, the only way performance can go is up. And by allowing layers to interoperate and even reconfigure each other we're making the system administration task considerably easier, both by increasing the potential for systems to tune themselves automatically and by vastly reducing the number of individual components that have to be manipulated by hand. ®