Bandwidth restrictions can affect the memory
Don’t let aging systems slow you down
Every now and again, a conversation at the pub goes somewhere interesting.
One of the more junior sysadmins in our group recently took over an aging small business network. The company is absolutely dependent upon an archaic piece of software for which virtualisation was the only available route to increasing the application’s speed and responsiveness.
The problem is that this is always a trap.
Once you’ve managed to virtualise the application, thus bringing the speed benefits of modern hardware into play, the powers that be tend not to consider expensive options such as upgrading to an application written during this century.
To make matters even more interesting, our valiant hero has been tasked with doubling the speed of the application while expanding it to serve twice as many users. And he can’t replace any of the virtual servers.
He did, however, recently get approval for a network and storage upgrade.
So the virtual servers from early 2007 running the application from 1999 on the database from 2000 got brand new 10 Gigabit Ethernet (10GBE) setups and a hybrid flash/disk SAN from 2011.
All the bits communicated and there was some marginal improvement in speed. But most of this beautiful new dual 10GBE network upgrade was sitting idle. The database system just wouldn’t fully utilise what should be a screaming storage system.
The symptoms looked eerily similar to a difficulty another pair of sysadmins were having with a white-boxed Hadoop setup they were building. They simply could not get their NameNode to talk to the rest of the system faster than about 14Gbit/s, despite having a teamed quad-port 10GBE card.
The conversation turned to technical speculation and a couple of pints later we had diagnosed the problem: both groups were trying to use old Opteron 2000 series servers to get up to – or past – the 20Gbit/s barrier, with the second group even trying to hit 40Gbit/s.
Problem 1: those old boards use PCIe 1.x. That means that even the x16 links only carry 32Gbit/s in each direction (16 lanes at 2Gbit/s apiece). A quad-port 10GBE card, which can ask for 40Gbit/s, is pretty pointless here.
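For anyone who wants to check the sums, here is a rough sketch of the arithmetic behind that 32Gbit/s figure. It assumes the standard PCIe 1.x numbers (2.5GT/s per lane with 8b/10b encoding) and ignores packet and protocol overhead, so real throughput will be lower. The function names are made up for illustration.

```python
# Back-of-envelope PCIe 1.x link bandwidth, ignoring protocol overhead.
PCIE1_GT_PER_LANE = 2.5   # raw signalling rate per lane, in GT/s
ENCODING_8B10B = 8 / 10   # 8b/10b line coding: 8 data bits per 10 bits on the wire


def pcie1_gbits_per_direction(lanes):
    """Usable bandwidth of a PCIe 1.x link in Gbit/s, one direction."""
    return PCIE1_GT_PER_LANE * ENCODING_8B10B * lanes


def nic_shortfall(ports, port_gbits, slot_gbits):
    """Gbit/s the card could ask for beyond what the slot carries (0 if it fits)."""
    return max(0.0, ports * port_gbits - slot_gbits)


slot = pcie1_gbits_per_direction(16)
print(f"PCIe 1.x x16: {slot:.0f} Gbit/s per direction")
print(f"Quad-port 10GBE shortfall: {nic_shortfall(4, 10, slot):.0f} Gbit/s")
```

Under those assumptions the slot falls 8Gbit/s short of what four saturated ports would demand, and that is before any overhead is counted.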
Problem 2: even if you could find enough PCIe 1.x x16 links on these servers to deliver 40Gbit/s of data into the system’s RAM, the maximum real-world output for a dual-channel DDR2 server of that era is about 60Gbit/s. Quite apart from the PCIe bus issue, there is no possible way to do work effectively on that much raw input.
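The memory side can be sketched the same way. The snippet below assumes DDR2-667 on a 64-bit channel, which is a guess at what these particular boxes carry rather than something measured. It works out the theoretical peak, then shows how much of the 60Gbit/s real-world figure would be eaten simply by writing 40Gbit/s of incoming data into RAM once.

```python
# Rough theoretical peak for a DDR memory subsystem.
def ddr_peak_gbits(mega_transfers_per_s, channels, bus_width_bits=64):
    """Theoretical peak memory bandwidth in Gbit/s."""
    return mega_transfers_per_s * bus_width_bits * channels / 1000


peak = ddr_peak_gbits(667, channels=2)  # DDR2-667, dual channel (assumed)
real_world = 60.0                       # Gbit/s, the figure quoted above
inbound = 40.0                          # Gbit/s arriving from the network

print(f"Theoretical peak: {peak:.1f} Gbit/s")
print(f"Real-world bandwidth used just storing the input: {inbound / real_world:.0%}")
```

On those assumptions the theoretical ceiling is about 85Gbit/s, and two thirds of the realistic bandwidth is gone before the CPU has read a single byte back.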
Prepare to jump
Even modern systems are facing such constraints. My beloved Intel quad-port 10GBE cards aren’t going to give you a simultaneous 40Gbit/s in a single direction. They are PCIe 2.0 x8 cards, and with PCIe 2.0 x8 limited to 32Gbit/s in a single direction, you simply cannot flatten all links on those cards at the same time.
The next generation offers more of the same: PCIe 3.0 delivers 126Gbit/s per x16 link. Flanked by emerging 40GBE and 100GBE adapters, this sounds great – until you run into the memory bandwidth bottleneck yet again. Your average quad-channel DDR3 server will only give you about 333Gbit/s.
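To put those two numbers side by side, here is a quick sketch. It derives the PCIe 3.0 figure from the standard 8GT/s per lane with 128b/130b encoding, takes the 333Gbit/s memory figure quoted above at face value, and asks how many saturated x16 links it would take to use up the memory bandwidth. Protocol overhead is ignored and the function names are illustrative only.

```python
# How many saturated PCIe 3.0 x16 links fit into the memory bandwidth?
PCIE3_GT_PER_LANE = 8.0         # raw signalling rate per lane, in GT/s
ENCODING_128B130B = 128 / 130   # 128b/130b line coding
MEMORY_GBITS = 333.0            # quad-channel DDR3 figure quoted above


def pcie3_gbits_per_direction(lanes):
    """Usable bandwidth of a PCIe 3.0 link in Gbit/s, one direction."""
    return PCIE3_GT_PER_LANE * ENCODING_128B130B * lanes


def links_to_fill_memory(link_gbits, memory_gbits=MEMORY_GBITS):
    """Number of fully loaded links whose traffic equals the memory bandwidth."""
    return memory_gbits / link_gbits


x16 = pcie3_gbits_per_direction(16)
print(f"PCIe 3.0 x16: {x16:.0f} Gbit/s per direction")
print(f"x16 links needed to fill memory: {links_to_fill_memory(x16):.1f}")
```

Fewer than three flat-out x16 links would account for the lot, which leaves nothing for the processors that are supposed to be doing something with the data.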
The moral of the story is that the whole of your system design can easily be less than the sum of its parts. If you need to make the jump to 10GBE – and many of us do – don’t try tacking it onto aging hardware.
Even if you don’t need the CPU speed increase a server chipset refresh brings, you will probably need it if you want to increase your I/O. ®
This article has been corrected to give the correct speed for PCIe 1.x: 16 lanes at 2Gbit/s per direction gives a total of 32Gbit/s in each direction.