Dell tunes up servers for high freaky traders
Making a PowerEdge sit up and shark on Wall Street
When you run a high-frequency trading operation, milliseconds are millions of dollars lost or gained. So speed is just as important as the algorithm you come up with to make your trades. That's why supercomputer makers Appro (just eaten by Cray), Silicon Graphics, and Penguin Computing in 2010 launched special overclocked servers aimed at HFT shops, and IBM followed suit a year later. And for many HFT workloads, the number of cores doesn't matter nearly as much as a high and consistent clock speed.
And so both Intel and AMD offered server makers non-standard Xeon and Opteron parts to juice their iron. Intel, for example, offered the "Everest" Xeon X5698, a variant of the six-core "Westmere-EP" processor for two-socket servers that had four of the six cores turned off and its clock speed permanently goosed to 4.4GHz.
As it turns out, Dell was also selling custom HFT servers to hedge funds and other financial services companies based on this "Everest" Xeon chip, Brian Payne, executive director of PowerEdge marketing at Dell, tells El Reg. (Dell never said anything about it at the time, keeping it on the down-low.)
But the Everest chips were not ideal, and that's probably one reason why Dell did not say much about its HFT servers.
"The thermals are completely different with this Everest chip, and they required a server redesign," Payne told us. "Plus they had no turbo mode, had only a one-year warranty, and carried a significant price premium." How much more the Everest Xeons cost compared to other low-core, high-clocking Xeons, Payne is not at liberty to say.
But what Payne can say is that the Everest chips had only a one-year warranty from Intel and were too expensive relative to the performance boost they delivered, which made them a "non-starter" for customers. "It was a good learning experience for us," says Payne.
And so Dell's techies went back to the drawing board and tried to come up with another way to juice the performance of its PowerEdge servers. They came up with what the company calls Processor Acceleration Technology.
With PAT, Dell goes into the BIOS of its standard PowerEdge R620, R720, and R720xd servers, and tweaks it so it tells the Xeon processors in the machines to run at a fixed Turbo Boost speed based on the number of cores you want to run on a Xeon E5 chip. The fewer cores you activate, the higher the clock speed you can peg the processor at.
The neat trick is that the BIOS tweaks set by Dell can be mixed with a setting in Linux that will not allow the Turbo Boost of a core to slow down if there is no work. Under normal circumstances, the power control unit (PCU) checks the status of each Xeon core every millisecond and lowers the frequency if there is no real work going on.
This check takes 1 microsecond, which may not seem like a lot, but apparently with HFT applications, a 0.1 per cent impact is a big deal even on such a short time scale, not so much because of the time, but because the cores get all jittery and non-deterministic.
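The 0.1 per cent figure falls straight out of the two numbers quoted above: a 1 microsecond check, once every millisecond. A minimal sketch of that arithmetic, with function and constant names invented purely for illustration:

```python
# Overhead of the periodic PCU check, using only the figures quoted in the article.
CHECK_DURATION_US = 1.0     # each check takes 1 microsecond
CHECK_INTERVAL_US = 1000.0  # one check per millisecond


def check_overhead_pct(duration_us: float, interval_us: float) -> float:
    """Share of wall-clock time spent in the check, as a percentage."""
    return 100.0 * duration_us / interval_us


if __name__ == "__main__":
    pct = check_overhead_pct(CHECK_DURATION_US, CHECK_INTERVAL_US)
    print(f"{pct:.1f} per cent")
```

The raw overhead is tiny; as the article notes, the bigger complaint is the non-determinism it introduces, which this sum does not capture.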
So Dell has cooked up some Linux and BIOS settings that will keep the cores revving at the top Turbo Boost speed as if this was their core frequency – even though it is not.
While you may not know it, a Xeon E5 processor can go above its official Turbo Boost frequencies for a short period of time if its thermals allow it, and the PAT BIOS and Linux settings stop this from happening as well. HFT shops want no jitter on their servers or their networks, so that everything runs the same all the time. So PAT basically makes a PowerEdge server sing at a higher note and hold it, not wavering higher or lower.
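Dell has not said here which Linux settings PAT relies on, so the following does not reproduce them. It is only a rough sketch of how one might check, from userspace, whether cores are holding a steady clock: it polls the standard Linux cpufreq sysfs file `scaling_cur_freq` and reports the spread between the highest and lowest reading per core. The helper names, sample count, and polling interval are assumptions made for illustration.

```python
import glob
import time
from typing import Dict, List

# Standard Linux cpufreq sysfs location; present only where a cpufreq driver is loaded.
CPUFREQ_GLOB = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq"


def read_core_freqs_khz() -> Dict[str, int]:
    """Read the kernel's current frequency estimate (kHz) for each core."""
    freqs: Dict[str, int] = {}
    for path in glob.glob(CPUFREQ_GLOB):
        core = path.split("/")[5]  # path component such as "cpu0"
        with open(path) as f:
            freqs[core] = int(f.read().strip())
    return freqs


def frequency_spread(samples_khz: List[int]) -> int:
    """Difference between the highest and lowest sample; 0 means a held clock."""
    return max(samples_khz) - min(samples_khz)


def sample_spreads(n_samples: int = 50, interval_s: float = 0.01) -> Dict[str, int]:
    """Poll every core n_samples times and return the per-core spread in kHz."""
    history: Dict[str, List[int]] = {}
    for _ in range(n_samples):
        for core, khz in read_core_freqs_khz().items():
            history.setdefault(core, []).append(khz)
        time.sleep(interval_s)
    return {core: frequency_spread(vals) for core, vals in history.items()}


if __name__ == "__main__":
    # Prints nothing on systems without cpufreq sysfs entries.
    for core, spread in sorted(sample_spreads().items()):
        print(f"{core}: spread {spread} kHz")
```

A core pegged at a fixed speed should show a spread at or near zero. Polling every 10 milliseconds is far coarser than the jitter HFT shops fret about, so treat this as a sanity check rather than a measurement tool.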
Payne says that the PAT settings are best combined with the top-bin E5-2690 processor, which clocks at 2.9GHz. With only one core activated, you can push it up to 3.8GHz and hold it, and with two cores you can hold it at 3.6GHz, and with four you can hold at 3.4GHz. That is obviously not as high as the 4.4GHz frequency of the special Everest part, but this approach gives you a significant boost at no extra cost and with a full three-year warranty.
If you need six or eight cores of oomph, you can clock the Dell boxes up to 3.3GHz using PAT. No matter what, you stay within the Xeon E5-2690's 135 watt power envelope. That compares well with the eight-core E5-2687W, which is aimed at workstations and has a 150 watt power envelope, a base clock speed of 3.1GHz across all cores, and a Turbo Boost max of 3.8GHz.
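The core counts and held clock speeds Payne quotes for the E5-2690 can be tabulated against the chip's 2.9GHz rated speed to see what each configuration buys you. This is a sketch using only the article's figures; the table and function names are made up for illustration, and the six-core and eight-core entries use the "up to 3.3GHz" ceiling.

```python
# Held clock speed (GHz) by number of active cores, as quoted for the E5-2690 under PAT.
BASE_GHZ = 2.9
PAT_HELD_GHZ = {1: 3.8, 2: 3.6, 4: 3.4, 6: 3.3, 8: 3.3}


def uplift_pct(held_ghz: float, base_ghz: float = BASE_GHZ) -> float:
    """Clock speed gain over the rated base frequency, as a percentage."""
    return 100.0 * (held_ghz / base_ghz - 1.0)


if __name__ == "__main__":
    for cores, ghz in sorted(PAT_HELD_GHZ.items()):
        print(f"{cores} core(s): {ghz:.1f}GHz, {uplift_pct(ghz):.1f} per cent over base")
```

These are clock speed gains only; how much of that shows up in a trading application depends on the workload.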
The natural comparison to make for HFT servers is a four-core Xeon E5-2643, which runs at 3.3GHz and turbos (somewhat wobbly) at 3.5GHz. On a variety of benchmarks, the PAT-enabled E5-2690 provides anywhere from around 5 to 25 per cent better performance. The average seems to be around 10 per cent for real trading applications that have been benchmarked, says Payne.
The important thing is not the performance boost, but rather that this is a freebie BIOS patch that any PowerEdge customer can download and use with specific Linux settings. You can do this on your own PowerEdges if you have a hardware support contract and you are using the top-bin Xeon E5-2690 part.
There's no technical reason why the PAT settings won't work on other Xeons – or even Opterons – but for now, the E5-2690 is all that HFT customers are asking to goose. ®