Microsoft boffins: Who needs Intel CPUs when you've got FPGAs?
Bing searches get speed boost from Catapult integration
Microsoft hooks up reprogrammable chips directly to its data centers' internal networks to ramp up the performance of its web applications.
The Windows giant is so impressed by the tech, it reckons the customizable hardware could eventually take on more computational work than the Intel workhorse processors that today fill its servers. This comes as Google signals its intention to run non-Intel CPUs in its data centers.
In a research paper titled A Cloud-Scale Acceleration Architecture, Microsoft describes how mainlining field-programmable gate arrays straight into its network boosts the performance of Bing search query returns and other services. The paper appeared online today, and will be published by the IEEE Computer Society's 49th symposium on microarchitectures, which kicked off on Saturday.
FPGAs are reprogrammable chips that are chock full of decision-making circuits that can be linked together to form an application in silicon: think of the arrays as a box of Lego bricks that you can assemble as needed to build a new thing. Rather than create toy space stations and pirate ships, though, you're crafting specialized hardware that can crunch data at high speed, ideally far faster than software running on a generic processor.
Microsoft has been exploring the use of FPGAs since 2010 with its Project Catapult effort. Last month, it went public with its use of the Intel Altera FPGA chips in its Azure cloud.
The logic arrays sit on a daughter board within their host servers together with their own RAM. The boards are connected to the host CPUs by a PCIe gen-3 interface. They are also wired directly into a nearby switch using a 40Gbps QSFP link, and pass packets between the network and the host server's traditional NIC. Thus, the FPGA acts as an accelerator, directly manipulating data as it flows in and out of the machine.
Microsoft's design essentially moves the FPGA into the direct path of incoming requests and outgoing data, which saves having to shunt data from the NIC to the FPGA and back again over an internal system bus. The arrays can handle tasks entirely by themselves in silicon, or pass information to the host Intel x86 processors to operate on if needed, potentially performing additional manipulations on the data while it is in transit.
For example, the FPGAs can encrypt and decrypt data on the fly before it hits the app running on the box.
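To make that data path concrete, here is a toy software model of an accelerator sitting between the network and the host. It is only a sketch: the names `Packet`, `InlineAccelerator` and `xor_transform` are invented for illustration, the XOR step is a placeholder rather than real encryption, and nothing here reflects Microsoft's actual hardware logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Packet:
    dest: str       # hypothetical tag: "fpga" = accelerator answers alone, "host" = CPU needed
    payload: bytes


def xor_transform(data: bytes, key: int = 0x5A) -> bytes:
    """Toy stand-in for an inline transform such as encryption (NOT real crypto)."""
    return bytes(b ^ key for b in data)


class InlineAccelerator:
    """Models a device placed between the network link and the host's NIC."""

    def __init__(self, host_handler: Callable[[bytes], bytes]):
        # host_handler stands in for the application running on the host CPU
        self.host_handler = host_handler

    def from_network(self, pkt: Packet) -> Packet:
        plain = xor_transform(pkt.payload)        # "decrypt" on the way in
        if pkt.dest == "fpga":
            reply = plain.upper()                 # handled without touching the CPU
        else:
            reply = self.host_handler(plain)      # handed up to the host application
        return Packet(pkt.dest, xor_transform(reply))  # "encrypt" on the way out


# Usage: the host "application" simply reverses the bytes it receives
acc = InlineAccelerator(host_handler=lambda b: b[::-1])
reply = acc.from_network(Packet("host", xor_transform(b"query")))
```

The point of the arrangement is that the host handler only ever sees plaintext, while everything on the wire stays transformed, and some requests never reach the CPU at all.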
The result, says Microsoft, has been a decrease in latency for all of the cloud services using the design. "By enabling the FPGA to talk directly to the network switch, each FPGA can communicate directly with every other FPGA in the datacenter, over the network, without any CPU software," Microsoft researchers write in their paper.
"This flexibility enables ganging together groups of FPGAs into service pools."
By doing this in the Bing web search data centers, Microsoft researchers found that the servers were able to handle higher loads of search queries more quickly, and with fewer machines. We're told that the FPGA-accelerated design has been "deployed at hyperscale in Microsoft’s production data centers worldwide."
The Redmond researchers believe that the integration of FPGAs can have similar results with other cloud computing platforms, whether they be web applications or local cloud tasks such as software-defined networking.
"With the Configurable Clouds design, reconfigurable logic becomes a first-class resource in the datacenter, and over time may even be running more computational work than the data center's CPUs," Microsoft concludes.
For what it's worth, Intel is working on x86 Xeon processors with built-in FPGAs. ®