Original URL: https://www.theregister.com/2012/03/27/arista_networks_fpga_switch/

Arista juices switch with x86 server, FPGA, atomic clock

Not a God box – but close

By Timothy Prickett Morgan

Posted in Networks, 27th March 2012 18:20 GMT

Upstart switch-maker Arista Networks, founded by serial entrepreneur Andy Bechtolsheim, is at it again, mashing up new kinds of iron to tackle problems in the data center. This time, Arista is bundling an atomic clock, a baby x86 server, flash memory, and field programmable gate arrays into its Ethernet switches to create what it is calling an "application switch".

This is not a network acceleration appliance of the kind built by Cisco Systems, Citrix Systems, F5 Networks, Riverbed Technology, and others, proclaims Arista marketing veep Doug Gourlay. Those boxes, he tells El Reg, tend to be modified x86 servers with a new label slapped on the front. They do not actually participate in the switching or routing of network traffic themselves, but are rather static devices that do something to traffic to accelerate an application in some fashion.

Arista thinks that a select number of customers – those who have very demanding applications of a very precise nature and very specific network bandwidth and latency needs – will find its new 7124FX application switch to be just what they're looking for.

"But what we, Arista, is not building is as important as what we are building," Gourlay says. "This is not a God box. The 7124FX will not solve all of your problems and it is not for everybody."

But the 7124FX certainly does sound like it is something that stock exchanges, governments, supercomputing, medical, and telecom clients will consider for specific workloads.

Arista's 7124FX FPGA-accelerated switch

The 7124FX design starts with the 7124SX 10 Gigabit Ethernet switch that Arista launched this time last year. This 24-port switch is based on the Fulcrum "Bali" ASIC from Intel, which has fewer ports but less jitter than the Broadcom Trident+ ASIC that Arista uses in other switches.

This particular Bali ASIC is designed to deliver 500-nanosecond latency on port-to-port hops, no matter what size packet is being pushed through this Layer 2 and 3 switch. The 7124SX has 480Gb/sec of bandwidth and can handle 360 million packets per second, and also has custom extensions that Arista has made with Intel to link to the PHYs in the switch in order to lower latency.
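The article does not say how the bandwidth and packet-rate figures are derived. A quick sanity check, assuming they are the usual full-duplex aggregate and the minimum-size-frame packet rate for 24 ports of 10 Gigabit Ethernet, lands close to both numbers. The frame overheads used here are standard Ethernet values, not something Arista supplied:

```python
# Back-of-the-envelope check of the quoted 7124SX figures.
# Assumption: "480Gb/sec" counts both directions of every port, and
# "360 million packets per second" is the rounded rate for minimum-size frames.

PORTS = 24
LINE_RATE_BPS = 10_000_000_000  # 10 Gigabit Ethernet, per port, per direction

# Full-duplex aggregate: every port sending and receiving at line rate.
aggregate_gbps = PORTS * LINE_RATE_BPS * 2 / 1e9

# A minimum Ethernet frame is 64 bytes, plus 8 bytes of preamble/start
# delimiter and a 12-byte inter-frame gap on the wire.
MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12
bits_per_min_frame = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8

packets_per_sec_per_port = LINE_RATE_BPS / bits_per_min_frame
total_mpps = PORTS * packets_per_sec_per_port / 1e6

print(f"aggregate bandwidth: {aggregate_gbps:.0f} Gb/sec")
print(f"minimum-frame packet rate: {total_mpps:.1f} million packets/sec")
```

Under those assumptions the aggregate comes out at exactly 480Gb/sec and the packet rate at roughly 357 million per second, which rounds to the quoted 360 million.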

Arista has crammed a few extra goodies into the 7124FX application switch, a funky derivative of the 7124SX. There's a server coprocessor with a dual-core Turion Neo X2 processor from AMD, for one, which has 4GB of main memory and 4GB of flash memory.

The 7124FX also has a 50GB flash-based SSD tucked inside of it for persistent storage for that x86 coprocessor. What makes it an FX switch, though, is the Altera Stratix V field programmable gate array, which has 6.2 million gates and which can be programmed to emulate various kinds of hardware or to run very specific algorithms right in the network flow.

That FPGA has 50MB of onboard SRAM memory, another 216MB of QDRII SRAM, and 8GB of DDR3 main memory all for itself; it links to the x86 coprocessor through a PCI-Express 2.0 slot.

The sixteen SFP/SFP+ ports on the left of the 7124FX switch feed into the switch ASIC, and the eight ports on the right feed into the FPGA and then into the ASIC. The assumption is that not all ports will need inline acceleration running FPGA algorithms and applications.

On the upper right hand side of the 7124FX you'll see a micro coax port, which is the input for an atomic clock if you want to synchronize the transactions or packets going through the banks of 7124FX switches using a rubidium atomic clock that has less than 200 picoseconds of drift per year.

What on earth does a switch need an atomic clock for – besides just being cool? Stock exchanges have to be fast, and that means everybody gets a 300-foot piece of fiber optic cable with which to connect into the exchange systems. Stock exchanges also have to be fair, which means they have to process transactions in the order that they hit the matching engines.

Devious moneymen? Who knew!

But there is a problem, explains Gourlay, and that is that the high frequency traders have learned to game the system. You see, if you learn how the stock exchange's network of matching engines is laid out, you can flood one matching engine with cancellations, thus slowing down that matching engine and any competitor's transactions that were directed to it, while at the same time redirecting your trades to a different matching engine, where they pass through like a photon turd moving through a glass goose. This is how you create a flash crash on Wall Street if everyone is doing it at the same time.

But what if, suggests Gourlay, you use a set of synchronized atomic clocks to timestamp all of the packets and therefore all of the transactions coming into the exchange at the switch port level? This does two things. First, you know how to rank who goes first because every bit of data has a timestamp, and you can get out of the business of using GPS systems to try to correlate the timing of transactions.

Also, synchronized atomic clocks wired into the switches allow for the stock exchanges and their trading customers to put their iron anywhere they want instead of cramming it into big wonking data centers in New York, London, Frankfurt, Paris, and Tokyo. Such clocks may make it cheaper to do automated, high-frequency trading, and the exchanges make it up in volume, so you know this sounds appealing to them.
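The ranking idea Gourlay describes can be illustrated with a toy model. This is only a sketch of timestamp-ordered sequencing, not Arista's or any exchange's implementation: the `StampedOrder` type, the `merge_by_arrival` function, and the tie-break on port number are all hypothetical, and it assumes every switch port stamps packets from the same synchronized clock.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class StampedOrder:
    """Hypothetical record: an order tagged at the switch port on arrival."""
    timestamp_ns: int  # ingress time from the shared, synchronized clock
    port: int          # switch port the packet arrived on
    order_id: str


def merge_by_arrival(*port_streams):
    """Merge per-port streams into one global sequence by ingress timestamp.

    Each stream must already be in timestamp order, which holds for packets
    stamped as they arrive on a single port. Ties are broken by port number,
    an arbitrary choice made only to keep the output deterministic.
    """
    return list(
        heapq.merge(*port_streams, key=lambda o: (o.timestamp_ns, o.port))
    )


# Two traders on different ports, possibly on different switches.
port1 = [StampedOrder(100, 1, "A1"), StampedOrder(250, 1, "A2")]
port2 = [StampedOrder(120, 2, "B1"), StampedOrder(240, 2, "B2")]

sequence = merge_by_arrival(port1, port2)
print([o.order_id for o in sequence])  # ['A1', 'B1', 'B2', 'A2']
```

Because the ordering depends only on the stamp applied at the port, it does not matter which matching engine an order is routed to afterwards or how congested that engine is.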

While people don't talk about it much, FPGAs are used all over the place in systems today, including financial systems and various kinds of radar and signal processing systems used by the military and transportation agencies.

The trouble with FPGAs, however, is that they are difficult to productize and put into the workflow, which is one of the reasons why Solarflare, a maker of high-end 10GbE network interface cards for servers, soldered an FPGA onto its network cards and is helping customers do in-line preprocessing, processing, and postprocessing of data as it comes into and exits servers.

Solarflare is calling this the Application Onload Engine, and for freaky trading apps the company has been able to show a 3X boost in performance by moving some algorithms to the FPGAs instead of running them on the server's CPUs. Solarflare is encouraging developers to write FPGA code that will run on its NICs, and is starting in the financial-services racket where it is best known.

Arista, which is also well known on the freaky trading scene, is similarly expecting to build a community of software vendors who will create turnkey apps or help customers port applications from the CPUs to the FPGAs to accelerate them. Early adopters of the 7124FX will code their apps themselves, says Gourlay, since many of them already have FPGA experience.

Arista expects inline risk analysis and algorithmic trading to be popular routines to run on the FPGAs in conjunction with the embedded x86 server inside of the 7124FX switch, and in some cases, companies may not even need an outboard server at all to run their applications. (Crazy, isn't it?)

Feed handling, real-time data analysis, order execution routing, and order-protocol conversion are also possible apps that can be moved from x86 servers in the racks to the FPGA-x86 server inside of the switch. Arista is working with Impulse C, which has tools that convert C programs into RTL for the FPGA. Enyx is also signing up to build custom trading solutions, and Novasparks and Exegy are planning to build turnkey appliances based on the 7124FX and aimed at the financial-services sector.

The regular 7124SX switch costs $13,000, but getting the full-on AppSwitch 7124FX switch with the x86 server, the FPGA, and the 50GB SSD will run you $49,995. Adding the atomic clock will cost you another $10,000 on top of that. The AppSwitch 7124FX will be available sometime in the second quarter. ®