Cavium snubs MIPS, picks 64-bit ARM for next-gen network SoCs
Designers of brains inside your networking gear wheel out the 24-core Octeon-TX
Cavium – the brains behind the chips in big-brand networking products – has plumped for the ARM architecture over MIPS in its next-generation network processors.
It's a sign that more and more serious networking gear is likely to be ARM powered rather than MIPS in future.
The Octeon-TX family of system-on-chips, announced today, will use up to 24 64-bit ARMv8 cores taken from Cavium's ThunderX range of server-grade CPUs. Previous Octeon SoCs have used 64-bit MIPS cores.
Founded in 2001, San Jose-based Cavium is a fabless semiconductor designer that produces chips for the likes of Cisco, F5, Aruba, Netgear, Nokia Siemens, Juniper, Samsung, LG and others. Its Octeon SoCs turn up in cellphone base stations, and edge and core switches and routers, where MIPS is a traditional architecture.
With the Octeon-TX, Cavium has yanked MIPS out of Octeon, kept the underlying networking-focused silicon in place, and locked in its server-class ARM ThunderX cores. It's aiming these chips at storage and data center products, switches, industrial and embedded control systems, security boxes, and virtualized network appliances.
The MIPS architecture hasn't been completely thrown under the bus. A spokesman for Cavium told El Reg that the MIPS-based Octeon system-on-chips are still in production: "The entire MIPS based OCTEON III product family that is in the same technology node, is in production. We are extending the line to include ARM-based SoCs."
Judging from these Cavium slides, though, the future of Octeon lies in the ability to run established Linux distributions and open-source stacks – and right now, Cavium feels ARM is best suited for those jobs.
As Cavium puts it, the Octeon-TX uses ARM to take advantage of the "rich software ecosystem, extended support of open source applications and virtualization features of the ThunderX family of server processors." Basically, it didn't want to do that on MIPS.
Sure, there is plenty of tried-and-tested proprietary MIPS software out there for running base stations and core routers.
However, Cavium's on to the fact that more and more enterprises and data centers want to run a mix of closed and open-source software on their networking kit, and ARM is seen as a better bet than MIPS in this area. While there's nothing wrong with the MIPS architecture, ARM has a shedload of momentum behind it right now, plenty of engineers are aware of it, and lots of software is ported to it.
"Once upon a time you just couldn’t take an embedded processor and run, say, Python on it. It was a problem," Venkat Sundaresan, director of product line marketing for Cavium's Infrastructure Processor Group, told The Register. "With modern ARM cores, all of this is now available."
(Yeah, we know you can run Python on modern MIPS processors too, but Sundaresan's point was that, in his view, ARM is king at the moment.)
There's also the fact that Cavium wants to bring the server-grade hypervisor features of the ThunderX line into the Octeon mix; modern MIPS has virtualization support in hardware, too, but Cavium went with its homegrown ARM-flavored tech anyway.
Here's the all-important product roadmap with speeds and feeds:
Samples of Octeon-TX components with up to four 2GHz 64-bit ARMv8 ThunderX cores will be sent out this month to Cavium's customers. These SoCs feature up to 2MB of level-two cache, two integrated 10GbE or eight 1GbE interfaces, two SATA 3.0 ports, PCIe gen-three interfaces, and cryptography accelerators.
Full-fat Octeon-TX SoCs with up to 24 cores clocked at up to 2.2GHz, 12 10GbE or three 40GbE interfaces, PCIe 3 connectivity and SATA ports, and crypto accelerators handling up to 40Gbps of traffic are due to land in the third quarter of the year. The design can scale all the way up to 96 cores, we're told.
The accelerators support keys of up to 8,192 bits and let the chips stream up to 40Gbps of encrypted tunnels. Their block ciphers are programmable: you can update the algorithms as needed. The high-speed encryption is seen as essential for enterprises streaming sensitive data to and from the cloud. The chips can also perform inline storage encryption, and SSL encryption, "at line rate."
Each Octeon-TX SoC has Cavium's Nitrox V security coprocessor built in, which accelerates cryptographic, decompression and compression algorithms in hardware, thus taking the strain off the main CPU cores.
The idea behind the large core count is to dedicate groups of them to particular tasks. So, for example, four cores could run a real-time operating system, such as VxWorks from Intel-owned Wind River, to juggle data plane workloads, such as handling traffic for the SATA controllers. Another bunch of cores could run a different operating system, say Linux, to handle control plane functions.
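To make the core-partitioning idea concrete, here is a minimal Python sketch that carves a 24-core part into named groups. It is an illustration only, not Cavium's tooling: the function name `partition_cores`, the group names, and the choice of eight control-plane cores are all assumptions made for the example. Only the four-core data plane group comes from the scenario above.

```python
def partition_cores(total_cores, groups):
    """Hand out contiguous core IDs to each named group, in order.

    `groups` maps a (hypothetical) group name to the number of cores it
    should get. Anything left over is listed under "unassigned".
    """
    plan = {}
    next_core = 0
    for name, count in groups.items():
        if next_core + count > total_cores:
            raise ValueError(f"not enough cores left for group {name!r}")
        plan[name] = list(range(next_core, next_core + count))
        next_core += count
    plan["unassigned"] = list(range(next_core, total_cores))
    return plan


# Four cores for a real-time data plane OS, as in the example above;
# the eight control-plane cores are an assumed figure for illustration.
plan = partition_cores(24, {"data_plane_rtos": 4, "control_plane_linux": 8})
for name, cores in plan.items():
    print(name, cores)
```

On a Linux system, a process could then be pinned to one of these groups with the standard library call `os.sched_setaffinity(0, set(cores))`. How a hypervisor or a real-time operating system on the SoC reserves its cores is not covered by this sketch.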
Each core can check a work queue for things to do, and pick up a job when something needs to be done, with memory pointers passed between cores so they can find the data they need.
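The work-queue pattern described above can be sketched in a few lines of Python. This is a software analogy under stated assumptions, not a model of the hardware: threads stand in for cores, a `bytearray` called `packet_memory` stands in for shared packet memory, and an (offset, length) pair plays the role of a memory pointer. All of those names are hypothetical.

```python
import queue
import threading

# Stand-in for shared packet memory; workers receive "pointers" into it.
packet_memory = bytearray(64)
packet_memory[0:5] = b"hello"
packet_memory[16:21] = b"world"

work_queue = queue.Queue()
results = []
results_lock = threading.Lock()


def worker(core_id):
    """Poll the queue and process one job at a time until told to stop."""
    while True:
        job = work_queue.get()
        if job is None:  # sentinel: no more work for this worker
            break
        offset, length = job  # the "pointer": where the data lives
        data = bytes(packet_memory[offset:offset + length])
        with results_lock:
            results.append((core_id, data))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()

# Only small descriptors travel through the queue, never the payload itself.
work_queue.put((0, 5))
work_queue.put((16, 5))
for _ in threads:
    work_queue.put(None)
for t in threads:
    t.join()

print(sorted(data for _, data in results))
```

The point of passing descriptors rather than data is that the payload is written once and stays put. Whichever worker picks up the job reads it in place.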
Sundaresan told us the Octeon-TX family offers "35 per cent more cores for the same price point with a higher bandwidth" than comparable Intel processors. Intel is touting Broadwell Xeon E5 v4 chips with up to 22 cores that are aimed at, among various things, virtualized networking boxes.
Each Octeon-TX SoC also has a hardware packet buffer manager that can automatically strip off the OSI layer one and two headers from packets, extract the layer three data and queue it up for a core to pick up. The programmable packet processing units can crunch up to 50 million packets per second. The SoCs can also provide virtual network interfaces in hardware for software stacks that need those.
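As a rough software picture of that header stripping and queueing, the Python sketch below peels the Ethernet header off a frame, pulls a flow key out of the IPv4 packet inside, and hashes it to a queue number. It rests on several assumptions: the article does not say how the hardware picks a queue, so the CRC32 hash is a stand-in, only untagged Ethernet carrying IPv4 is handled, and every function name here is hypothetical.

```python
import struct
import zlib

ETH_HEADER_LEN = 14      # destination MAC + source MAC + EtherType
ETHERTYPE_IPV4 = 0x0800


def strip_ethernet(frame):
    """Drop the layer two header and return the IPv4 packet inside."""
    if len(frame) < ETH_HEADER_LEN + 20:
        raise ValueError("frame too short")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("this sketch only handles IPv4")
    return frame[ETH_HEADER_LEN:]


def flow_key(ip_packet):
    """Build a key from addresses, protocol and (for TCP/UDP) ports."""
    header_len = (ip_packet[0] & 0x0F) * 4
    protocol = ip_packet[9]
    src, dst = ip_packet[12:16], ip_packet[16:20]
    ports = ip_packet[header_len:header_len + 4] if protocol in (6, 17) else b""
    return src + dst + bytes([protocol]) + ports


def pick_queue(frame, num_queues):
    """Map a frame to a queue; packets of one flow share a queue."""
    return zlib.crc32(flow_key(strip_ethernet(frame))) % num_queues


def make_frame(src_port):
    """Assemble a tiny UDP-over-IPv4 test frame (checksums left at zero)."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    udp = struct.pack("!HHHH", src_port, 53, 8, 0)
    return eth + ip + udp


print(pick_queue(make_frame(1234), 8))
```

Hashing on the flow key keeps every packet of a given flow on the same queue, so whichever core drains that queue sees the flow's packets in order.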
"Millions of flows can come in, and these are extracted fast. Using the buffer management, we line up the packets into queues and assign those queues to cores," said Sundaresan.
Essentially, Cavium thinks its ARM-based next-gen SoCs provide better performance-per-dollar and performance-per-watt figures than the competition, by offering a dense mix of lower-power cores and tightly integrated interface controllers.
The high core count lets programmers assign cores to specific tasks, reducing the number of virtual machines running per core or group of cores. This lowers the hypervisor overhead, reduces interrupt latency, and gives Cavium something to talk about to catch the attention of networking vendors as they shop around for chips, potentially lured by the energy Intel is throwing at virtualized network functions. ®