AMD's first 64-bit ARM cores star in ... Heatless in Seattle*

Original URL: https://www.theregister.com/2014/08/11/amd_seattle_64_bit_arm/


* Relatively speaking – this SoC tries to be low-power, data-center-grade

Posted in HPC, 11th August 2014 16:01 GMT

Hot Chips 26: AMD today sheds more light on its "Seattle" 64-bit ARM architecture processor at the Hot Chips conference in Cupertino, California.

Take one glance at this new Opteron A1100-series system-on-chip, and you'll realize it's aimed squarely at servers rather than the traditional ARM scene of handheld gadgets and embedded computing – although that was to be expected: AMD CEO Rory Read said as much in April.

As expected, Seattle has eight Cortex-A57 cores – ARM's top-end design running 64-bit ARMv8-A code – and will be fabricated using a 28nm process. The cores will run at 2GHz or more.

The octo-core Seattle SoC will have 4MB of level-two cache and 8MB of level-three cache; two 64-bit DDR3/4 channels with ECC and two DIMMs per channel running up to 1866MHz, supporting up to 128GB of RAM per chip; and controllers for eight 6Gbps SATA3 ports, two 10Gbit Ethernet ports and eight lanes of generation-three PCIe.
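For a rough sense of scale, the interface figures above can be turned into theoretical peak bandwidths. The sketch below is back-of-the-envelope arithmetic, not AMD data. It assumes the quoted 1866MHz is the DDR transfer rate (1,866 million transfers per second), that PCIe gen three runs at 8GT/s per lane with 128b/130b encoding, and that SATA3 uses 8b/10b encoding. The function names are made up for illustration.

```python
# Back-of-the-envelope peak bandwidths for the interfaces listed above.
# Assumptions (not from AMD): "1866MHz" means 1866 MT/s; standard line encodings.

def ddr_peak_gbytes(channels: int, bus_bits: int, mega_transfers: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    return channels * (bus_bits / 8) * mega_transfers * 1e6 / 1e9

def pcie3_peak_gbytes(lanes: int) -> float:
    """PCIe 3.0, one direction: 8 GT/s per lane, 128b/130b encoding."""
    return lanes * 8e9 * (128 / 130) / 8 / 1e9

def sata3_peak_mbytes(ports: int) -> float:
    """SATA3: 6 Gbit/s line rate per port, 8b/10b encoding."""
    return ports * 6e9 * (8 / 10) / 8 / 1e6

dram = ddr_peak_gbytes(channels=2, bus_bits=64, mega_transfers=1866)
pcie = pcie3_peak_gbytes(lanes=8)
sata = sata3_peak_mbytes(ports=8)
gb_per_dimm = 128 / (2 * 2)  # 128GB spread over two channels x two DIMMs each

print(f"DRAM peak:       ~{dram:.1f} GB/s")
print(f"PCIe x8 peak:    ~{pcie:.1f} GB/s per direction")
print(f"SATA, all ports: ~{sata:.0f} MB/s")
print(f"Largest DIMM needed for 128GB: {gb_per_dimm:.0f}GB")
```

These are ceilings set by the bus widths and signaling rates; real throughput will be lower.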

Seattle also uses ARM's System Memory Management Unit (SMMU) to link the aforementioned interfaces to the A57 cores. The S in SMMU should really stand for Super or Steroids, because the SMMU does more than the usual address translation and access protection: it allows hypervisors to define per-guest OS translation tables, keeping guests in separate pools of physical RAM. The SMMU design has been kicking around for a few years now [PDF] but its use in virtualization is especially relevant to this server-grade SoC.
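To make the isolation idea concrete, here is a toy model of per-guest translation tables. It is a sketch only: the class and method names are invented, the 4KB page size is an assumption, and a real SMMU walks hardware page tables rather than Python dictionaries.

```python
# Toy model of per-guest address translation for device accesses.
# Everything here is hypothetical and illustrative; it is not ARM's SMMU interface.

PAGE = 4096  # assumed page size in bytes

class ToySmmu:
    def __init__(self):
        # guest id -> {guest page number: physical page number}
        self.tables = {}

    def map_page(self, guest, guest_page, phys_page):
        # Refuse to hand the same physical page to two different guests.
        for other, table in self.tables.items():
            if other != guest and phys_page in table.values():
                raise ValueError(f"physical page {phys_page} already owned by {other}")
        self.tables.setdefault(guest, {})[guest_page] = phys_page

    def translate(self, guest, addr):
        # Look up the address in this guest's own table only.
        page, offset = divmod(addr, PAGE)
        table = self.tables.get(guest, {})
        if page not in table:
            raise PermissionError(f"{guest}: no mapping for address {addr:#x}")
        return table[page] * PAGE + offset

smmu = ToySmmu()
smmu.map_page("guest-a", 0, 100)
smmu.map_page("guest-b", 0, 200)

# The same guest-visible address lands in a different physical page per guest.
print(hex(smmu.translate("guest-a", 0x10)))
print(hex(smmu.translate("guest-b", 0x10)))
```

The point of the model is that each guest's accesses are resolved through its own table, so an address that is valid for one guest says nothing about what another guest can reach.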

And if you like your SoCs, AMD has put a SoC within a SoC: a system control processor (SCP) packing a little Cortex-A5 core with 64KB of ROM; 512KB of SRAM; timers and a watchdog; the usual SPI, UART and I²C interfaces; TrustZone execution space; and a 1Gbps Ethernet remote management port (RGMII).

The idea of the SCP is to boot up, configure and monitor the main processor while maintaining its own (in theory) secure space to execute code. If the system running on the main processor falls over, or otherwise needs to be restarted from scratch, the SCP needs to be on hand (and uncompromised) to power cycle the machine or similar. The TrustZone component of the SCP is supposed to guarantee this by ensuring the system boots from a known good, secure state each time.

Seattle is not unique in having one of these sidekick CPUs to keep it on the straight and narrow – far from it – but it's worth noting its presence.

The computer within the computer ... Your Seattle chip actually includes two systems, one sorta hidden

The SCP follows UEFI 2.4, in that it starts first when the machine is powered up, initializes the main SoC, starts its own real-time operating system, and then releases the A57 boot core from reset to start the UEFI ARM firmware.

The OS running under your hypervisor running under your OS ... How Seattle is booted by the SCP

This sidekick processor also includes a coprocessor for accelerating cryptographic algorithms, which is attached either directly to the SCP or, via an interconnect, to the SMMU. This coprocessor includes a random number generator, and can perform zlib compression and decompression in hardware along with AES, Elliptic Curve Cryptography, RSA, and SHA algorithms.

Knock your server SoCs off ... the system-on-chip's features

Why switch x86-64 for ARMv8-A?

But why use ARM-compatible CPUs as the brains of the data center, you may be thinking. The ARM architecture is already in server warehouses – embedded within the controllers in your hard disks, for example. But now the architecture, famous for being low-power and low-complexity and thus ideal for battery-powered things, is leaping into compute. And it leaves people scratching their heads.

That ARM had to go 64-bit to enter the server space is obvious: it opens up 64-bit-wide virtual addresses to software, and it gave the British core designers the opportunity to come up with a clean new instruction set that resembles MIPS64. It also gives a little more headroom with physical memory, moving the architecture from 40-bit physical addresses (max 1TB) to 48-bit (256TB).
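The physical address figures follow directly from the address widths, as this quick check shows. The helper name is made up for illustration, and TB here means 2^40 bytes.

```python
# Addressable memory for a given physical address width.
# TB is taken as 2**40 bytes; addressable_tb is an illustrative helper name.

def addressable_tb(address_bits: int) -> int:
    return 2**address_bits // 2**40

for bits in (40, 48):
    print(f"{bits}-bit physical addresses: {addressable_tb(bits)}TB")
```

Each extra address bit doubles the reachable memory, so eight more bits buy a factor of 256.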

In touting Seattle, AMD argues that a lot of data center lifting – think front-end web servers – is unsuited to watt-gobbling, bogglingly complex x86-64 processors, and thus the job ought to be passed to chips that are smaller (so more can be crammed into racks) and less power hungry (always the USP of the ARM family).

Seattle measures 27mm x 27mm and is said to have a TDP of about 25 watts. The x86-64 eight-core 2GHz Intel Xeon E7-4820 v2, for example, is 52mm x 45mm and has a TDP of 105 watts, although we'll admit it's not a totally fair comparison.

Perhaps a fairer one would be the Intel Atom Processor C2758: eight x86-64 cores, 2.4GHz clock speed, 34mm x 28mm package using a 22nm process, and a TDP of 20W – revealing Intel's reaction to potential competition in the low-power data center market: drastically slimming down its x86 iron.
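The three chips can be lined up with some simple arithmetic. The sketch below uses only the figures quoted above and takes the dimensions at face value; footprint and TDP per core are crude yardsticks, not benchmarks, and the dictionary layout and function name are invented for illustration.

```python
# Crude density comparison using only the figures quoted in the article.
chips = {
    "AMD Seattle":           {"w_mm": 27, "h_mm": 27, "tdp_w": 25,  "cores": 8},
    "Intel Xeon E7-4820 v2": {"w_mm": 52, "h_mm": 45, "tdp_w": 105, "cores": 8},
    "Intel Atom C2758":      {"w_mm": 34, "h_mm": 28, "tdp_w": 20,  "cores": 8},
}

def summarize(chip):
    """Return (footprint in mm^2, TDP in watts per core)."""
    area = chip["w_mm"] * chip["h_mm"]
    return area, chip["tdp_w"] / chip["cores"]

for name, chip in chips.items():
    area, watts_per_core = summarize(chip)
    print(f"{name:<22} {area:>5} mm^2  {watts_per_core:.2f} W/core")
```

By this measure Seattle sits far closer to the Atom than to the Xeon, which is the comparison the article suggests is the fairer one.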

Server software handling large numbers of concurrent requests can end up touching a large range of data, causing a high rate of costly data cache misses [Bhatia et al, 2006]. Thus, AMD argues, you may as well use CPUs with smaller caches and lower complexity, which gives you lower power consumption and higher density.
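The argument can be put in rough numbers with a textbook first-order model, in which effective cycles per instruction equal a core's base figure plus the time lost to cache misses. Every number below is invented for illustration and none comes from AMD. The sketch also assumes both cores suffer the same miss rate, which is the case the argument describes: a working set too large for either cache.

```python
# First-order model: effective CPI = base CPI + misses per instruction * miss penalty.
# All figures are hypothetical; none come from AMD or the article.

def effective_cpi(base_cpi, misses_per_instr, miss_penalty_cycles):
    return base_cpi + misses_per_instr * miss_penalty_cycles

def big_core_advantage(misses_per_instr, penalty=200, big_base=0.5, small_base=1.0):
    """How many times faster the hypothetical big core is at the same clock speed."""
    big = effective_cpi(big_base, misses_per_instr, penalty)
    small = effective_cpi(small_base, misses_per_instr, penalty)
    return small / big

for mpi in (0.001, 0.005, 0.02):
    per_thousand = mpi * 1000
    print(f"{per_thousand:.0f} misses per 1,000 instructions: "
          f"big core {big_core_advantage(mpi):.2f}x faster")
```

As the miss rate climbs, memory stalls dominate both cores' time and the big core's lead shrinks, while its extra power and area cost stay the same.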

"Seattle is a dense server processor for data center applications. Performance per dollar per watt drives today’s data center designs," AMD's Sean White will tell Hot Chips from 5.30PM today, California time.

"A significant number of data center workloads have inherently low Instructions Per Clock (IPC) and high cache miss rates. For such workloads, processors like Seattle, with smaller cores and caches, can deliver the equivalent performance as traditional server processors with large cores and caches, but using much less power and area."

Building software and hardware for Seattle

AMD will also show off its Seattle reference system that doubles as a $2,999 development kit: a 2U rack-mount box with one PCIe gen-3 x8 slot or two x4 slots, ports for up to eight hard drives, a microATX motherboard with a Seattle SoC, two 10Gbit Ethernet ports, four I²C interfaces, two serial ports, and 64-bit ARM Linux in the form of a Fedora distro, ARMv8-A versions of Java 7 and 8, and the usual GCC toolchain.

Hot Chips ... the Seattle reference board

AMD is, of course, not alone in churning out 64-bit ARM-flavored chippery, although Seattle is its first. Allwinner and Samsung are aiming theirs at fondleslabs, handsets and the like; processor core architect ARM itself is touting the Juno board to developers; and Apple has been packing ARMv8-A compatible SoCs in its iPhones. Upstart Calxeda sadly couldn't bring its 64-bit ARM cores to market soon enough, causing it to run out of money.

The Seattle silicon, fabricated by GlobalFoundries, is due to ship in the fourth quarter of 2014. ®
