TI fuels up KeyStone II ARM for HP Moonshot hyperscale servers

ARM/DSP hybrid presents data centers with interesting possibilities

Hewlett-Packard is putting more ARM server processor options into its next generation of "Project Moonshot" hyperscale servers. The latest one comes from Texas Instruments, which has been relatively quiet on the server front but plenty active in the ARM chip market at large.

In a blog post, Tim Wesselman, senior director of ecosystem strategy for the HyperScale Business Unit at HP, said that TI has joined the Moonshot PathFinder partner program and would be figuring out how to put its KeyStone II variants of the ARM RISC processor into Moonshot boxes for "large-scale, concurrent real-time processing of cloud and traditional telecommunications workloads."

Like all of the exciting ARM server processors either out the door or in the works from Calxeda, Marvell, and Applied Micro Circuits, the KeyStone II chips are not just ARM processors tricked out to run server workloads. They also include integrated networking that is used to lash multiple server nodes together into a fabric.

While Advanced Micro Devices has not announced its plans, it seems likely that a future Opteron-branded ARM processor will include on-chip network adapters and links to the SeaMicro "Freedom" fabric at the very least, if not a distributed switch architecture like Calxeda has cooked up. If not, AMD should just not bother.

The word on the street is that HP will use the TI KeyStone II chips in its second-generation "Gemini" Moonshot servers, which the company previewed last summer. During that preview last June, the only processor that HP talked about was Intel's "Centerton" Atom S1200 server chip, which was announced in December last year, and it never mentioned ARM processors, not even once. (Funny that.)

HP has not provided much in the way of feeds or speeds for the Gemini machines except that they will use the two-core Atom S1200 processor, which has 64-bit processing, supports VT virtualization assist and ECC scrubbing on main memory, and is certified to run server variants of Linux and Windows.

HP's Project Moonshot Gemini enclosure

From this meager picture, it looks like the Gemini chassis is around 10 rack units high and will have two bays into which server "cartridges" will be loaded. That is the full extent of anyone's knowledge of Gemini machines from any public statements, and HP did not say, as some chatter suggests, that the TI KeyStone ARM processors will be used in the future Gemini chassis.

It has not said, either, how the Gemini machines will stack up compared to the "Redstone" Moonshot boxes that HP launched in November 2011 using the 32-bit Calxeda ECX-1000 ARM chips, which include an on-chip distributed Layer 2 switch that is very clever.

Because of Calxeda's long-time work with HP on Redstone and the fact that Calxeda's products are in the HP Discovery Labs now, Calxeda ARM chips will very likely be in the Gemini machines, but neither Calxeda nor HP would confirm this.

It is entirely possible that another Moonshot machine – perhaps Saturn or Apollo, depending on whether HP is going to use the booster or the capsule name – is next and will be based on Open Compute's Group Hug microserver backplane and form factor. That may be where current and future ARM servers, and maybe even future Atom, Xeon, and Opteron servers, will be used.

Those of us outside of the HP Discovery Lab's NDAs – or those given to potential customers for Moonshot boxes – just don't know. (And if you do, please, do tell.)

What El Reg can discuss are the neat features of the KeyStone II system-on-chip (SoC) designs. TI has cooked up plain-vanilla Cortex-A15 processors, which have two or four 32-bit cores with 40-bit memory addressing (known as Large Physical Address Extensions in the ARM world), as well as hybrid ARM processors that mix anywhere from one to four Cortex-A15 cores with from one to eight TMS320C66x digital signal processors on a single piece of silicon.
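To put the 32-bit cores and 40-bit addressing in perspective, here is a back-of-the-envelope sketch of how much memory each address width can reach. The function name is ours, invented for illustration, and the figures are plain powers of two rather than anything taken from TI's spec sheets.

```python
# Rough sketch: how much memory a given address width can reach.
# addressable_bytes() is a hypothetical helper, not a TI or ARM API.

GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Return the number of distinct byte addresses for a given width."""
    return 2 ** address_bits


plain_32bit = addressable_bytes(32)   # what a 32-bit address can reach
with_lpae = addressable_bytes(40)     # 40-bit physical addressing

print(f"32-bit addressing: {plain_32bit // GIB} GiB")
print(f"40-bit addressing: {with_lpae // GIB} GiB")
print(f"Ratio: {with_lpae // plain_32bit}x")
```

The extra eight address bits multiply the reachable physical memory by 256, which is the difference between a 4GiB ceiling and a 1TiB one.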

Block diagram of the KeyStone II system-on-chip from Texas Instruments

The interesting bit is that these ARM-DSP hybrids are using the same TMS320C66x DSP elements – and using the same TeraNet coherency network to lash them into an SoC – that TI was peddling as coprocessors for x86 iron back at the SC11 supercomputing event a little more than a year ago.

The architecture for the DSPs and the ARM chips is exactly the same and is known by the same KeyStone name, too. The difference now is that they can have ARM cores etched on them if you want, or if you don't want any DSPs at all and just ARM cores, then TI is fine with that, too.

With this approach, TI can go after pure cloud infrastructure workloads – servers, switches, routers, network control planes, industrial sensors, and wireless transport devices – with the plain vanilla ARM versions of the KeyStone II chips.

It can dial back the DSP count on the hybrid chips to go after video, IP camera, traffic system, voice gateway, and medical device applications, and dial up the DSP count for heavier workloads like supercomputing, video conferencing, image processing and analytics, medical imaging, and even virtual desktop infrastructure.

Eight of those DSPs can offer around 1 teraflops of floating point performance at single precision and around 384 gigaflops at double precision, and the next-generation DSPs from TI are expected to do a lot better.

The thing to watch is the performance per watt. The plan was to be able to do 2 teraflops at 200 watts, and toss in a few ARM chips and you have a pretty interesting module for a ceepie-deepie supercomputer.
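For the performance-per-watt angle, the arithmetic is simple enough to sketch. The only inputs below are the 2 teraflops and 200 watt targets quoted above; the helper name is made up for illustration, and the result is a planning target, not a measured figure.

```python
# Rough sketch: convert a teraflops-at-watts target into gigaflops per watt.
# gflops_per_watt() is a hypothetical helper for illustration only.


def gflops_per_watt(teraflops: float, watts: float) -> float:
    """Return gigaflops delivered per watt for a given target."""
    return teraflops * 1000.0 / watts


# Target quoted in the article: 2 teraflops in a 200 watt envelope.
target = gflops_per_watt(teraflops=2.0, watts=200.0)
print(f"Target efficiency: {target:.1f} gigaflops per watt")
```

That works out to 10 gigaflops per watt before any ARM cores are added to the module's power budget.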

The DSPs run at up to 1.2GHz and have 1MB of their own SRAM Level 2 cache, and the two or four Cortex-A15 ARM cores share a 4MB L2 cache, with each core having 32KB of L1 instruction and 32KB of L1 data cache. The ARM cores run at up to 1.4GHz and have ECC scrubbing on all of their caches, which is important for server workloads; the DSPs have soft error protection only.

The KeyStone II family of ARM Cortex-A15 processors

What is also important is that the KeyStone II processors have an integrated Ethernet switch right there on the chip. Presumably this switch will be able to link SoCs together in a switched fabric as Calxeda has done with its processors.

But it may not have as much oomph, since it is, according to the specs, only a five-port Gigabit Ethernet switch; one port faces the computing elements and four ports face out of the SoC to the outside world.

Hopefully, it is possible in software to create a flat Layer 2 fabric out of multiple SoCs and their inherent Gigabit Ethernet switches to make minimalist, dense-pack clusters. The network accelerator on the KeyStone II chip runs at 1Gb/sec wire speed and can handle 1.5 million packets per second of throughput, which could be very useful for lots of cloudy and hyperscale workloads. ®
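The 1.5 million packets per second figure lines up neatly with what a Gigabit Ethernet link can carry when every frame is the minimum size. The sketch below uses standard Ethernet framing overheads (8 bytes of preamble and start delimiter, 12 bytes of inter-frame gap); whether TI's figure was actually measured with 64-byte frames is our assumption, not something the spec sheet says, and the function name is invented for illustration.

```python
# Rough sketch: theoretical frame rate of an Ethernet link at wire speed.
# Assumes standard Ethernet overheads; max_frames_per_second() is a
# hypothetical helper, not part of any TI software.

PREAMBLE_SFD_BYTES = 8     # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # mandatory idle time between frames


def max_frames_per_second(link_bps: float, frame_bytes: int) -> float:
    """Return the most frames per second a link can carry at wire speed."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_on_wire


GIGABIT = 1_000_000_000

smallest = max_frames_per_second(GIGABIT, 64)    # minimum Ethernet frame
largest = max_frames_per_second(GIGABIT, 1518)   # maximum standard frame

print(f"64-byte frames:   {smallest / 1e6:.2f} million per second")
print(f"1518-byte frames: {largest / 1e3:.1f} thousand per second")
```

With minimum-size frames the ceiling is just under 1.49 million per second, so a 1.5 million packet rating is consistent with one port running flat out on small packets.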
