Penguin Computing muscles into the ARM server fray

Aiming Cortex-A9 clusters at Big Data

Linux cluster supplier Penguin Computing is diving into the low-power ARM microserver racket and has tapped server chip upstart Calxeda – which has just rolled out its multiyear product roadmap for its EnergyCore processors – as its chip and interconnect supplier for its first boxes.

The new machine, called the Ultimate Data X1, is based on the twelve-slot SP12 backplane board created by Calxeda, just like the Viridis server from UK server-maker Boston. The experimental "Redstone" development server from Hewlett-Packard also uses the SP12 backplane board, putting three of them on a full-depth SL6500 server tray and four trays in a 4U chassis for a total of 72 server nodes in a 4U space.

The Calxeda EnergyCard system board puts four quad-core EnergyCore ECX-1000 processors onto a board, plus memory slots and SATA ports, with two PCI-Express connectors linking each pair of sockets into the backplane. Each processor is based on four Cortex-A9 cores, which support 32-bit memory addressing and therefore top out at 4GB of main memory in the single DDR3 slot allocated to each processor socket. Each socket also has four SATA ports for peripherals.
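To see what those per-socket numbers add up to across a fully loaded SP12 backplane, here is a quick back-of-the-envelope tally using only the figures quoted above (the constant names are my own shorthand, not anything from Calxeda's software stack):

# Back-of-the-envelope tally of a fully loaded SP12 backplane, using only
# the figures quoted in the article. Constant names are illustrative.
SOCKETS_PER_CARD = 4       # four ECX-1000 processors per EnergyCard
CORES_PER_SOCKET = 4       # quad-core Cortex-A9
GB_PER_SOCKET = 4          # one DDR3 slot, capped at 4GB by 32-bit addressing
SATA_PER_SOCKET = 4        # four SATA ports per socket
CARDS_PER_BACKPLANE = 12   # twelve-slot SP12 backplane

sockets = CARDS_PER_BACKPLANE * SOCKETS_PER_CARD
print(f"Sockets per SP12:  {sockets}")                       # 48
print(f"Cores per SP12:    {sockets * CORES_PER_SOCKET}")    # 192
print(f"Max memory:        {sockets * GB_PER_SOCKET} GB")    # 192 GB
print(f"SATA ports:        {sockets * SATA_PER_SOCKET}")     # 192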

The interesting bit about the EnergyCore chip is that it includes a distributed Layer 2 switch, which can be used to hook up to 4,096 sockets into a flat cluster using a variety of network configurations, including mesh, fat tree, butterfly tree, and 2D torus interconnections of system nodes. The first-generation fabric switch, which has been rebranded the Fleet Services Fabric Switch as part of the expanded Calxeda roadmap, is an 8x8 crossbar with 80Gb/sec of bandwidth, and it links out to five 10Gb/sec XAUI ports and six 1Gb/sec SGMII ports that are multiplexed with the XAUI ports.

There are three 10Gb/sec channels that come out of each EnergyCore chip that are used to link to the three adjacent sockets on the system board, so they can share data very quickly. The five other ports are used to link sockets on other EnergyCard server boards to each other. Latencies between server nodes vary depending on the network configuration and the number of hops it takes to jump from socket to socket across the cards and backplanes, but working through the Fleet Services distributed network, you can do a node-to-node hop in about 200 nanoseconds, according to Calxeda.

That's better than the performance of most low latency 10 Gigabit Ethernet top-of-rackers aimed at high frequency trading. If that interconnect could do cache coherency across all (or even a large portion) of those 1,024 EnergyCards, we'd be calling this the God Box.
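For a sense of scale, here is a rough sketch of what that roughly 200 nanosecond per-hop figure would mean across a large fabric, assuming a square 2D torus layout and a uniform per-hop cost; both are simplifying assumptions of mine, not Calxeda's published topology figures:

import math

# Rough latency sketch: assume a square 2D torus and a flat ~200ns per hop.
HOP_NS = 200       # per-hop latency quoted by Calxeda
NODES = 4096       # maximum sockets the fabric can address

side = math.isqrt(NODES)            # 64 x 64 torus
worst_hops = 2 * (side // 2)        # farthest node: half the ring in each dimension
avg_hops = 2 * (side / 4)           # mean ring distance is roughly side/4 per dimension

print(f"{side}x{side} torus, worst case: {worst_hops} hops ~= {worst_hops * HOP_NS / 1000:.1f} microseconds")
print(f"{side}x{side} torus, average:    {avg_hops:.0f} hops ~= {avg_hops * HOP_NS / 1000:.1f} microseconds")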

Boston's Viridis machine puts one SP12 backplane card, a dozen EnergyCards, and two dozen 2.5-inch drives into a 2U chassis. But Penguin Computing, seeking to use cheaper and fatter 3.5-inch SATA drives, has opted for a 4U chassis for the UDX1 that houses three dozen 2TB fatties as well as a single SP12 backplane board and a dozen EnergyCards. There are 24 drives across the front of the unit and another 12 drives buried inside the unit, giving a higher disk-to-core ratio than the Viridis box.
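Here is a quick comparison of the disk-to-core ratios implied by those two configurations, a sketch that assumes a fully populated SP12 backplane in both boxes:

# Disk-to-core ratios implied by the configurations above; both boxes carry
# one SP12 backplane, so 12 EnergyCards x 4 sockets x 4 cores = 192 cores.
CORES = 12 * 4 * 4

systems = {
    "Boston Viridis (2U)": {"drives": 24, "rack_units": 2},
    "Penguin UDX1 (4U)":   {"drives": 36, "rack_units": 4},
}

for name, cfg in systems.items():
    print(f"{name}: {cfg['drives']} drives, "
          f"{cfg['drives'] / CORES:.3f} drives per core, "
          f"{cfg['drives'] / cfg['rack_units']:.0f} drives per U")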

Penguin Computing's UDX1 ARM server

Loaded up, the UDX1 machine will run around $35,000, with variation depending on CPU speed, memory capacity, and disk options. Penguin Computing says that depending on how modern your x86 cluster is, this 4U chassis can replace anywhere from a quarter to a half rack of x86 iron and switches.
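Dividing that rough price by the node and core counts gives a ballpark figure; this is my arithmetic on the article's estimate, not a quote from Penguin Computing:

# Ballpark cost per node and per core from the rough $35,000 system price.
PRICE_USD = 35_000
NODES = 12 * 4          # 12 EnergyCards x 4 sockets
CORES = NODES * 4       # quad-core per socket

print(f"~${PRICE_USD / NODES:,.0f} per server node")   # ~$729
print(f"~${PRICE_USD / CORES:,.0f} per core")          # ~$182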

Penguin Computing will be showing off the UDX1 system at Strata Hadoop World next week in New York, apparently with much-awaited benchmarks for Hadoop Big Data munching. It is not clear how the balance of CPU cores, memory, and disk drive spindles will play out on ARM servers, but it could turn out that the number of spindles does not need to be as large per socket as on x86-based machines.

In general, the number of drives per socket has been going up with the number of cores per socket on x86-based machines, and generally speaking, Hadoop machinery likes to have one disk per processor core, and you tend to use the fattest disk you can afford. The speed of the drive and the speed of the core take a back seat to capacity, given the volumes of data that most Hadoop clusters are wrestling with.

In general, Hadoop clusters are not network I/O or CPU bound, like many traditional supercomputer workloads, but rather disk-bound since you can only make a disk drive spin so fast or hold so much data. (The fatter drives move slower, so there are tradeoffs here, too.) But it will be interesting to see what a balanced Calxeda server might look like.

Based on the Penguin setup, which has 192 cores and 36 drives, it looks like the machine is a little bit light on disks. The wonder is why Penguin Computing didn't build a 5U chassis with 48 drives, one for each EnergyCore socket in the box, and I will ask the company about that when I see the demo next week. The answer might be that you only use eight EnergyCards in the box as Hadoop compute nodes and use the remaining four cards as NameNodes and other management nodes in the cluster, giving you a better socket-to-disk balance.
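Running the numbers on that guess (the eight-compute-card split is this article's speculation, not a published Penguin Computing configuration):

# Spindle balance under the speculated split of 8 compute cards + 4 management cards.
DRIVES = 36
SOCKETS_PER_CARD = 4
CORES_PER_SOCKET = 4

all_cores = 12 * SOCKETS_PER_CARD * CORES_PER_SOCKET      # 192 cores in the box
print(f"One disk per core would call for {all_cores} drives; the box has {DRIVES}.")

compute_sockets = 8 * SOCKETS_PER_CARD                    # 32 compute sockets
print(f"With 8 compute cards: {DRIVES / compute_sockets:.2f} drives per compute socket.")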

The UDX1 machines might be configured to be suitable not just for Hadoop, but for other Big Data workloads like risk analysis, genomics, and seismic processing, where computing oomph is important but so are fast networking and flat networks.

It will be interesting to see what the power draw, performance, and cost of the UDX1 are when running various workloads, and how that compares to Xeon and Opteron machinery configured with 10GE ports and switches. ®
