Facebook puts some brains in Open Vault JBOD storage

ARM or Atom, pick your embedded CPU and interconnect poison

Open Compute 2013 At last week's Open Compute Summit 2013, the people behind the open source hardware project showed off some enhancements for the Open Vault JBOD storage array that Facebook has cooked up for its own use in its two newest data centers, and which will presumably be added to its existing data center.

The Open Vault array, known by its code-name "Knox," has been contributed by the social network to the Open Compute Project open source hardware design effort. Among other things, Open Vault is used for cold storage of the 240 billion photos on the site, which is growing at a rate of 350 million per day, or 7PB per month. (You can see the Open Vault specs here.)
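As a rough sanity check on those growth figures, you can work out the storage footprint they imply per photo. This is only a back-of-the-envelope sketch: it assumes a 30-day month and decimal petabytes (10^15 bytes), neither of which is specified above, and the result may well cover multiple stored sizes or copies of each image.

```python
# Figures quoted above: 350 million photos per day, 7PB per month.
PHOTOS_PER_DAY = 350_000_000
BYTES_PER_MONTH = 7 * 10**15   # assumption: decimal petabytes
DAYS_PER_MONTH = 30            # assumption: 30-day month


def implied_bytes_per_photo(photos_per_day, bytes_per_month, days_per_month):
    """Average bytes consumed per new photo, given the growth rates."""
    photos_per_month = photos_per_day * days_per_month
    return bytes_per_month / photos_per_month


avg = implied_bytes_per_photo(PHOTOS_PER_DAY, BYTES_PER_MONTH, DAYS_PER_MONTH)
print(f"roughly {avg / 1000:.0f} KB per photo")  # roughly 667 KB per photo
```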

Open Vault is a JBOD array, which means it is just a bunch of disks and is intended to hang off a SAS controller inside of a server. In Facebook's case, that is a custom Open Compute V2 server using Intel's custom "Windmill" two-socket Xeon E5 server node.

The Open Vault array has two 1U disk trays, each of which holds fifteen 3.5-inch SAS drives and two SAS expander boards. The four SAS expander boards feed back to the server and make all 30 drives in the Open Vault look like they are connected directly to the server. Open Vault is designed so any disk or any one SAS expander can be changed without having to take the JBOD offline.
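The layout and the swap-without-downtime property can be modeled in a few lines. The sketch below is a toy model, not anything from the Open Vault spec: the names are made up for illustration, and it assumes every drive in a tray is wired to both of that tray's expanders, which is what would let one expander be pulled without losing drives.

```python
from dataclasses import dataclass

TRAYS = 2
DRIVES_PER_TRAY = 15
EXPANDERS_PER_TRAY = 2


@dataclass
class Tray:
    drives: list     # hypothetical drive labels
    expanders: set   # hypothetical expander labels


def build_open_vault():
    """Build a toy model of one Open Vault: two trays of drives and expanders."""
    trays = []
    for t in range(TRAYS):
        drives = [f"tray{t}-disk{d}" for d in range(DRIVES_PER_TRAY)]
        expanders = {f"tray{t}-exp{e}" for e in range(EXPANDERS_PER_TRAY)}
        trays.append(Tray(drives, expanders))
    return trays


def reachable_drives(trays, failed_expanders=frozenset()):
    """Return drives that still have at least one working expander path.

    Assumption: each drive is connected to both expanders in its tray.
    """
    reachable = []
    for tray in trays:
        if tray.expanders - set(failed_expanders):
            reachable.extend(tray.drives)
    return reachable


vault = build_open_vault()
print(len(reachable_drives(vault)))                                # 30
print(len(reachable_drives(vault, {"tray0-exp0"})))                # 30
print(len(reachable_drives(vault, {"tray0-exp0", "tray0-exp1"})))  # 15
```

Under that wiring assumption, losing one expander leaves all 30 drives visible; only losing both expanders in the same tray takes that tray's 15 drives away.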

While Open Vault is great for what it does, it lacks brains. And so, through an extension of the Open Vault called "Knockout," ARM server chip upstart Calxeda and Intel are both working on variants that put some brains and internetworking into each JBOD to turn it into a smarter storage cluster.

Frank Frankovsky, vice president of hardware design and supply chain at Facebook and also chairman of the Open Compute effort, showed off two compute boards that slide into the Open Vault array where the SAS expanders currently fit and give it a bit of brains, like the Scarecrow in the Wizard of Oz.

The first one that Frankovsky showed off – and one that is close to being in production – is an ARM-based compute add-on card that is based on the 32-bit ECX-1000 processor from Calxeda:

A Calxeda ARM server node for the Open Vault JBOD

The idea, Gina Longoria, product marketing manager at Calxeda, explains to El Reg, is to let companies deploying Open Vault storage possibly run Lustre or Gluster clustered file system code inside each Open Vault tray, and maybe use an x86 node in the rack only as a head node.

The additional computing power could also be used to run other storage software, such as the Ceph distributed object store that is closely affiliated with OpenStack, or even the Cassandra NoSQL data store that was created by Facebook when it ran up against the limits of MySQL relational databases.

The precise software that can be run on an intelligent storage server is not the point. Giving Open Vault some cheap yet powerful brains is the point.

Intel wants a piece of this action, too, and is a good buddy of the Open Compute Project as well, and thus Frankovsky was careful to hold up a similar brain transplant card based on Intel's forthcoming "Avoton" Atom S Series processor, which is expected to have on-chip Ethernet links:

An Intel "Avoton" Atom server node for Open Vault storage

The feeds and speeds of the Intel board were not divulged, but Calxeda was happy to talk about different configurations of its Knockout compute and networking cards for the Open Vault JBOD.

The Calxeda board has a single ECX-1000 processor with four Cortex-A9 cores running at 1.4GHz and 4GB of DDR3 main memory running at 1.33GHz. The board can have two RJ45 ports running at gigabit speeds and five SATA port multipliers, supporting all drives in a single Open Vault tray - you put one in each tray.

The card can be equipped to run software RAID or to run iSCSI target software, mapping from an x86 head node at the top of the rack. You can also have SFP+ or QSFP ports put on this card if you want to spend a little more money.

The current "Knox" Open Vault and its computationally enhanced "Knockout" derivative

Or, if you want to go with the cheaper and better option, you could use CX4 connectors and use the on-chip distributed Layer 2 network on the ECX-1000 chips to be real clever. First, you could put a 24-port Gigabit Ethernet switch between the Windmill head node and the computationally enhanced Open Vault JBODs.

This switch would link the JBODs to other Windmill head nodes for redundancy, eliminating a single point of failure in the rack. Then you could add data compression, hashing, or other algorithms on the local ARM nodes, or Atom-based nodes, too.
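To make the "compression, hashing" idea concrete, here is a minimal sketch of the kind of per-chunk work a storage-side node could take on. It is an illustration only: the function names and record layout are invented for this example, and nothing here reflects software that Calxeda, Intel, or Facebook actually described. It uses only Python's standard `hashlib` and `zlib` modules.

```python
import hashlib
import zlib


def prepare_chunk(data: bytes, level: int = 6) -> dict:
    """Hash and compress a chunk before it is written to disk."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),  # integrity check value
        "raw_size": len(data),
        "payload": zlib.compress(data, level),
    }


def restore_chunk(record: dict) -> bytes:
    """Decompress a stored chunk and verify it against its recorded hash."""
    data = zlib.decompress(record["payload"])
    if hashlib.sha256(data).hexdigest() != record["sha256"]:
        raise ValueError("checksum mismatch: chunk is corrupt")
    return data


chunk = b"photo metadata " * 1000
record = prepare_chunk(chunk)
print(record["raw_size"], len(record["payload"]))  # raw vs compressed size
assert restore_chunk(record) == chunk
```

Doing this work next to the disks means the head node only ever ships and receives already-processed data, which is the sort of offload the add-on cards are pitched at.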

By tucking the ECX-1000 server nodes into the Open Vault JBODs, however, you can do one other thing: cross-couple the arrays and their compute nodes across racks. Here's one example:

How you might use the interconnect on ECX-1000s to do a 2D torus between storage JBODs

With the variant of the Knockout server board that has four 10GE ports coming off the ECX-1000 chip, you can turn on the integrated Fleet Services fabric. The top-of-rack switch then handles north-south traffic out of the array and to the network, feeding applications, while the Fleet Services interconnect provides data replication and other services on an east-west network that spans multiple racks.
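For readers unfamiliar with the 2D torus topology named in the diagram caption, the sketch below shows how neighbors are worked out on a wraparound grid. The grid size and coordinate scheme are assumptions for illustration; how the nodes are actually cabled is not spelled out here. Each node in a 2D torus has exactly four links, which lines up neatly with a board that has four ports.

```python
def torus_neighbors(row, col, rows, cols):
    """Return the four neighbors of (row, col) in a rows x cols 2D torus.

    Edges wrap around, so nodes on the border link to the opposite side.
    """
    return {
        "up": ((row - 1) % rows, col),
        "down": ((row + 1) % rows, col),
        "left": (row, (col - 1) % cols),
        "right": (row, (col + 1) % cols),
    }


# Hypothetical 4 x 4 grid of storage nodes; the corner node wraps both ways.
corner = torus_neighbors(0, 0, rows=4, cols=4)
print(corner["up"])     # (3, 0)
print(corner["down"])   # (1, 0)
print(corner["left"])   # (0, 3)
print(corner["right"])  # (0, 1)
```

The wraparound is what distinguishes a torus from a plain mesh: no node sits on an edge, so every node has the same number of neighbors for replication traffic.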

All of this can happen under the covers of the Open Vault and behind the scenes where the head node in a storage cluster is blissfully unaware. This would also mean that the x86 node could potentially be quite a bit less powerful in terms of memory and CPU capacity, and in fact, you could have an array of ARM servers acting as the head node if you wanted, according to Longoria.

This is, of course, the option that Calxeda is excited about. ®
