Numascale brings big iron SMP to the masses

The NUMA NUMA song

SC10 If you want big server iron but have a midrange server budget, Numascale has an adapter card that it wants to sell to you. The NumaConnect SMP card turns a cluster of Opteron servers into a shared memory system and, in the not-too-distant future, will probably do the same for Xeon-based machines.

The clustering of cheap server nodes to make fatter shared memory systems is one of those recurring themes in the systems racket. IBM's Power Systems and high-end System x cards used Sequent-derived chipsets to turn up to four physical servers into one shared memory machine. All of the new four-socket and eight-socket boxes based on Xeon 7500 processors from Intel do the same trick, which uses non-uniform memory access (NUMA) clustering to give multiple server cards in a single system a shared memory space.

There are many other designs that make use of NUMA or NUMA-like technologies to lash cheap nodes together to make a single address space for applications to frolic in. The problem is, big iron with a shared memory space is expensive while distributed clusters are cheap, even if applications have to be parallelized to run on the latter machines. Ideally, you would be able to take cheaper clusters and make them look like expensive shared memory systems without actually doing much work.

This is an idea that researchers at the University of Oslo and a spinoff, Dolphin Interconnect Solutions, have been chasing for two decades. The university researchers in Norway did a lot of the work to help forge a standard called the Scalable Coherent Interface, or SCI, that was supposed to be a high-speed, point-to-point interconnect for all components in a system.

Data General and Sun Microsystems used Dolphin SCI chips in some of their systems back in the day. 3Leaf Systems had a similar ASIC and interconnect, but according to rumors going around SC10 last week, the company quietly went out of business earlier this year. (No one is answering the phones at 3Leaf Systems, so we can't confirm this.) Dolphin still sells SCI-based embedded systems for military and industrial applications and is hoping to take the technology to a broader HPC and enterprise market.

But to go after the modern commercial server racket, Dolphin spun out a new company called Numascale in 2008 and has put the finishing touches on a single-chip implementation of its cache-coherent NUMA technology. With the NumaConnect SMP adapter card, which plugs into the HTX expansion slot of Opteron-based machines, a group of Opteron-based servers is converted into a NUMA cluster. According to Einar Rustad, co-founder of Numascale and vice president of business development at the company, the SCI interconnect inside the NumaConnect SMP adapter runs at 20 Gb/sec, which is half the rate of QDR InfiniBand and twice that of 10 Gigabit Ethernet, obviously.

That's not what matters so much when it comes to NUMA clustering. The latency hopping from node to node in a shared-memory system using the NumaConnect SMP cards is somewhere between 1 and 1.5 microseconds, which is low enough that with proper caching a cluster of server nodes can be made to look like one giant SMP box, such as a high-end mainframe, a RISC/Unix box, or an x64 box using a Xeon 7500 and Intel's "Boxboro" 7500 chipset. The thing is, Numascale is letting you create a big bad box out of cheaper server nodes. And, because the electronics behind the Dolphin technology have been shrunk down to a single chip, it is a lot cheaper to make and therefore to sell.
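To see why caching matters more than raw link speed here, a back-of-the-envelope sketch helps. The function below is only an illustration: the weighted-average model, the function name `average_latency_ns`, and any local-memory latency or hit rate you feed it are assumptions, not Numascale figures. Only the 1 to 1.5 microsecond node-to-node hop comes from the numbers quoted above.

```python
def average_latency_ns(hit_rate, local_ns, remote_ns):
    """Weighted average of local-hit and remote-miss latency, in nanoseconds.

    hit_rate  -- fraction of accesses served locally (assumed, 0.0 to 1.0)
    local_ns  -- latency of a local access (assumed value, not from the article)
    remote_ns -- latency of a node-to-node hop (1000 to 1500 ns per the article)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * local_ns + (1.0 - hit_rate) * remote_ns


if __name__ == "__main__":
    # Hypothetical local latency of 100 ns; remote hop at the slow end, 1500 ns.
    for hit_rate in (0.5, 0.9, 0.99):
        print(hit_rate, average_latency_ns(hit_rate, 100.0, 1500.0))
```

The point of the sketch is the shape of the curve rather than any particular number: the higher the share of accesses that never leave the node, the less the microsecond-scale hop shows up in the average.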

The NumaChip implements NUMA clustering using a directory-based cache coherence protocol with a write-back cache and a tag memory cache. The write-back cache keeps data pulled from adjacent server nodes around as it is used, so the next time a node asks for it, the request doesn't have to go any further than the NumaConnect card. The tag memory is what is used to create the single, global address space that all of the other server nodes see when they are linked to each other. You have to match the tag memory to the capacity of the memory on the Opteron server node.
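For readers who have not met directory-based coherence before, here is a toy model of the general idea. It is not the NumaChip protocol: the class names `Directory` and `Adapter`, the single shared directory, and the invalidate-on-write policy are all assumptions made for illustration. It shows a directory tracking which adapters hold a line, and a write-back cache on each adapter that serves repeat reads without going off the card.

```python
class Directory:
    """Toy directory: home memory plus a record of who caches each line."""

    def __init__(self):
        self.memory = {}    # address -> value held in home memory
        self.holders = {}   # address -> set of Adapter objects caching it

    def fetch(self, addr, requester):
        # Any holder with a modified copy writes it back before we answer.
        for holder in self.holders.get(addr, set()):
            holder.write_back(addr)
        self.holders.setdefault(addr, set()).add(requester)
        return self.memory.get(addr, 0)

    def invalidate_others(self, addr, writer):
        # Before a write, every other cached copy is dropped.
        for holder in self.holders.get(addr, set()) - {writer}:
            holder.drop(addr)
        self.holders[addr] = {writer}


class Adapter:
    """Toy per-node adapter with a write-back cache (hypothetical model)."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory
        self.cache = {}           # address -> (value, dirty flag)
        self.remote_fetches = 0   # how often a read had to leave the card

    def read(self, addr):
        if addr in self.cache:    # hit: the request stops at the adapter
            return self.cache[addr][0]
        self.remote_fetches += 1
        value = self.directory.fetch(addr, self)
        self.cache[addr] = (value, False)
        return value

    def write(self, addr, value):
        self.directory.invalidate_others(addr, self)
        self.cache[addr] = (value, True)   # dirty until written back

    def write_back(self, addr):
        value, dirty = self.cache.get(addr, (None, False))
        if dirty:
            self.directory.memory[addr] = value
            self.cache[addr] = (value, False)

    def drop(self, addr):
        self.write_back(addr)
        self.cache.pop(addr, None)
```

In this sketch a second read of the same address by the same adapter never increments `remote_fetches`, which is the behaviour the write-back cache is there to provide. A write by another node invalidates the cached copy, so the next read has to go remote again.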
