Numascale brings big iron SMP to the masses

The NUMA NUMA song

SC10 If you want big server iron but you have midrange server budgets, Numascale has an adapter card that it wants to sell to you. The NumaConnect SMP card turns a cluster of Opteron servers into a shared memory system, and in the not-too-distant future, probably Xeon-based machines, too.

The clustering of cheap server nodes to make fatter shared memory systems is one of those recurring themes in the systems racket. IBM's Power Systems and high-end System x cards used Sequent-derived chipsets to turn up to four physical servers into one shared memory machine. All of the new four-socket and eight-socket boxes based on Xeon 7500 processors from Intel do the same trick, which uses non-uniform memory access (NUMA) clustering to give multiple server cards in a single system a shared memory space.

There are many other designs that make use of NUMA or NUMA-like technologies to lash cheap nodes together to make a single address space for applications to frolic in. The problem is, big iron with a shared memory space is expensive while distributed clusters are cheap, even if applications have to be parallelized to run on the latter machines. Ideally, you would be able to take cheaper clusters and make them look like expensive shared memory systems without actually doing much work.

This is an idea that researchers at the University of Oslo and a spinoff, Dolphin Interconnect Solutions, have been chasing for two decades. The university researchers in Norway did a lot of the work to help forge a standard called the Scalable Coherent Interface, or SCI, that was supposed to be a high-speed, point-to-point interconnect for all components in a system.

Data General and Sun Microsystems used Dolphin SCI chips in some of their systems back in the day, and 3Leaf Systems had a similar ASIC and interconnect, but according to rumors going around SC10 last week the company quietly went out of business earlier this year. (No one is answering the phones at 3Leaf Systems, so we can't confirm this.) Dolphin still sells SCI-based embedded systems for military and industrial systems and is hoping to take the technology to a broader HPC and enterprise market.

But to go after the modern commercial server racket, Dolphin spun out a new company called Numascale in 2008 and has put the finishing touches on a single-chip implementation of its cache-coherent NUMA technology. With the NumaConnect SMP adapter card, which plugs into the HTX expansion slot of Opteron-based machines, a group of Opteron-based servers is converted into a NUMA cluster. According to Einar Rustad, co-founder of Numascale and vice president of business development at the company, the SCI interconnect inside the NumaConnect SMP adapter runs at 20 Gb/sec, which is half the rate of QDR InfiniBand and twice that of 10 Gigabit Ethernet, obviously.

That's not what matters so much when it comes to NUMA clustering. The latency hopping from node to node in a shared-memory system using the NumaConnect SMP cards is somewhere between 1 and 1.5 microseconds, which is low enough that with proper caching a cluster of server nodes can be made to look like one giant SMP box like a high-end mainframe, RISC/Unix box, or x64 box using a Xeon 7500 and Intel's "Boxboro" 7500 chipset. The thing is, Numascale is letting you create a big bad box out of cheaper server nodes. And, because the electronics behind the Dolphin technology has been shrunk down to a single chip, it is a lot cheaper to make and therefore to sell.

The NumaChip implements NUMA clustering using a directory-based cache coherence protocol with a write-back cache and a tag memory cache. The write-back cache keeps data pulled from adjacent server nodes around as it is used so the next time a node asks for it, the request doesn't have to go any further than the NumaConnect card. The tag memory is what is used to create the single, global address space that all of the other server nodes see when they are linked to each other. You have to match the server tag memory to the capacity of the memory on the Opteron server node.
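To make the write-back cache idea concrete, here is a toy Python sketch. It is not Numascale's design: the class name `ToyFabric`, the 64-byte line size, and the methods are all assumptions made up for illustration. It models only two things described above: a global address map laid over each node's local memory, and a per-node cache that holds remote lines so that repeat requests go no further than the local adapter. The directory-based coherence protocol itself is deliberately left out.

```python
LINE = 64  # assumed cache-line size in bytes (illustrative, not from the article)


class ToyFabric:
    """Toy model: a global address space over several nodes, plus a
    per-node write-back cache for lines that live on other nodes.
    No coherence protocol is modeled here."""

    def __init__(self, mem_sizes):
        # Lay each node's memory end to end to form one global address space.
        self.sizes = list(mem_sizes)
        self.bases = []
        base = 0
        for size in self.sizes:
            self.bases.append(base)
            base += size
        self.memory = [dict() for _ in self.sizes]  # per node: line -> value
        self.cache = [dict() for _ in self.sizes]   # per node: line -> [value, dirty]
        self.remote_fetches = 0                     # trips across the interconnect

    def home(self, addr):
        """Return the node whose local memory backs this global address."""
        for node, (base, size) in enumerate(zip(self.bases, self.sizes)):
            if base <= addr < base + size:
                return node
        raise ValueError(f"address {addr} is outside the global address space")

    def read(self, node, addr):
        line = addr - addr % LINE
        home = self.home(line)
        if home == node:
            return self.memory[node].get(line, 0)
        entry = self.cache[node].get(line)
        if entry is None:
            # Miss: fetch the line from its home node and keep a copy.
            self.remote_fetches += 1
            entry = [self.memory[home].get(line, 0), False]
            self.cache[node][line] = entry
        return entry[0]

    def write(self, node, addr, value):
        line = addr - addr % LINE
        if self.home(line) == node:
            self.memory[node][line] = value
        else:
            # Write-back: update the cached copy only and mark it dirty.
            self.cache[node][line] = [value, True]

    def flush(self, node):
        """Write every dirty cached line back to its home node."""
        for line, entry in self.cache[node].items():
            if entry[1]:
                self.memory[self.home(line)][line] = entry[0]
                entry[1] = False
```

In this sketch, a second read of the same remote line by the same node does not increment `remote_fetches`, which is the effect the write-back cache is after. It also shows why a coherence protocol is needed: until `flush` is called, the home node still sees the old value of a line that another node has written.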
