COMA chameleons: The Reg goes inside Sun's Serengeti

So it's COMA vs ZzzzSeries. But which one is cash coherent?


Exclusive Sources familiar with Sun's Serengeti servers have given The Register the lowdown on the architecture of its next-generation big iron servers.

Sun has been widely tipped to adopt a NUMA-like architecture for its high-end servers, given that the scalability of pure SMP implementations is, um, contentious above 64-CPU systems. The SMP programming model has been Sun's bedrock for a decade, and given the slow handclap that greeted the first, much-hyped cache-coherent NUMA systems three or four years ago, it has evidently been working hard to find a model that scales well but doesn't bear the enduring NUMA stigma.

And so it's turned to an academically proven but as yet commercially unsuccessful variation on distributed shared memory: Simple COMA, with COMA standing for cache-only memory architecture. There's COMA and there's Simple COMA, but the way we do it, says Sun with some justification, it ain't NUMA. And how it's squared the circle we can reveal for the first time.

"It's rather COMAish than NUMAish" confirms our mole. "It's a simple COMA implementation with some delicate OS extensions, like Coherent Memory Replication and Affinity Scheduling."

Before charging off into the gory details of Serengeti, S-COMA as we understand it goes something like this. NUMA systems - collections of SMPs - are dogged by latencies when data can't be found in a local cache and has to be fetched from remote memory. So it's over the interconnect you go, sonny, and don't forget to negotiate timings when you get back. And snoopy buses, which are supposed to police this traffic in SMPs, are getting fiendishly complicated in their own right.

So COMAs try to solve this problem by treating a cache miss rather like a virtual memory page fault, by reserving memory "pages" on the local machine. This involves an overhead on the first cache miss, but thereafter, the local page becomes a proxy for the remote cache, without the overhead. It's what S-COMA mavens refer to as "attraction memory".
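
To make the idea concrete, here is a toy sketch of the "attraction memory" behaviour described above. It is our own illustration, not anything from Sun: the Node class, the page size and the list standing in for remote memory are all assumptions, and it ignores coherence between nodes entirely.

```python
class Node:
    """Hypothetical node with a local 'attraction memory' of remote pages."""

    def __init__(self, remote_memory, page_size=4):
        self.remote_memory = remote_memory  # stands in for memory on other nodes
        self.page_size = page_size          # assumed page size, in words
        self.attraction_memory = {}         # page number -> local copy of that page
        self.remote_fetches = 0             # how often we crossed the interconnect

    def read(self, address):
        page_no, offset = divmod(address, self.page_size)
        if page_no not in self.attraction_memory:
            # First miss on this page: handled like a page fault.
            # Reserve a local page and pull the contents over once.
            start = page_no * self.page_size
            self.attraction_memory[page_no] = self.remote_memory[start:start + self.page_size]
            self.remote_fetches += 1
        # Every later access to the page is served from local memory.
        return self.attraction_memory[page_no][offset]


remote = list(range(100, 116))   # 16 words of pretend remote memory
node = Node(remote)
values = [node.read(a) for a in (5, 6, 7, 5)]   # all four addresses sit in page 1
```

Four reads, one trip over the interconnect: the overhead is paid on the first miss only. A real design also has to keep those local copies coherent, which this sketch makes no attempt to do.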

Clues emerged this spring when UltraSPARC III's architecture was published, revealing unfeasibly large cache lines which could optimize a NUMA- or COMA-like design. But that's by the by, say our sources, as Sun's Serengeti moves almost all of the graft into software.

"Basically everything is in software. The COMA controller is intelligent, but above the physical transfer and cache consistency management, it really doesn't do too much."

InfiniBand wasn't ready in time for Serengeti, and as with HP, IBM and SGI, that doesn't seem to have bothered the designers, according to our source:

"The interconnect uses a dual (one for each direction) fibre with 12 channels (each 1Gbit/sec) multiplexed to a single fibre. It could be used for cascading, load balancing etc. The more interesting things are actually in the OS kernel. Like the Coherent Memory Replication and the Affinity Scheduling.

And these are done in software for a good reason, say our sources: migrating them to hardware would be expensive and pose scalability problems.

"This is ordinary SMP for the regular programmers perspective. The kernel programmer sees it a bit different. OS keeps track of page access patterns and migrates pages coherently between nodes. This is CMR. AS allocates pages and processes to minimize page migration. Which means, it tries to bind processes to those CPU's, which are "near" to the memory, they are using. "Near" in this case is a specialized metric. This is also a good for minimizing cache pollution. Both things are working in an autonomous way. The scalability and performance linearity is more than promising, if both features are turned on."

So there you have it. We note that Sun's project lead is COMA pioneer Erik Hagersten, who created one of the first COMA implementations with his Data Diffusion Machine (DDM) as long ago as 1992. With Sun's hardware team a phone call away, he's had plenty of resources and time - four years, we gather - to make this hum.

Not everyone is convinced by the COMA architecture. We were chatting to Jonathan Eunice of analysts Illuminata who professed himself "incredibly sceptical" about how practical COMA would really be.

"It can work very well if you can partition the application - it can work wonderfully. For example SAP R/3 is incredibly partitionable - but most apps aren't" he noted. And COMA was nearer MPP models than NUMA, really. We'll know next year, when vapour becomes metal, and the benchmarks are published.
In the meantime, bring on the marketdroids... ®

Register bootnote: It hasn't escaped our notice that the COMA machines will be pitched against IBM's former S/390, now renamed the ZzzzSeries e-Server. What subliminal marketing: if you're not asleep now, you soon will be...

Related stories

You read it here first:

Sun's Serengeti brain dead yeti?
Sun steps up to the SAN trough
Sun debuts UltraSPARC III and embraces copper
Sun versus Intel: war declared
