ScaleMP creates $10k four-socket box out of thin air
Well, out of Infiniband and virtualization
Consider this like a Sesame Street episode for Symmetric Multi-Processor servers. "ScaleMP knows big. Now it wants to show you small."
For the last couple of years, ScaleMP has sold software which turns smaller x86 servers into a single, hulking SMP similar to systems more common in the Unix realm. It can lash together boxes with up to 32 sockets and 1TB of memory. But now it wants to do something rather different.
Customers can take a trimmed-down version of ScaleMP's software to link a pair of two-socket servers via InfiniBand, forming a four-socket machine with shared memory.
Why not just buy a proper four-socket server?
ScaleMP is glad you asked.
First off, the two-socket units make up the majority of the x86 server market, meaning that customers benefit from volume price breaks and the attention of server vendors. In addition, processor makers will sometimes throw their latest and greatest technology at the two-socket systems first. Intel, for example, offers much higher front side bus speeds on average with its Xeon chips aimed at two-socket servers than it does with the Xeon MP line for larger systems.
ScaleMP believes that the new vSMP Foundation Standalone software moves the average selling price of a four-socket server to less than $10,000.
The company relies on partners to move its software. VXtech and Flextronics create larger systems that have the cooling and memory requirements best suited to the full-blown ScaleMP vSMP Foundation Embedded software. VXtech, for example, sells the Fusion 1200 system with up to 32 processors.
Dell also resells the VXtech system in Europe, and SGI offers the box as well.
During an interview, Shai Fultheim, CEO at ScaleMP, threatened that the larger server vendors will jump on board with the new software, saying that "two and three letter vendors" may be announced as partners in the coming months. But, you know, we'll believe it when we see it.
The ScaleMP software will look across both two-socket machines and talk the operating system into thinking that it's dealing with a single system that has four sockets. The software does this by having the BIOS boot up and then setting up a virtual machine monitor. Customers do, however, need to have systems with the same amount of memory and equal-speed chips. Otherwise, the ScaleMP code will rely on the box with the fastest chips and turn the leftover processors into memory controllers.
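The matching rule described above can be sketched as a small check. This is only an illustration under stated assumptions: the `Node` and `PairPlan` types, their fields, and `plan_pair` are hypothetical names, not part of ScaleMP's software, and the tie-break used when only the memory sizes differ is a guess, since the article does not cover that case.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """One two-socket server (hypothetical description, not a ScaleMP type)."""
    name: str
    sockets: int
    cpu_ghz: float
    memory_gb: int


@dataclass(frozen=True)
class PairPlan:
    compute: Tuple[str, ...]      # nodes whose processors run workloads
    memory_only: Tuple[str, ...]  # nodes whose processors only serve memory
    compute_sockets: int


def plan_pair(a: Node, b: Node) -> PairPlan:
    """Apply the rule as the article states it: matched boxes pool all
    sockets; otherwise only the box with the fastest chips computes."""
    if a.cpu_ghz == b.cpu_ghz and a.memory_gb == b.memory_gb:
        return PairPlan((a.name, b.name), (), a.sockets + b.sockets)
    # Assumption: on equal clock speeds but unequal memory, keep the first node.
    fastest = a if a.cpu_ghz >= b.cpu_ghz else b
    other = b if fastest is a else a
    return PairPlan((fastest.name,), (other.name,), fastest.sockets)


if __name__ == "__main__":
    fast = Node("box-a", sockets=2, cpu_ghz=3.0, memory_gb=16)
    twin = Node("box-b", sockets=2, cpu_ghz=3.0, memory_gb=16)
    slow = Node("box-c", sockets=2, cpu_ghz=2.66, memory_gb=16)
    print(plan_pair(fast, twin))  # both boxes compute, four sockets visible
    print(plan_pair(fast, slow))  # only box-a computes
```

The point of the sketch is the cost of a mismatch: a mixed pair would leave the operating system with two working sockets instead of four.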
The standalone software starts at $2,750. You'll find more information here on the company's super clumsy web site. Come on, guys, windmills? Really? You sell server software.
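The figures quoted in this story allow a rough budget check. The sketch below assumes that the sub-$10,000 price includes the $2,750 starting price of the software, which the article does not state outright, and it lumps the InfiniBand adapters and cabling in with the servers because no separate prices are given. The function and constant names are made up for illustration.

```python
TARGET_SYSTEM_PRICE_USD = 10_000  # ceiling ScaleMP cites for the four-socket setup
SOFTWARE_PRICE_USD = 2_750        # starting price of the standalone software


def hardware_budget(target=TARGET_SYSTEM_PRICE_USD,
                    software=SOFTWARE_PRICE_USD,
                    servers=2):
    """Return (total hardware budget, budget per server) in USD.

    Assumes the software price counts toward the target price.
    """
    total = target - software
    return total, total / servers


if __name__ == "__main__":
    total, per_server = hardware_budget()
    print(f"Hardware budget: ${total:,} total, ${per_server:,.0f} per server")
```

Under those assumptions, about $7,250 is left for hardware, or roughly $3,625 per two-socket box including its share of the interconnect.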
ScaleMP seems to make some rather impressive code. Companies such as Rackable Systems used to hawk the software with vigor, and obviously Dell and SGI have shown interest. We're still, however, curious about why ScaleMP is so quiet and content to lurk in the background with the likes of VXtech.
Perhaps the two-socket-to-two-socket show will finally boost ScaleMP to the next level. How does the $10k, four-socket magic sound to you? ®
Nothing to do with Virtual Iron, but the key is that because it isn't a standard NUMA architecture such as SGI Altix, the latency question is not relevant. The COMA memory chunk caches gigabytes of data. Predictive algorithms prefetch data blocks so that they are already resident in memory when the computation needs them, hence no IB latency issue. The net result is MPI codes at the same speed as an IB cluster and support for large shared memory jobs, without cluster management complexity.
As far as the IB is concerned, the system uses standard Mellanox IB, including ConnectX if it is shipping yet.
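To illustrate the general idea of prefetching into a local cache (not ScaleMP's actual algorithm, which is not described here), the following toy simulation uses a simple next-block predictor with LRU eviction. The class name, the predictor, and the block numbering are all assumptions made for the example.

```python
from collections import OrderedDict


class PrefetchingCache:
    """Toy LRU block cache that also fetches the next blocks after each access."""

    def __init__(self, capacity, prefetch_depth=1):
        self.capacity = capacity
        self.prefetch_depth = prefetch_depth
        self.blocks = OrderedDict()  # block number -> True, oldest first
        self.hits = 0
        self.misses = 0

    def _install(self, block):
        # Insert (or refresh) a block, evicting the least recently used ones.
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)

    def access(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)
        else:
            self.misses += 1  # would be a trip over the interconnect
            self._install(block)
        # Naive prediction: the following blocks will be wanted soon.
        for ahead in range(1, self.prefetch_depth + 1):
            self._install(block + ahead)


def hit_rate(trace, capacity, prefetch_depth):
    """Fraction of accesses in the trace that were already cached."""
    cache = PrefetchingCache(capacity, prefetch_depth)
    for block in trace:
        cache.access(block)
    return cache.hits / len(trace)


if __name__ == "__main__":
    sequential = list(range(100))
    print("no prefetch:  ", hit_rate(sequential, capacity=8, prefetch_depth=0))
    print("with prefetch:", hit_rate(sequential, capacity=8, prefetch_depth=1))
```

On a purely sequential trace, only the first access misses once prefetching is on. Real workloads are less predictable, so how much latency is actually hidden depends on how well the predictor matches the access pattern.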
The larger, 8-32 socket system uses an internal IB switch to connect servers together.
Best MPI latency is about 1us
The best Mellanox ConnectX MPI latency I have seen is about 1.2 microseconds with a switch in between. The switch adds about 200 nanoseconds to the mix, so back to back (the configuration of the ScaleMP system) would be about 1 microsecond, assuming similar performance to MPI.
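The arithmetic behind that estimate, written out as a sketch. The two input figures are the ones quoted above; the function name is made up, and the result assumes that the switch hop is the only difference between the switched and back-to-back configurations.

```python
SWITCHED_MPI_LATENCY_NS = 1_200  # best ConnectX MPI latency seen through one switch
SWITCH_HOP_NS = 200              # latency attributed to the switch itself


def back_to_back_latency_ns(switched_ns=SWITCHED_MPI_LATENCY_NS,
                            switch_hops=1,
                            hop_ns=SWITCH_HOP_NS):
    """Estimate the direct-link latency by subtracting the switch hops."""
    return switched_ns - switch_hops * hop_ns


if __name__ == "__main__":
    ns = back_to_back_latency_ns()
    print(f"Estimated back-to-back latency: {ns} ns ({ns / 1000} us)")
```

Working in integer nanoseconds avoids floating-point rounding when subtracting 0.2 from 1.2.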
Certainly NUMA memory management in the OS helps, and I assume the COMA aspects are provided by the hypervisor. This could lead to some unhelpful double buffering, unless the OS kernel is aware of the underlying COMA caching.
This system sounds similar to what Virtual Iron was pitching in the past, but abandoned. Is there a relationship between ScaleMP and Virtual Iron?
Also, is this InfiniBand-based approach what is under the covers of the larger ScaleMP systems?
to anonymous coward
Sensible question, but it's a hybrid NUMA/COMA architecture, so the latency is hidden.