Original URL: https://www.theregister.com/2013/10/02/cray_turns_cluster_crank/

Cray turns cluster crank with ScaleMP

New products for the memory-deprived datacenter set

By Dan Olds

Posted in HPC, 2nd October 2013 23:30 GMT

Does your performance in the datacenter suffer because you don’t have enough memory to really get the job done? Do you have apps that don’t perform well on clusters, or don’t parallelize at all? If this describes you or a loved one, read on, because Cray thinks it has the solution for you.

Cray, with partner ScaleMP, recently announced two new systems that aim to cure your memory woes, in distinctly different ways.

CS300 SMP

The CS300 SMP (the suffix stands for 'Shared Memory Parallel') combines Cray compute nodes with ScaleMP's vSMP Foundation virtualization software to produce a system with a large number of cores and a very large shared memory space.

The starter system includes 18 nodes, each sporting dual Intel Xeon E5 10-core 2.8 GHz model 2680 processors for a total of 360 cores. All of these nodes are used in a single system image, sharing a whopping 4.75 TB of memory, courtesy of the pre-integrated vSMP software virtualization suite.
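
That single system image is the whole trick: ordinary shared-memory code needs no changes to see the aggregated RAM. Here's a minimal sketch of the idea in plain C with OpenMP. There is nothing vSMP-specific in it (that's the point), and the size argument and build line are illustrative assumptions, not anything Cray or ScaleMP ships.

/*
 * large_touch.c - generic shared-memory OpenMP code. On a vSMP single
 * system image it can address far more RAM than any one node holds,
 * without any source changes.
 *
 * Build: gcc -fopenmp -O2 large_touch.c -o large_touch
 * Run:   ./large_touch 64        (array size in GB; pick your own)
 */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char **argv)
{
    size_t gb = (argc > 1) ? strtoull(argv[1], NULL, 10) : 1;
    size_t n  = gb * (1ULL << 30) / sizeof(double);

    double *a = malloc(n * sizeof(double));    /* one flat allocation */
    if (!a) { perror("malloc"); return 1; }

    /* Touch every page in parallel. The program never knows (or cares)
     * which physical node any given page actually lives on. */
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; i++)
        a[i] = (double)i;

    printf("initialized %zu GB across %d threads\n",
           gb, omp_get_max_threads());
    free(a);
    return 0;
}

Run on a vanilla box, it initializes whatever fits in local RAM; pointed at a vSMP single system image, the same binary can ask for more memory than any one node physically holds, with the virtualization layer deciding where each page lands.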

The base system includes 32 TB of internal storage, sits in a single standard rack, and is rated at 8 TFlop/s. With a single-rail FDR InfiniBand interconnect (good for 56 Gb/s), the base model CS300 SMP goes out the door for $287,500. Doubling the interconnect speed to 112 Gb/s adds $35,000, bringing the total to $322,500.

Cray has an upgraded CS300 SMP model that brings the total system up to 36 nodes (720 cores), 8.75 TB of shared system memory, and 48 TB of internal storage for $480,000 with the 56 Gb/s InfiniBand interconnect. The same configuration with the faster 112 Gb/s interconnect retails for $530,000.

But there’s still quite a bit of headroom beyond the 36-node upgrade model. ScaleMP’s vSMP Foundation software will combine the memory from as many as 128 nodes into a single system image. Customers can also configure their own system using either dual-socket or quad-socket nodes, or even a mix of the two.

CS300 LMS

The LMS (Large Memory System) is aimed at customers who need extreme amounts of memory, but only modest numbers of processor cores.

With traditional SMP systems, to get lots of memory you also have to purchase lots of system boards, complete with processors, I/O, and the rest. You have to scale up the whole configuration just to get enough memory to satisfy your needs.

The CS300 LMS, accompanied by ScaleMP’s vSMP Foundation, changes the rules of the game, giving users systems with small core counts, but massive amounts of memory.

For example, the base LMS is a dual-socket, 20-core system (Xeon E5-2690s at 3.0 GHz) with 4.375 TB of usable memory; it costs $182,500. Customers who want an even more extreme memory/CPU mismatch can buy the two-way LMS with a grand total of 8.375 TB of memory for $295,000.

Cray is also offering an entry-level quad-socket LMS option that uses 8-core Xeon E5-4650 processors (at 2.7 GHz) with a total of 4.75 TB or 8.75 TB of memory. The four-way system costs $212,500, or $325,000 for the big-memory edition.
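
It's worth pausing on the memory-per-core arithmetic, because that ratio is the entire point of the LMS. The throwaway sketch below just crunches the figures quoted above; the core counts (two sockets of 10-core chips, four sockets of 8-core chips) come straight from the configurations described in this article.

/* mem_per_core.c - back-of-envelope GB-per-core for the LMS configs
 * described above. The figures are the article's; nothing else is implied.
 * Build: gcc -O2 mem_per_core.c -o mem_per_core
 */
#include <stdio.h>

int main(void)
{
    struct { const char *name; double tb; int cores; } cfg[] = {
        { "2-socket LMS, 4.375 TB", 4.375, 20 },
        { "2-socket LMS, 8.375 TB", 8.375, 20 },
        { "4-socket LMS, 4.75 TB",  4.750, 32 },
        { "4-socket LMS, 8.75 TB",  8.750, 32 },
    };
    for (int i = 0; i < 4; i++)
        printf("%-25s -> %5.0f GB per core\n",
               cfg[i].name, cfg[i].tb * 1024.0 / cfg[i].cores);
    return 0;
}

That works out to somewhere between roughly 150 GB and 430 GB of memory per core, against the handful of GB per core you'd find on a typical cluster node of the day.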

These systems fulfill an important need on both the HPC and enterprise sides of the industry. Very large shared-memory architectures allow users to put entire datasets into a single massive chunk of speedy main memory.

The performance benefits for some workloads can be profound. With shared memory systems, latency is measured in nanoseconds, rather than the microsecond-or-worse latencies seen on clusters and other distributed architectures.
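
If you want to put a number on that for your own hardware, the classic trick is a pointer chase: every load depends on the one before it, so nothing can be pipelined or prefetched and the time per step approximates raw memory latency. The sketch below is a generic micro-benchmark with made-up sizes, not a Cray or ScaleMP tool:

/*
 * latency_probe.c - crude pointer-chase probe of main-memory latency.
 * Each load depends on the previous one, so the measured time per step
 * approximates the latency of a single memory access.
 * Build: gcc -O2 latency_probe.c -o latency_probe
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SLOTS (1 << 24)     /* 16M pointers = 128 MB, bigger than cache */
#define STEPS (1 << 24)

int main(void)
{
    void **ring = malloc(SLOTS * sizeof(void *));
    size_t *perm = malloc(SLOTS * sizeof(size_t));
    if (!ring || !perm) { perror("malloc"); return 1; }

    /* Build one long random cycle so the prefetcher can't guess ahead. */
    for (size_t i = 0; i < SLOTS; i++) perm[i] = i;
    srand(42);
    for (size_t i = SLOTS - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t t = perm[i]; perm[i] = perm[j]; perm[j] = t;
    }
    for (size_t i = 0; i < SLOTS; i++)
        ring[perm[i]] = &ring[perm[(i + 1) % SLOTS]];

    struct timespec t0, t1;
    void **p = &ring[perm[0]];
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < STEPS; i++)
        p = (void **)*p;    /* serialized, dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%.1f ns per access (p=%p)\n", ns / STEPS, (void *)p);
    free(perm);
    free(ring);
    return 0;
}

Local DRAM on hardware of this era comes in around 100 ns per access; run against pages that vSMP has parked on a remote node, the same loop will report a visibly bigger number, which is exactly the locality effect worth understanding before you buy.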

The biggest benefits will be seen on workloads that aren’t easily parallelized and/or those that need to chew through uber-scads of data (a technical term meaning ‘a whole lot’), such as weather forecasting, online transaction processing, or any type of data-centric real-time process.

Huge SMP systems used to dominate both the HPC Top500 list and the enterprise benchmark lists. But they were supplanted by x86 architectures over time, due to cost differences, the rise of Linux as a bona fide HPC operating system, and the higher performance made possible by better parallelization and message-passing schemes.

Big SMP systems are very expensive to design and manufacture: building your own proprietary crossbar interconnect and the associated big-SMP plumbing is a costly proposition. But SMP systems provide performance advantages that are hard to match with distributed systems.

The beauty of using ScaleMP on a cluster is that large shared-memory, single-OS systems can be built easily and then reconfigured at will. When you need a hella large SMP box, you can build one with as many as 128 nodes and 256 TB of memory. But when your requirements change, you can morph that large system back into 128 individual nodes, or whatever mix of SMP and standalone nodes you need.

While you might be all giddy at the thought of building SMP systems on the fly using small cluster building blocks, you’ll want to do some testing before you jump in. Some applications work better than others under ScaleMP’s software virtualization. A good first step is downloading the free version of its vSMP Foundation software.