Adaptive Computing speaks better GPU with Moab 6

Torque is cheap

SC10 Supercomputer clusters are getting larger every year, and now they are getting math help from adjunct devices such as GPU co-processors.

Cluster provisioning and management tools therefore have to scale from tens of thousands of cores to hundreds of thousands – without choking on their own communication with cluster nodes. They also have to be aware of co-processors and keep them fed.

Scalability and GPU support are accordingly the key new features in the Moab Cluster Suite 6.0 cluster management tool and its companion Moab Viewpoint 2.0 console.

On the scalability front, the current Moab 5.4 release can handle clusters with 40,000 to 50,000 nodes, but Peter ffolkes, vice president of marketing for Adaptive Computing, says it won't be long before clusters scale up to many more nodes and multiple millions of processor cores, which presents the cluster-management tool with a much larger communication problem. This is why the company gutted the underlying communication system linking the Moab console to the server nodes and their computing resources, making it more efficient and improving the response time of the tool. Specifically, Adaptive Computing has enhanced the multi-threading of the Moab communication stack, which is written in C, so the probes that track the performance of the nodes and cores out in the cluster do not interfere with the scheduler at the heart of the Moab tool. The streamlined node communication protocols run 100 times faster, and the result is that it takes fewer resources to run Moab 6 on a cluster of a given size than it does to run Moab 5.4.

By the way, Adaptive Computing has not had to raise the node-count ceiling above that 50,000-node level because, as ffolkes puts it, "no one is getting anywhere near that yet."

So the scalability improvements with Moab 6.0 are about how the tool feels when it is running on growing clusters. It will be less sluggish in terms of response time, but absolute scaling is the same.
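The idea of keeping node probes from getting in the scheduler's way can be illustrated with a minimal sketch. This is not Moab's code (its communication stack is written in C and its internals are not described here); it is a Python illustration in which every name (`probe_node`, `run_probes`, `drain`, `pick_node`, the `free_cores` field) is hypothetical. Probes run on worker threads and drop results onto a queue, and the scheduler only ever reads a local snapshot built from whatever has arrived.

```python
import queue
import threading


def probe_node(node_id, out):
    """Hypothetical probe: a real one would query the node over the network."""
    out.put((node_id, {"free_cores": 8 - (node_id % 3)}))


def run_probes(node_ids, out):
    """Run one probe per node on worker threads and wait for all of them."""
    workers = [threading.Thread(target=probe_node, args=(n, out)) for n in node_ids]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def drain(out, snapshot):
    """Fold the probe results that have arrived into the snapshot, without blocking."""
    while True:
        try:
            node_id, info = out.get_nowait()
        except queue.Empty:
            return snapshot
        snapshot[node_id] = info


def pick_node(snapshot, cores_needed):
    """Scheduling decision made from the snapshot alone, never from a live probe."""
    fits = [n for n, info in snapshot.items() if info["free_cores"] >= cores_needed]
    return min(fits) if fits else None


if __name__ == "__main__":
    results = queue.Queue()
    run_probes(range(6), results)
    view = drain(results, {})
    print(pick_node(view, cores_needed=8))
```

Because `drain` uses a non-blocking read, a slow or silent node delays only its own status update; the scheduler keeps working from the last snapshot it has.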

The new Moab 6.0 tool also has much-improved GPU management capabilities, which is an increasing requirement at many HPC shops. Moab 5.4 was able to designate an x64 server node in a cluster as one with a GPU, but system designers are getting clever about how they lash GPUs to servers and there is not always a permanent and one-to-one relationship between the CPUs and the GPUs. Some nodes in a cluster have multiple GPUs, and NextIO, Dell, and others are making special outboard GPU enclosures that can be configured to multiple server nodes and changed on the fly.

Obviously, a cluster job scheduler needs to be able not only to see the GPUs, but configure the GPUs to specific nodes and then dispatch work to them. To that end, Moab 6.0 includes the Torque 2.5.4 open source resource manager, which allows Moab 6.0 to gather up detailed information about the GPUs and how they can be configured to servers.
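To make the bookkeeping problem concrete, here is a minimal Python sketch of a GPU pool in which GPUs are not permanently tied to one server. It is an illustration only and does not reflect the Moab or Torque interfaces; the `GpuPool` class, its methods, and the GPU and node names are all hypothetical.

```python
class GpuPool:
    """Tracks which node, if any, each GPU is currently configured to."""

    def __init__(self, gpu_ids):
        # None means the GPU sits unassigned in the enclosure.
        self.owner = {gpu: None for gpu in gpu_ids}

    def attach(self, gpu_id, node):
        """Configure a free GPU to a node; refuse if it is already in use."""
        if self.owner[gpu_id] is not None:
            raise ValueError(f"{gpu_id} is already attached to {self.owner[gpu_id]}")
        self.owner[gpu_id] = node

    def detach(self, gpu_id):
        """Return a GPU to the free pool so it can be reconfigured."""
        self.owner[gpu_id] = None

    def gpus_on(self, node):
        """List the GPUs currently configured to a node."""
        return sorted(g for g, n in self.owner.items() if n == node)

    def nodes_with_at_least(self, count):
        """Find nodes that could accept a job needing `count` GPUs."""
        nodes = {n for n in self.owner.values() if n is not None}
        return sorted(n for n in nodes if len(self.gpus_on(n)) >= count)


if __name__ == "__main__":
    pool = GpuPool(["gpu0", "gpu1", "gpu2"])
    pool.attach("gpu0", "nodeA")
    pool.attach("gpu1", "nodeA")
    pool.detach("gpu1")           # reconfigure on the fly
    pool.attach("gpu1", "nodeB")
    print(pool.gpus_on("nodeA"), pool.gpus_on("nodeB"))
```

The point of the sketch is that a single "this node has a GPU" flag cannot describe multi-GPU nodes or enclosures whose GPUs move between servers; the scheduler needs a per-GPU view.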

Finally, Moab 6.0 includes an updated Viewpoint 2.0 Web-based management console, which brings over more features from the company's prior Moab Access Portal and Control Manager fat-client, Java-based management console. With the Viewpoint 2.0 release, ffolkes says that most of the commands from the old tools are in the new one, plus the additional features to manage GPUs and virtual machines in cloudy infrastructure. The new tool is written in a mix of Java and the Google Web Toolkit (GWT). Among other things, the Viewpoint console can now be used to manage physical and virtual nodes in large-scale HPC or commercial clusters, and to kick off migrations of virtual machines around a cluster or move workloads from one physical server to another.

Moab Cluster Suite 6.0 runs on Linux-based servers, and one machine can manage a cluster with tens of thousands of nodes. For larger installations, you can federate Moab controller servers and carve a cluster up into domains for each Moab machine to manage.

Moab's Adaptive Computing Suite extensions to the core Moab Cluster Suite can manage both Linux and Windows HPC Server 2008 R2 images. Moab Cluster Suite costs under $100 per server socket, with Adaptive Computing Suite costing under $300 per socket, according to ffolkes. He said the Moab stack usually represents somewhere between 3 and 5 per cent of a cluster node cost. ®
