Adaptive Computing speaks better GPU with Moab 6

Torque is cheap

SC10: Supercomputer clusters are getting larger every year, and now they are getting math help from adjunct devices such as GPU co-processors.

Cluster provisioning and management tools therefore have to scale from tens of thousands of cores to hundreds of thousands – without choking on their own communication with cluster nodes. They also have to be aware of co-processors and keep them fed.

Scalability and GPU support are therefore the key new features in the Moab Cluster Suite 6.0 cluster management tool and its companion Moab Viewpoint 2.0 console.

On the scalability front, the current Moab 5.4 release can already scale to clusters of 40,000 to 50,000 nodes, but Peter ffolkes, vice president of marketing at Adaptive Computing, says it won't be long before clusters scale up to many more nodes and multiple millions of processor cores, which presents the cluster-management tool with a much larger communication problem. This is why the company gutted the underlying communication system linking the Moab console to the server nodes and their computing resources, making it more efficient and thereby improving the tool's response time.

Specifically, Adaptive Computing has enhanced the multi-threading of the Moab communication stack, which is written in C, so that the probes tracking the performance of the nodes and cores out in the cluster do not interfere with the scheduler at the heart of the Moab tool. The streamlined node communication protocols run 100 times faster, and the result is that it takes fewer resources to run Moab 6 on a cluster of a given size than it does the current Moab 5.4 release.
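To make the pattern concrete, here is a minimal sketch in C of that kind of decoupling, assuming hypothetical names (node_status_t, poll_node, NODE_COUNT): a probe thread absorbs the slow node communication and publishes results through a mutex-protected snapshot, so the scheduler thread never waits on network I/O. It illustrates the general technique only, not Moab's actual internals.

```c
/* Sketch: keep node probes off the scheduler's thread.
 * Not Moab's code; node_status_t, poll_node(), and NODE_COUNT
 * are hypothetical stand-ins. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NODE_COUNT 8  /* a real cluster would have thousands of nodes */

typedef struct {
    int node_id;
    double load;      /* last reported load on the node */
} node_status_t;

static node_status_t latest[NODE_COUNT];  /* shared snapshot */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Pretend to query a node; in reality this is a network round trip. */
static node_status_t poll_node(int id)
{
    node_status_t s = { .node_id = id, .load = (double)(id % 4) };
    usleep(1000);  /* simulate communication latency */
    return s;
}

/* Probe thread: all slow node communication happens here,
 * never on the scheduler's thread. */
static void *probe_loop(void *arg)
{
    (void)arg;
    for (int pass = 0; pass < 3; pass++) {
        for (int id = 0; id < NODE_COUNT; id++) {
            node_status_t s = poll_node(id);
            pthread_mutex_lock(&lock);
            latest[id] = s;               /* brief critical section */
            pthread_mutex_unlock(&lock);
        }
    }
    return NULL;
}

int main(void)
{
    pthread_t probe;
    pthread_create(&probe, NULL, probe_loop, NULL);

    /* Scheduler side: copies the snapshot without waiting on I/O. */
    for (int tick = 0; tick < 3; tick++) {
        usleep(10000);
        node_status_t view[NODE_COUNT];
        pthread_mutex_lock(&lock);
        memcpy(view, latest, sizeof view);
        pthread_mutex_unlock(&lock);
        printf("tick %d: node 0 load %.1f\n", tick, view[0].load);
    }

    pthread_join(probe, NULL);
    return 0;
}
```

The design choice is the point: all blocking communication lives on the probe thread, and the scheduler only pays for a brief lock to copy the latest snapshot, which is why faster, better-threaded node protocols translate directly into a more responsive scheduler.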

By the way, Adaptive Computing has not had to push node-count scalability beyond that 50,000-node level because, as ffolkes puts it, "no one is getting anywhere near that yet."

So the scalability improvements with Moab 6.0 are about how the tool feels when it is running on growing clusters. It will be less sluggish in terms of response time, but absolute scaling is the same.

The new Moab 6.0 tool also has much-improved GPU management capabilities, an increasingly common requirement at HPC shops. Moab 5.4 could designate an x64 server node in a cluster as one with a GPU, but system designers are getting clever about how they lash GPUs to servers, and there is not always a permanent, one-to-one relationship between the CPUs and the GPUs. Some nodes in a cluster have multiple GPUs, and NextIO, Dell, and others are making special outboard GPU enclosures that can be assigned to multiple server nodes and reconfigured on the fly.

Obviously, a cluster job scheduler needs to be able not only to see the GPUs, but also to assign them to specific nodes and then dispatch work to them. To that end, Moab 6.0 includes the Torque 2.5.4 open source resource manager, which lets Moab gather detailed information about the GPUs and how they can be attached to servers.
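As a rough illustration of what such a device inventory looks like, here is a short C sketch that uses the CUDA runtime API to enumerate the GPUs on a node and report their salient properties. It is only a sketch of the idea; Torque's own probes are more involved, and this is not its code.

```c
/* Sketch of the per-node GPU inventory a resource manager might take.
 * Uses the CUDA runtime API; compile with nvcc. An illustration of
 * the idea, not Torque's actual probe code. */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "no CUDA devices visible: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    /* Report each GPU so a scheduler could match jobs to devices. */
    for (int dev = 0; dev < count; dev++) {
        struct cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("gpu[%d]: %s, %zu MB, compute capability %d.%d\n",
               dev, prop.name,
               prop.totalGlobalMem / (1024 * 1024),
               prop.major, prop.minor);
    }
    return 0;
}
```

With an inventory like that in hand, jobs can ask for GPUs explicitly; Torque 2.5.x accepts GPU counts in the node specification at submission time, along the lines of "qsub -l nodes=1:gpus=2".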

Finally, Moab 6.0 includes an updated Viewpoint 2.0 Web-based management console, which brings over more features from the company's prior Moab Access Portal and Control Manager, a fat-client, Java-based management console. With the Viewpoint 2.0 release, ffolkes says that most of the commands from the old tools are in the new one, plus additional features to manage GPUs and virtual machines in cloudy infrastructure. The new tool is written in a mix of Java and the Google Web Toolkit (GWT). Among other things, the Viewpoint console can now be used to manage physical and virtual nodes in large-scale HPC or commercial clusters, and to kick off migrations of virtual machines around a cluster or the movement of workloads from one physical server to another.

Moab Cluster Suite 6.0 runs on Linux-based servers, and one machine can manage a cluster with tens of thousands of nodes. For larger installations, you can federate Moab controller servers and carve a cluster up into domains for each Moab machine to manage.

Moab's Adaptive Computing Suite extensions to the core Moab Cluster Suite can manage both Linux and Windows HPC Server 2008 R2 images. Moab Cluster Suite costs under $100 per server socket, with Adaptive Computing Suite costing under $300 per socket, according to ffolkes. He said the Moab stack usually represents somewhere between 3 and 5 per cent of the cost of a cluster node. ®
