Platform Computing doubles up cluster management

Bigger grids, same price, and GPUs too

Supercomputer clusters are getting larger and larger, and that is why Platform Computing has to revamp its Load Sharing Facility to version 8 and double up the capacity of the workload scheduling software for grids and clusters. The updated LSF also supports GPU co-processors as full citizens of the cluster.

With LSF 7, Platform Computing could manage a cluster that had 24,000 cores and on the order of 100,000 pending jobs, according to Ken Hertzler, vice president of product management at the grid computing pioneer. With LSF 8, which will start shipping in January 2011, a single instance of the cluster management tool will be able to span a cluster of 48,000 cores and handle 200,000 pending jobs. And if you need to span larger clusters, you can gang up multiple LSF 8 instances to control grids that have 100,000 cores and up to 1.5 million pending jobs.

This may seem like plenty of scalability, but Hertzler says that Platform Computing already has a couple of accounts that have clusters that range from 50,000 to 70,000 cores, so the doubling up of cluster scalability for LSF is not just a matter of providing lots of headroom to most customers. With core counts on the rise in x64 processors from Intel and Advanced Micro Devices to the tune of 30 per cent or so in the coming year and companies simultaneously adding more nodes to clusters, Platform Computing has to broaden its core and pending job counts. In fact, it won't be long before Platform Computing has to jack up the core counts some more.

LSF 8 is more than a tweaked version of the code with twice the cluster scalability, and Hertzler says it is the first major release of the product since LSF 7 shipped four years ago. And now it speaks GPU as well as CPU.

Platform Computing's entry and midrange cluster management tool, Platform HPC 2.1, was announced last week ahead of the SC10 supercomputing conference, and it was the first program put out by the company to be able to directly schedule jobs on GPU co-processors. Now the full-on LSF scheduler, which is the flagship product from Platform Computing, has this capability. With the GPU support in LSF 8, jobs can be dispatched directly to GPUs, and the scheduler has the smarts to see GPU utilization and thermals so it can distribute workloads to avoid creating hot spots in the cluster.
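As a rough illustration of the idea, and not Platform Computing's actual algorithm, the Python sketch below picks a GPU for a job by preferring cooler, less busy devices. The `Gpu` record, the `pick_gpu` function, the 85°C limit, and the equal weighting of utilization and temperature are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gpu:
    host: str           # hypothetical node name
    utilization: float  # fraction busy, 0.0 to 1.0
    temp_c: float       # reported temperature in degrees Celsius


def pick_gpu(gpus: List[Gpu], max_temp_c: float = 85.0) -> Optional[Gpu]:
    """Return the best GPU under the (assumed) thermal limit, or None."""
    candidates = [
        g for g in gpus
        if g.temp_c < max_temp_c and g.utilization < 1.0
    ]
    if not candidates:
        return None
    # Lower score is better: utilization plus temperature normalized
    # against the limit, weighted equally (an arbitrary choice).
    return min(candidates, key=lambda g: g.utilization + g.temp_c / max_temp_c)


gpus = [
    Gpu("node01", utilization=0.90, temp_c=82.0),
    Gpu("node02", utilization=0.20, temp_c=60.0),
    Gpu("node03", utilization=0.10, temp_c=88.0),  # over the limit, skipped
]
chosen = pick_gpu(gpus)
print(chosen.host if chosen else "no GPU available")
```

Here the idle but overheated node03 is passed over in favor of node02, which is the kind of hot-spot avoidance the scheduler is described as doing.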

Whether you use CPUs or a mix of CPUs and GPUs in your workloads (you can't actually run an operating system and applications directly on a GPU - yet), LSF 8 has a number of performance and scalability enhancements that can help boost the utilization on your clusters. An important new feature is called guaranteed resources, which is designed to make sure jobs get the resources they need to meet the service level agreements that people require when they submit jobs. Because resources could not be guaranteed in prior releases, cluster administrators often had to carve their clusters up into silos. Higher priority jobs locked up resources that often just sat there, waiting for their job to start, while lower priority jobs did not finish as quickly as they might have with short-term access to those siloed resources.

With guaranteed resources, which are driven by SLAs set by cluster administrators, the scheduler finds the best way to meet the SLAs without partitioning up the cluster. The scheduler also now has pre-emptive and fair-share scheduling policies, which allow LSF to pre-empt jobs and temporarily take resources from one job to help another meet its SLA, while still allowing the pre-empted job to meet its own. Basically, the software lets a bunch of small jobs say: "Hold on a minute until I finish and then you can have a lot more CPUs, big job."
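To make the contrast with silos concrete, here is a minimal Python sketch of one way guaranteed resources could work: idle guaranteed cores are lent out, and reclaimed by pre-emption only when the owning service class needs them. This is an illustration under stated assumptions, not LSF's implementation or configuration syntax. The `Job` and `Cluster` classes, the "gold" and "bronze" service classes, and the reclaim order are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    name: str
    sla: str    # hypothetical service-class name
    slots: int  # cores the job needs


@dataclass
class Cluster:
    total_slots: int
    guarantees: Dict[str, int]  # cores promised to each service class
    running: List[Job] = field(default_factory=list)

    def used(self, sla: str) -> int:
        return sum(j.slots for j in self.running if j.sla == sla)

    def free(self) -> int:
        return self.total_slots - sum(j.slots for j in self.running)

    def submit(self, job: Job) -> List[Job]:
        """Try to start job; return the jobs pre-empted to make room."""
        need = job.slots - self.free()
        victims: List[Job] = []
        # Only a job that stays inside its class guarantee may reclaim cores.
        within = self.used(job.sla) + job.slots <= self.guarantees.get(job.sla, 0)
        if need > 0 and within:
            # How far each class is running beyond its own guarantee.
            over = {s: self.used(s) - self.guarantees.get(s, 0)
                    for s in {j.sla for j in self.running}}
            reclaimed = 0
            for cand in self.running:
                if reclaimed >= need:
                    break
                if cand.sla != job.sla and over[cand.sla] > 0:
                    victims.append(cand)
                    over[cand.sla] -= cand.slots
                    reclaimed += cand.slots
            if reclaimed < need:
                victims = []  # cannot free enough, so pre-empt nothing
        for v in victims:
            self.running.remove(v)
        if self.free() >= job.slots:
            self.running.append(job)
        return victims


cluster = Cluster(total_slots=100, guarantees={"gold": 60, "bronze": 0})
cluster.submit(Job("b1", "bronze", 40))  # borrows idle gold capacity
cluster.submit(Job("b2", "bronze", 40))
preempted = cluster.submit(Job("g1", "gold", 50))
print([j.name for j in preempted], [j.name for j in cluster.running])
```

With no gold work queued, the bronze jobs use 80 of the 100 cores instead of being fenced into a 40-core silo. When the gold job arrives, only as much borrowed capacity as it needs is taken back.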

The performance improvements moving from LSF 7 to LSF 8 on a given cluster will vary by jobs and system configuration, and there won't be much of an improvement if customers are already up near 100 per cent utilization. But Hertzler says that customers who currently get 60 to 70 per cent utilization on clusters running a large number of mixed workloads might be able to squeeze another 10 to 20 per cent utilization out of their clusters (and therefore get the same work done in a shorter period of time), and that is a significant improvement.
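To put rough numbers on "a shorter period of time", the Python snippet below works through the arithmetic. It assumes the extra 10 to 20 per cent means percentage points of utilization (the article does not say), that the amount of work is fixed, and that throughput scales linearly with utilization. The `relative_runtime` function is a hypothetical helper for this example.

```python
def relative_runtime(old_util: float, new_util: float) -> float:
    """Fraction of the original wall-clock time needed for the same work,
    assuming throughput is proportional to utilization."""
    return old_util / new_util


# Starting points and gains taken from the ranges quoted above.
for old, new in [(0.60, 0.70), (0.60, 0.80), (0.70, 0.80), (0.70, 0.90)]:
    ratio = relative_runtime(old, new)
    print(f"{old:.0%} -> {new:.0%}: {ratio:.1%} of the original time")
```

Under those assumptions, a cluster that moves from 60 to 80 per cent utilization finishes the same workload in three quarters of the time.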

LSF 8 also has a new administrative rights delegation feature, which gets the cluster administrator out of the politics of who gets to use what cluster when. Now, supercomputer center or business line managers who have access to the cluster can add and remove users from the list of people who have access to the cluster to submit jobs and determine the service level they want for specific jobs. The LSF administrator then gets back to the job of managing the cluster, not answering cranky phone calls from people who all think they deserve special treatment.

LSF 8 can dispatch work to clusters running various Linuxes, Unixes, and Windows operating systems as well as Mac OS X; you can see a full list of the supported platforms here. LSF 8 has the same price as LSF 7, and customers on a support contract with Platform Computing can upgrade at no charge. While Platform Computing provides pricing for its HPC 2.1 stack, it does not reveal its prices for the LSF tool, except to say it charges on a per-core basis with site-wide (and presumably volume discounted) licenses available. ®
