So you want to build the next Google. Who ya gonna call? Er, Big Blue?

IBM's cluster scheduler kicks OpenStack's Nova in teeth, eyes VMware

Analysis IBM has announced a new version of its Platform Resource Scheduler (PRS), which lines up jobs and resources in mammoth OpenStack Havana environments.

In doing so, Big Blue hopes to give enterprises a shot at achieving the same levels of efficiency as Google's highly tuned servers.

Though the tech competes against VMware's Distributed Resource Scheduler, it could become a credible general-purpose job scheduler to rival Google's secretive Borg and Omega systems, and the Apache Mesos project.

A resource scheduler and workload placer is a system that takes jobs, and figures out when to run them and where to run them to maximize IT utilization. It must also leave some spare capacity, rather than consume all the available infrastructure, to ensure there's redundancy to pick up from any failures. And it must hit its deadlines.
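
To make the spare-capacity point concrete, here is a minimal Python sketch of a placer that refuses to fill a node past a reserved margin. Everything in it is an assumption for illustration: the `Node` class, the `place` function and the 20 per cent headroom figure are hypothetical and are not taken from IBM's or Google's systems.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    total_cores: int
    used_cores: int = 0


def place(job_cores, nodes, headroom_pct=20):
    """Put the job on the first node that fits while keeping headroom_pct spare.

    Returns the chosen node, or None if the job must wait.
    """
    for node in nodes:
        # Cores we allow ourselves to use; the rest is held back for failover.
        limit = node.total_cores * (100 - headroom_pct) // 100
        if node.used_cores + job_cores <= limit:
            node.used_cores += job_cores
            return node
    return None


nodes = [Node("a", 10), Node("b", 10)]
placements = [place(4, nodes) for _ in range(5)]
names = [n.name if n else None for n in placements]
print(names)
```

With two 10-core nodes and a 20 per cent reserve, only eight cores per node are usable, so the fifth four-core job is left waiting rather than eating into the spare capacity.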

Google's Borg system is rumored to be so good at this task-juggling act that it saved the ad-slinger from building an entire data center.

IBM's resource scheduling tech is designed as a drop-in replacement for the scheduler within the Nova component of the open-source cloud manager OpenStack. Nova makes scheduling decisions according to information it stores during its setup, and it places jobs on compute nodes whose configurations pass various filters.
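
The filter-based approach described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not Nova's actual code or API: the host attributes, the `ram_filter` and `arch_filter` functions and the request format are all hypothetical.

```python
# Static host descriptions, as recorded at setup time (hypothetical values).
hosts = [
    {"name": "node1", "ram_gb": 64, "arch": "x86_64"},
    {"name": "node2", "ram_gb": 16, "arch": "x86_64"},
    {"name": "node3", "ram_gb": 128, "arch": "ppc64"},
]


def ram_filter(host, request):
    """Pass hosts with at least as much RAM as the request asks for."""
    return host["ram_gb"] >= request["ram_gb"]


def arch_filter(host, request):
    """Pass hosts whose CPU architecture matches the request."""
    return host["arch"] == request["arch"]


def filter_hosts(hosts, request, filters):
    """Keep only the hosts that pass every filter."""
    return [h for h in hosts if all(f(h, request) for f in filters)]


request = {"ram_gb": 32, "arch": "x86_64"}
candidates = filter_hosts(hosts, request, [ram_filter, arch_filter])
candidate_names = [h["name"] for h in candidates]
print(candidate_names)
```

Note that nothing here looks at how busy a host currently is: the decision rests entirely on recorded configuration.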

PRS, by contrast, uses the distributed agent framework in Big Blue's Platform Computing products, which considers realtime "machine and hypervisor loads" among other information when making decisions. Thus, PRS can look at the available compute capacity in realtime and make ongoing judgements when placing workloads. It can shift things around as needed using the underlying hypervisor's live migration ability.

"This means that as workloads and resources evolve, workload placement is automatically re-balanced," IBM marketing chap Gord Sissons told The Reg via email.

"The key benefits are: better quality of service in terms of performance and availability, because hypervisors are less likely to be over-subscribed; better utilization, since [virtual machines] can be packed more optimally while respecting service level requirements; and reduced administrator workload, since the re-balancing is automated.

"This is important as OpenStack environments get large. The real 'intellectual property' in the offering is in the pre-configured policies - the idea is that a cloud administrator can simply specify a policy like 'load balancing' or 'packing', and the scheduler will automatically seek to achieve the goal of the policy."
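
IBM has not published how its policies work internally, so the following Python sketch only shows one plausible reading of the two policy names Sissons mentions. The `choose_host` function, the policy strings and the free-core figures are all assumptions made for illustration.

```python
def choose_host(free_cores, vm_cores, policy):
    """Pick a host for a VM under a named placement policy.

    free_cores maps host name -> currently free cores.
    Returns a host name, or None if no host has room.
    """
    fits = {host: free for host, free in free_cores.items() if free >= vm_cores}
    if not fits:
        return None
    if policy == "packing":
        # Fill the fullest host that still fits, leaving others empty.
        return min(fits, key=fits.get)
    if policy == "load_balancing":
        # Spread work by choosing the emptiest host.
        return max(fits, key=fits.get)
    raise ValueError(f"unknown policy: {policy}")


free = {"node1": 12, "node2": 4, "node3": 8}
packed = choose_host(free, 4, "packing")
balanced = choose_host(free, 4, "load_balancing")
print(packed, balanced)
```

The same request lands on different hosts depending on the policy, which is the point of letting an administrator name a goal instead of hand-tuning placement rules.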

It'll babysit your 50,000 cores. If you can afford it

It's worth noting that this system is unlikely to have the capabilities of Google's Omega system, which is believed to draw on CPU-core-level telemetry from a system named CPI2, along with other Chocolate Factory innovations.

However, by drawing on other IBM technology such as Platform Symphony, it is able to gain some advanced abilities, such as the aforementioned distributed agent-based scheduling, which (we're told) lets IBM's tech "opportunistically 'borrow' resources not in use by different tenants - loaning, borrowing and pre-emption policies are specified in flexible resource sharing plans that can vary with time."

The whole system can also sit on top of IBM's well-regarded General Parallel File System, which gives it some capabilities more advanced than the main open-source equivalent, the Hadoop Distributed File System. Google is likely to field its own tech in this arena, but has published very little on it.

From what we understand, these capabilities mean IBM's PRS is more advanced than parts of the open-source Apache Mesos project – though at the cost of being proprietary and hence only having one major developer (IBM) driving the project.

One drawback of Big Blue's approach is its dependence on full virtualization, which adds overhead whenever information passes between two VMs on the same server. Omega and Mesos, by comparison, can make kernel-level direct transfers thanks to containerization via cgroups and similar mechanisms.

IBM says it already has some customers running in the range of 50,000 cores – hardly Google, but not insignificant.

Though the technology strikes this hack as being handy for the few companies out there with boisterous, instance-filled OpenStack environments not already under some kind of scheduler, it seems unlikely it can maintain feature parity with the open-source scheduler and resource placer Apache Mesos.

Mesos is already in wide use at Twitter – the company hired Benjamin Hindman, co-creator of the tech, recently – and has also been used by trendy room-renting network Airbnb. IBM argues that the Mesos project as it stands is immature – true, but with hefty resources behind it, that may not remain the case.

The prerequisites for enterprises wanting to have a nibble at IBM's answer to Google's most advanced system are IBM Power Systems or IBM System x hardware (including iDataPlex), Red Hat Enterprise Linux 6.3, and IBM SmartCloud Entry V3.2.

Though many view IBM's recent OpenStack love-in as more marketing than substance, this release shows that in some parts of Big Blue's titanic organization, some very clever people are working to supercharge the open-source project – for a price. ®
