So you want to build the next Google. Who ya gonna call? Er, Big Blue?

IBM's cluster scheduler kicks OpenStack's Nova in teeth, eyes VMware

Analysis IBM has announced a new version of its Platform Resource Scheduler (PRS), which lines up jobs and resources in mammoth OpenStack Havana environments.

In doing so, Big Blue hopes to give enterprises a shot at achieving the same levels of efficiency as Google's highly tuned servers.

Though the tech competes against VMware's Distributed Resource Scheduler, it could become a credible general-purpose job scheduler to rival Google's secretive Borg and Omega systems, and the Apache Mesos project.

A resource scheduler and workload placer is a system that takes jobs, and figures out when to run them and where to run them to maximize IT utilization. It must also leave some spare capacity, rather than consume all the available infrastructure, to ensure there's redundancy to pick up from any failures. And it must hit its deadlines.

Google's Borg system is rumored to have been so good at this task-juggling act that it saved the ad-slinger from building an entire data center.

IBM's resource scheduling tech is designed as a drop-in replacement for the scheduler within the Nova component of the open-source cloud manager OpenStack. Nova makes scheduling decisions according to information it stores during its setup, and it selects compute nodes for jobs by matching the nodes' configurations against various filters.
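To make the filter-based approach concrete, here is a minimal sketch in Python. It is not Nova's code: the `Host` and `Request` types, the two filter functions, and the sample figures are all hypothetical, chosen only to show how a scheduler can narrow a list of compute nodes using static capacity data.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical record of what the scheduler knows about a compute node.
    name: str
    free_vcpus: int
    free_ram_mb: int


@dataclass
class Request:
    # Hypothetical description of what a new instance needs.
    vcpus: int
    ram_mb: int


def cpu_filter(host, req):
    """Pass only hosts with enough free virtual CPUs."""
    return host.free_vcpus >= req.vcpus


def ram_filter(host, req):
    """Pass only hosts with enough free memory."""
    return host.free_ram_mb >= req.ram_mb


def filter_hosts(hosts, req, filters):
    """Keep the hosts that satisfy every filter in the chain."""
    return [h for h in hosts if all(f(h, req) for f in filters)]


hosts = [
    Host("node-a", free_vcpus=2, free_ram_mb=4096),
    Host("node-b", free_vcpus=8, free_ram_mb=16384),
    Host("node-c", free_vcpus=16, free_ram_mb=2048),
]
req = Request(vcpus=4, ram_mb=8192)

candidates = filter_hosts(hosts, req, [cpu_filter, ram_filter])
print([h.name for h in candidates])  # ['node-b']
```

Note that nothing here looks at how busy a node is at the moment of placement; the decision rests entirely on recorded capacity figures.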

PRS, by contrast, uses the distributed agent framework in Big Blue's Platform Computing products, which considers realtime "machine and hypervisor loads" among other information when making decisions. Thus, PRS can look at the available compute capacity in realtime and make ongoing judgements when placing workloads. It can shift things around as needed using the underlying hypervisor's live migration ability.

"This means that as workloads and resources evolve, workload placement is automatically re-balanced," IBM marketing chap Gord Sissons told The Reg via email.

"The key benefits are: better quality of service in terms of performance and availability, because hypervisors are less likely to be over-subscribed; better utilization, since [virtual machines] can be packed more optimally while respecting service level requirements; and reduced administrator workload, since the re-balancing is automated.

"This is important as OpenStack environments get large. The real 'intellectual property' in the offering is in the pre-configured policies - the idea is that a cloud administrator can simply specify a policy like 'load balancing' or 'packing', and the scheduler will automatically seek to achieve the goal of the policy."
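The difference between a "packing" and a "load balancing" policy can be illustrated with a short Python sketch. This is an assumption-laden toy, not IBM's implementation: the `place` function, the policy strings, and the host figures are invented for illustration, and load is reduced to a single number per host.

```python
def place(vm_load, hosts, policy):
    """Pick a host for a VM under a named policy.

    hosts maps a host name to a (used, capacity) pair.
    Returns the chosen host name, or None if the VM fits nowhere.
    """
    # Only consider hosts that can take the VM without exceeding capacity.
    fits = {
        name: used / capacity
        for name, (used, capacity) in hosts.items()
        if used + vm_load <= capacity
    }
    if not fits:
        return None
    if policy == "packing":
        # Fill the busiest host that still fits, leaving others empty.
        return max(fits, key=fits.get)
    if policy == "load_balancing":
        # Spread work by choosing the least-utilized host.
        return min(fits, key=fits.get)
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical current load, in arbitrary units.
hosts = {"node-a": (6, 8), "node-b": (2, 8), "node-c": (7, 8)}

print(place(2, hosts, "packing"))         # node-a
print(place(2, hosts, "load_balancing"))  # node-b
```

A real scheduler would re-run this kind of decision as load changes and migrate running VMs accordingly; the sketch covers only the initial placement choice.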

It'll babysit your 50,000 cores. If you can afford it

It's worth noting that this system is unlikely to have the capabilities of Google's Omega system, which is believed to draw on CPU-core-level telemetry from a system named CPI2, along with other Chocolate Factory innovations.

However, by drawing on other IBM technology such as Platform Symphony, it is able to gain some advanced abilities, such as the aforementioned distributed agent-based scheduling, which (we're told) lets IBM's tech "opportunistically 'borrow' resources not in use by different tenants - loaning, borrowing and pre-emption policies are specified in flexible resource sharing plans that can vary with time."

The whole system can also sit on top of IBM's well-regarded General Parallel File System, which gives it some capabilities more advanced than the main open-source equivalent, the Hadoop Distributed File System. Google is likely to field its own tech in this arena, but has published very little on it.

From what we understand, these capabilities mean IBM's PRS is more advanced than parts of the open-source Apache Mesos project – though at the cost of being proprietary and hence only having one major developer (IBM) driving the project.

One drawback of Big Blue's approach is its dependence on full virtualization, which adds overhead whenever information is passed between two VMs on the same server. Omega and Mesos, by contrast, can make kernel-level direct transfers thanks to containerization via cgroups and the like.

IBM says it already has some customers running in the range of 50,000 cores – hardly Google, but not insignificant.

Though the technology strikes this hack as being handy for the few companies out there with boisterous, instance-filled OpenStack environments not already under some kind of scheduler, it seems unlikely it can maintain feature parity with the open-source scheduler and resource placer Apache Mesos.

Mesos is already in wide use at Twitter – the company hired Benjamin Hindman, co-creator of the tech, recently – and has also been used by trendy room-renting network Airbnb. IBM argues that the Mesos project as it stands is immature – true, but with hefty resources behind it, that may not remain the case.

The prerequisites for enterprises wanting to have a nibble at IBM's answer to Google's most advanced system are IBM Power Systems or IBM System x (including iDataPlex) hardware, Red Hat Enterprise Linux 6.3, and IBM SmartCloud Entry V3.2.

Though many view IBM's recent OpenStack love-in as more marketing than substance, this release shows that in some parts of Big Blue's titanic organization, some very clever people are working to supercharge the open-source project – for a price. ®
