Univa skyhooks grids to clouds

Cloud control freak meets Grid Engine

Univa, the upstart HPC software company that forked Oracle's Grid Engine this January, has grafted it onto both the public Amazon EC2 cloud and onto private clouds based on the open source Eucalyptus framework that clones EC2.

Why on earth would you need to mix grid software, which harvests compute cycles from clusters of machines for high performance computing workloads, with clouds, which virtualize server instances and let you create and destroy them with ease? Because sometimes it costs less to be inefficient on the compute side and more efficient on the human side of running a cluster.

Long before there were virtualized clouds of compute and storage pools, gridding software such as Grid Engine was created to be a workload-management and job-scheduling layer on top of a cluster, generally to do massively parallel calculations on cheap server clusters or harvested cycles on PCs. This is all well and good, but provisioning the servers in a cluster to run programs such as Grid Engine is still a pain in the neck.

That's why Univa created its own homegrown provisioning tools for Sun/Oracle's Grid Engine or its own Univa Grid Engine 8.0, the fork of the Grid Engine project. But even if you use Univa's provisioning tools, they only lay Grid Engine down on bare-metal servers, and the underlying cluster itself – really its software stack – is not dynamic. It can't be rolled from one machine to another to, for example, get a performance boost from faster iron, nor can it be seamlessly burst out to public clouds such as Amazon's EC2 if the internal grid doesn't have enough oomph to do the crunching in the allotted time.

Perhaps more importantly, with grid software running on cloudy server infrastructure, you can use the cloud fabric as a workload manager – keeping workloads isolated while they run so they don't interfere with each other – much as a hypervisor and its virtual-machine containers have become a de facto workload manager/job scheduler for Windows and Linux operating systems.

All of these reasons are why customers using Grid Engine approached Univa to skyhook the grid software onto cloudy infrastructure. As Gary Tyreman, president and CEO at Univa, tells El Reg: "There are a lot of organizations that are trying to bring Eucalyptus and Grid Engine together in their virtual computing environment. That's why we are bullish about putting hypervisors under grids. But to be honest, virtualization is not a technology that a lot of HPC shops are overly familiar with."

This runs counter to the idea that HPC shops don't like to use virtualization because of the performance penalties it imposes for compute and, more importantly, network and disk I/O. Tyreman says that CPU overhead for virtualization was the bottleneck a few years back, but recent generations of Intel Xeon and AMD Opteron processors have integrated virtualization features that can minimize the CPU overhead to near nothing.

Univa conducted serial-workload tests running Grid Engine atop a cloud based on Oracle's Xen hypervisor – think electronic-design grids or life-sciences grids where there's not a lot of multithreading in the application and not a lot of communication across server nodes – and found that the CPU overhead averages somewhere around 2 per cent on a cluster based on modern Xeon 5500 or 5600 processors.

That's no big deal – and for those kinds of workloads, a cloud will help make the grid cluster easier to manage. In some cases, the Xen scheduler inside the hypervisor is actually better than the scheduler inside of Red Hat Enterprise Linux for a particular workload, and putting it on a cloud boosts performance by 2 to 4 per cent.

The marriage of grids and clouds is not for everyone – at least not yet. Tyreman says that on parallel HPC workloads, where you are using the message-passing interface (MPI) protocol to move data around the cluster as part of a simulation, the performance degradation of using virtualized server instances over bare-metal servers running Grid Engine can be on the order of 30 to 50 per cent. "The network I/O is what is so punishing," says Tyreman. "There are just so many layers of software."

The good news is that Univa can dispatch such parallel jobs to bare-metal Grid Engine machines. And as soon as I/O virtualization improves in processors and chipsets, that overhead will be greatly reduced as well. (That's the plan at Intel and AMD, at least.)
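To make that trade-off concrete, here is a minimal Python sketch of the placement rule Tyreman describes: serial jobs go to virtualized instances, MPI jobs stay on bare metal. It is not Univa's scheduler code. The `Job` class and both function names are hypothetical, and treating the quoted percentages as extra runtime is an assumption made only for illustration.

```python
from dataclasses import dataclass

# Figures quoted in the article; reading them as added runtime is an assumption.
SERIAL_OVERHEAD = 0.02            # ~2 per cent CPU overhead for serial jobs
MPI_OVERHEAD_RANGE = (0.30, 0.50)  # 30 to 50 per cent degradation for MPI jobs


@dataclass
class Job:
    """Hypothetical job description, not a Grid Engine data structure."""
    name: str
    uses_mpi: bool
    bare_metal_hours: float


def choose_target(job: Job) -> str:
    """Keep MPI jobs on bare metal; send serial jobs to the cloud."""
    return "bare-metal" if job.uses_mpi else "cloud"


def estimated_virtual_hours(job: Job) -> tuple[float, float]:
    """Return a (best, worst) runtime estimate if the job ran virtualized."""
    if job.uses_mpi:
        low, high = MPI_OVERHEAD_RANGE
    else:
        low = high = SERIAL_OVERHEAD
    return (job.bare_metal_hours * (1 + low),
            job.bare_metal_hours * (1 + high))


if __name__ == "__main__":
    for job in (Job("chip-sim", False, 100.0), Job("cfd-mpi", True, 10.0)):
        best, worst = estimated_virtual_hours(job)
        print(f"{job.name}: run on {choose_target(job)}; "
              f"virtualized estimate {best:.1f} to {worst:.1f} hours")
```

On these assumptions, a 10-hour MPI job would take somewhere between 13 and 15 hours on virtualized instances, which is why it stays on the bare-metal side of the grid.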

The linkage between the Amazon EC2, Cloud.com, and Rackspace Cloud public clouds and any internal Eucalyptus clouds is done through a piece of software called UniCloud, which was developed by Univa to help a customer set up a cloud on EC2 running Grid Engine.

UniCloud supports the deployment of Grid Engine inside of Xen or VMware ESXi containers in whatever format the cloud framework supports – Amazon Machine Images (AMIs) on EC2, and so forth. Tyreman says that Univa has not added support for KVM yet because although "it is good and it is fast, it is not yet enterprise-ready".

Univa Grid Engine costs $99 per core per year if you want to run it on your internal bare-metal or virtualized cluster. The UniCloud add-on brings the price of the base Grid Engine 8.0 license up to $150 per core per year. To burst Grid Engine out to Amazon's EC2, you have to buy the EC2 instances and then pay a 2-cents-per-hour premium on top of the Amazon price. (That pricing is for a small instance; obviously it costs more on a larger EC2 instance.) Univa gives volume pricing as well for both internal and public cloud installations. ®
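For a rough feel of what those list prices add up to, the arithmetic can be sketched in a few lines of Python. The function names are hypothetical, the Amazon hourly rate is a placeholder the caller must supply (the article does not give one), and the sketch assumes the 2-cent premium applies per small instance per hour and ignores volume discounts.

```python
# All amounts are in US cents to avoid floating-point rounding.
UGE_CENTS_PER_CORE_YEAR = 99_00        # Univa Grid Engine, internal cluster
UNICLOUD_CENTS_PER_CORE_YEAR = 150_00  # Grid Engine 8.0 plus UniCloud add-on
EC2_PREMIUM_CENTS_PER_HOUR = 2         # Univa premium on a small EC2 instance


def annual_license_cents(cores: int, with_unicloud: bool = False) -> int:
    """List-price licence cost for one year on an internal cluster."""
    rate = UNICLOUD_CENTS_PER_CORE_YEAR if with_unicloud else UGE_CENTS_PER_CORE_YEAR
    return cores * rate


def ec2_burst_cents(instances: int, hours: int, amazon_cents_per_hour: int) -> int:
    """Cost of bursting to small EC2 instances: Amazon's rate plus the premium.

    amazon_cents_per_hour is supplied by the caller; it is not from the article.
    """
    return instances * hours * (amazon_cents_per_hour + EC2_PREMIUM_CENTS_PER_HOUR)


def dollars(cents: int) -> str:
    """Format a cent amount as a dollar string."""
    return f"${cents / 100:,.2f}"


if __name__ == "__main__":
    print(dollars(annual_license_cents(64)))                      # $6,336.00
    print(dollars(annual_license_cents(64, with_unicloud=True)))  # $9,600.00
    # Placeholder Amazon rate of 10 cents per hour, purely for illustration.
    print(dollars(ec2_burst_cents(10, 100, 10)))                  # $120.00
```

So a 64-core internal grid lists at $6,336 a year for Grid Engine alone, or $9,600 with UniCloud, before any volume pricing Univa might offer.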
