Penguin puts Linux supercomputer in sky
InfiniBand with wings
Updated Hitching a ride on that ubiquitous cloud metaphor, Penguin Computing has unveiled a Linux supercomputer in the sky.
Today, the San Francisco-based outfit announced the debut of what it calls Penguin on Demand - POD, for short - a service that offers remote access to high-performance computing (HPC) Linux clusters. The idea is to provide researchers, engineers, and simulation scientists with the sort of number-crunching power they can't get from the typical so-called infrastructure cloud.
Not too surprisingly, Penguin paints its new service as something that goes above and beyond Amazon's Elastic Compute Cloud (EC2). Amazon does offer high-end number crunching through its Elastic MapReduce service - which runs the open-source Hadoop grid platform atop EC2 - but Penguin CEO Charles Wuischpard paints POD as something altogether different, choosing to compare it with the basic EC2 service.
"We've taken our expertise as HPC specialists and applied it to an on-demand model," says Charles Wuischpard, the CEO of Penguin Computing, which has spent the last decade selling HPC Linux clusters. "We were finding that engineers and scientists were going to Amazon and trying to run their code, but Amazon wasn't really designed to support engineering, scientific workloads. Everything we've done is designed to try to support those workloads in a very efficient way."
Brock Tice is one of those scientists. As vp of operations at the Baltimore, Maryland-based CardioSolv, he works to model, yes, the heart - simulating its mechanical and electrical activity. And though he can run some simulations on Amazon EC2 - or on individual local machines - more complex models require HPC. "We've tried on Amazon and it just doesn't scale," he tells The Reg. "We can run on single EC2 instances, but if we need to scale up to a dog or human heart, it's just impossible.
"The connections between Amazon's machines are Gigabit Ethernet and they're shared. If you fire up 10 machines and you want to run them like a cluster, some might be in the same rack, and others might be halfway across the data center, five or six switches away."
Tice and CardioSolv did test their simulations on Elastic MapReduce, which debuted only recently. But they've spent the past several months using Penguin's new service, which lets them tap a high-density Linux cluster without actually buying one. POD offers access to Linux boxes based on Intel's Xeon chip and Nvidia's Tesla supercomputing GPU. You can opt for InfiniBand interconnects as well as Gigabit Ethernet. And though this may stretch the cloud metaphor a bit, the service isn't virtualized. You're buying access to physical machines - in a single location.
"The closer you get to the hardware, the higher the performance is going to be," Wuischpard tells The Reg. "And this thing was designed as a supercomputer."
Though POD may eliminate the need for your own Linux cluster, Wuischpard is also pitching the idea of using the service in tandem with an existing local installation. "Most of our customers are gated by their budget or their floor-space or their power, and given their druthers they'd like to have more," he says. "Now, we can give them a cluster they can afford in-house as well as - for peak workloads or specialized simulations - the ability to access a much larger resource they could never afford if it wasn't made available on-demand."
It's the public cloud meets the private cloud all over again. If you can call this stuff cloudy. Lacking virtualization, POD isn't the dynamically scalable resource that EC2 is. It's a batch resource. Using a command-line interface, you put jobs into a queue, and it spits them back out.
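For readers who haven't used a batch system, the queue-and-wait model described above can be sketched in a few lines of Python. This is a toy illustration only: `Job`, `BatchQueue`, `submit`, and `drain` are hypothetical names invented here, not part of Penguin's command-line interface, and a real HPC scheduler would also handle priorities, node allocation, and multiple users.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple


@dataclass
class Job:
    """A unit of batch work (hypothetical, for illustration only)."""
    name: str
    run: Callable[[], object]  # the work performed when the job is dequeued


class BatchQueue:
    """Toy first-in, first-out batch queue; not Penguin's software."""

    def __init__(self) -> None:
        self._pending: Deque[Job] = deque()

    def submit(self, job: Job) -> None:
        # Jobs wait in submission order instead of running immediately.
        self._pending.append(job)

    def drain(self) -> List[Tuple[str, object]]:
        # Run every queued job in order and hand back (name, result) pairs.
        results: List[Tuple[str, object]] = []
        while self._pending:
            job = self._pending.popleft()
            results.append((job.name, job.run()))
        return results


queue = BatchQueue()
queue.submit(Job("sum-squares", lambda: sum(i * i for i in range(10))))
queue.submit(Job("count-evens", lambda: len([i for i in range(10) if i % 2 == 0])))
results = queue.drain()
print(results)
```

The point of the sketch is the contrast with EC2: nothing is provisioned on the fly, and work simply waits its turn on fixed hardware.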
Perhaps it's more reminiscent of IBM's Deep Computing Capacity on Demand - though Wuischpard argues that his service is, well, far superior. "The design point for them was to offer their older equipment as an on-demand resource," says the ex-IBMer. "So, fundamentally, they're offering slower, less capable machines than we're offering...it's not one of their main line pieces of business. It's more or less a sandbox off their research group.
"And we're able to marry it - in business sense - with our ability to deliver physical clusters to you as well." Using Penguin's existing cluster-management software, he continues, you can bring the public and the private under the same interface.
"You can take our software and include a new queue which is now the on-demand resource. So you can decide whether you want to run a job on a local machine or - when you need a lot more power - submit it up to the cloud, if you will."
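The local-versus-on-demand decision Wuischpard describes might look something like the following sketch. Everything in it is an assumption made for illustration: the queue names, the `ClusterJob` type, the `choose_queue` function, and the 64-core local cluster size are all hypothetical, and Penguin's cluster-management software may make this choice on entirely different criteria.

```python
from dataclasses import dataclass

# Assumed size of the in-house cluster; a made-up figure for illustration.
LOCAL_CORES = 64


@dataclass(frozen=True)
class ClusterJob:
    """A job described only by how many cores it needs (hypothetical)."""
    name: str
    cores_needed: int


def choose_queue(job: ClusterJob, local_cores: int = LOCAL_CORES) -> str:
    """Pick the local queue when the job fits, else the on-demand queue."""
    return "local" if job.cores_needed <= local_cores else "on-demand"


small = ClusterJob("single-cell-model", cores_needed=8)
large = ClusterJob("whole-heart-model", cores_needed=512)
print(choose_queue(small), choose_queue(large))
```

In practice a user might make the call by hand, as the quote suggests, rather than by a fixed core-count threshold.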
How much does it cost? Penguin isn't quite saying - though Wuischpard promises the service will be no more expensive than Amazon's high-end offerings. Amazon's most expensive Linux instances are priced at $0.80 an hour. ®
Update: This story has been updated to include mention of Amazon's Elastic MapReduce service.