Amazon rejigs EC2 to run parallel HPC apps

A veritable cluster

Online retailer and IT disrupter Amazon is getting its high performance computing act together on its Elastic Compute Cloud (EC2) service by allowing customers to spin up tightly coupled virtual server nodes to run real-world, parallel supercomputing applications.

On Tuesday, Amazon Web Services launched a new service called Cluster Compute Instances, which takes a bunch of x64 servers using Intel's Xeon processors and links them together using 10 Gigabit Ethernet interfaces and switches. As you can see from the Cluster Compute Instances sign-up page, the EC2 virtual server slices function just like any other sold by Amazon, except that the HPC variants have 10 Gigabit Ethernet links and a specific hardware profile, so propellerheads can seriously tune their applications to run well.

With other EC2 slices, you never know what specific iron you are going to get when you buy a small, medium, large, or extra large virtual slice rated at a certain number of EC2 compute units.

In the case of HPC-specific slices, Amazon is providing a slice that has a two-socket x64 server based on Intel's Xeon X5570, which has a clock speed of 2.93 GHz and 8 MB of on-chip cache memory. Those processors are in the quad-core "Nehalem-EP" family that was announced by Intel in March 2009, not the latest six-core "Westmere-EP" Xeon 5600s that debuted in March of this year. (Amazon could easily plug six-core Xeon 5600s into these machines, since they are socket compatible with the Xeon 5500s.)

This server represents an aggregate of 33.5 EC2 compute units and presents 23 GB of virtual memory to the HPC application running atop it. This is four times the extra large EC2 slice in terms of compute units, according to Amazon. The chips run in 64-bit mode, which is necessary to address more than 4 GB of memory in a node.

HPC shops are not generally keen on hypervisors because they eat CPU cycles and generally add network and storage I/O latencies, but at a certain price, some people will try anything and make do. Thus the Cluster Compute Instances on EC2 use the Amazon variant of the Xen hypervisor, running in Hardware Virtual Machine (HVM) mode, to virtualize the server's hardware. Amazon requires that the cluster nodes be loaded with an Amazon Machine Image (AMI) stored on Amazon's Elastic Block Store (EBS) storage cloud.

At the moment, Amazon is restricting the cluster size to eight instances, for a total of 64 cores. This is not a particularly large cluster, probably something on the order of 750 gigaflops of peak theoretical number-crunching oomph before you take out the overhead of virtualization. But it is more than a lot of researchers have on their workstations and PCs, and that is the point. If you want to get more oomph, you can request it.
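For those who want to check the sums, here is a minimal Python sketch of the peak figure for the eight-instance cluster. It assumes each Nehalem-EP core can retire four double-precision floating point operations per clock cycle, a number that is not given above, and the function and variable names are made up for illustration.

```python
# Back-of-envelope peak for an eight-instance Cluster Compute setup.
# Assumption (not stated in the article): a Nehalem-EP core peaks at
# 4 double-precision flops per clock cycle.

def peak_gflops(nodes, sockets_per_node, cores_per_socket, clock_ghz,
                flops_per_cycle=4):
    """Theoretical peak in gigaflops: total cores x clock x flops per cycle."""
    cores = nodes * sockets_per_node * cores_per_socket
    return cores * clock_ghz * flops_per_cycle

# Eight two-socket servers with quad-core Xeon X5570s at 2.93 GHz.
cluster_cores = 8 * 2 * 4
cluster_peak_gflops = peak_gflops(nodes=8, sockets_per_node=2,
                                  cores_per_socket=4, clock_ghz=2.93)

print(f"{cluster_cores} cores, {cluster_peak_gflops:.2f} gigaflops peak")
```

Under that assumption the result lands at about 750 gigaflops, before any virtualization overhead is taken out.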

Clearly larger configurations will not only be available, but are necessary. In the announcement, Lawrence Berkeley National Laboratory, which had been testing HPC applications on the EC2 cloud, said that the new Cluster Compute Instances had a factor of 8.5 times better performance than other EC2 instances that it had been testing. While LBNL was not specific, presumably it was using slow Gigabit Ethernet and perhaps less impressive iron. (Amazon had better hope that was the case).

Peter DeSantis, general manager of the EC2 service at Amazon, said that an 880 server sub-cluster was configured to run the Linpack Fortran benchmark test used to rank supercomputer power, and was able to deliver 41.82 teraflops (presumably sustained performance, not peak). If by "server" DeSantis meant a physical server, then roughly half of the peak flops in the machines are going up the chimney on the EC2 slices.

That sounds pretty awful, but if you sift through the latest Top 500 rankings to find an x64 cluster using 10 Gigabit Ethernet interconnects, you'll see the fattest one is the "Coates" cluster at Purdue University. It is based on quad-core Opterons running at 2.5 GHz, with 7,944 cores in total, and is rated at a peak 79.44 teraflops, but on the Linpack test it only delivers 52.2 teraflops. So 34 per cent of the flops on the unvirtualized cluster go up the chimney.
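The chimney losses on both machines can be worked out with a few lines of Python. This is only a sketch: it assumes each of the 880 servers is a physical two-socket, quad-core 2.93 GHz box (the article can only guess at this) and that each core peaks at four double-precision flops per cycle. The names used are illustrative.

```python
# Linpack efficiency = sustained teraflops / peak teraflops.
# Assumptions: the 880 "servers" are physical two-socket X5570 nodes,
# and each core peaks at 4 double-precision flops per clock cycle.

def linpack_efficiency(sustained_tflops, peak_tflops):
    """Fraction of theoretical peak actually delivered on Linpack."""
    return sustained_tflops / peak_tflops

# EC2 sub-cluster: 880 servers x 2 sockets x 4 cores x 2.93 GHz x 4 flops.
ec2_peak_tflops = 880 * 2 * 4 * 2.93 * 4 / 1000
ec2_efficiency = linpack_efficiency(41.82, ec2_peak_tflops)

# Coates at Purdue: peak and sustained figures as quoted from the Top 500.
coates_efficiency = linpack_efficiency(52.2, 79.44)

for name, eff in (("EC2", ec2_efficiency), ("Coates", coates_efficiency)):
    print(f"{name}: {eff:.1%} delivered, {1 - eff:.1%} up the chimney")
```

On those assumptions the EC2 sub-cluster delivers about 51 per cent of its peak and Coates about 66 per cent, so virtualization and whatever else is in the way cost Amazon roughly 15 points of efficiency against the bare-metal 10 Gigabit Ethernet machine.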

InfiniBand networks deliver a much better ratio because of their higher bandwidth and lower latency, which is why HPC shops prefer them and why Amazon will eventually have to offer InfiniBand too, if it wants serious HPC business. And eventually, Amazon will have to offer GPU co-processors as well, because codes are being adapted to use their relatively cheap teraflops.

As you can see from Amazon's EC2 price list, the Cluster Compute Instances cost $1.60 per hour for on-demand slices, which is actually quite a bit less than the $2.40 per hour Amazon is charging for generic quadruple extra large instances with fat memory. So it looks like Amazon understands that HPC shops are cheapskates compared to other kinds of IT organizations. If you want to reserve an HPC instance, you're talking $4,290 for a whole year and $6,590 for three years, plus 56 cents per hour of usage.
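Whether a reservation pays off depends on how many hours the slice actually runs. The Python sketch below works out the break-even point, assuming the 56 cents per hour usage charge applies to both the one-year and three-year reservations and ignoring storage and data transfer fees. The function names are invented for illustration.

```python
# Break-even between on-demand and reserved Cluster Compute pricing.
# Assumptions: the $0.56 hourly rate applies to both reservation terms,
# and no other charges (storage, data transfer) are counted.

ON_DEMAND_RATE = 1.60   # dollars per hour, on-demand
RESERVED_RATE = 0.56    # dollars per hour, once reserved
HOURS_PER_YEAR = 24 * 365

def on_demand_cost(hours):
    """Total on-demand bill for a given number of hours."""
    return ON_DEMAND_RATE * hours

def reserved_cost(upfront, hours):
    """Reservation fee plus the discounted hourly usage charge."""
    return upfront + RESERVED_RATE * hours

def break_even_hours(upfront):
    """Hours of use at which reserving becomes cheaper than on-demand."""
    return upfront / (ON_DEMAND_RATE - RESERVED_RATE)

for label, upfront, years in (("1-year", 4290, 1), ("3-year", 6590, 3)):
    hours = break_even_hours(upfront)
    share = hours / (HOURS_PER_YEAR * years)
    print(f"{label}: break-even after {hours:.0f} hours ({share:.0%} of the term)")
```

By this arithmetic a one-year reservation wins once the slice is busy a little under half the time, and a three-year reservation wins at roughly a quarter utilization.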

The HPC slices on EC2 are available running Linux operating systems and are for the moment restricted to the Northern Virginia region of Amazon's distributed data centers in the United States. (Right next to good old Uncle Sam.) No word on when the other regions in the US will get HPC slices, or when they will be available in other geographies. Amazon had not returned calls seeking more insight into how the service will be rolled out in its Northern California, Ireland, and Singapore data centers as El Reg went to press. ®
