Amazon rejigs EC2 to run parallel HPC apps

A veritable cluster

Online retailer and IT disrupter Amazon is getting its high performance computing act together on its Elastic Compute Cloud (EC2) service by allowing customers to spin up tightly coupled virtual server nodes to run real-world, parallel supercomputing applications.

On Tuesday, Amazon Web Services launched a new service called Cluster Compute Instances, which takes a bunch of x64 servers using Intel's Xeon processors and links them together over 10 Gigabit Ethernet interfaces and switches. As you can see from the Cluster Compute Instances sign-up page, the EC2 virtual server slices function just like any other sold by Amazon, except that the HPC variants have 10 Gigabit Ethernet links and a specific hardware profile, so propellerheads can seriously tune their applications to run well.
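For the curious, here is a rough sketch of how such a cluster might be spun up programmatically. It uses the modern boto3 Python SDK rather than the tooling of the day, and the AMI ID, the placement group name, and the cc1.4xlarge instance type are assumptions made for the illustration, not details taken from Amazon's announcement.

```python
# Rough sketch: launch eight tightly coupled HPC instances in one cluster
# placement group, so they land on the same 10 Gigabit Ethernet fabric.
# Uses boto3 (a modern SDK, not the 2010-era API tools); the AMI ID and
# group name are placeholders, and cc1.4xlarge is assumed to be the
# instance type for these HPC slices.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # Northern Virginia region

# A cluster placement group keeps the instances close together on the network.
ec2.create_placement_group(GroupName="hpc-cluster", Strategy="cluster")

ec2.run_instances(
    ImageId="ami-00000000",        # placeholder for an EBS-backed HPC AMI
    InstanceType="cc1.4xlarge",    # assumed HPC slice type
    MinCount=8,
    MaxCount=8,                    # the current cap of eight instances
    Placement={"GroupName": "hpc-cluster"},
)
```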

With other EC2 slices, you never know what specific iron you are going to get when you buy a small, medium, large, or extra large virtual slice rated at a certain number of EC2 compute units.

In the case of HPC-specific slices, Amazon is providing a slice that has a two-socket x64 server based on Intel's Xeon X5570, which has a clock speed of 2.93 GHz and 8 MB of on-chip cache memory. Those processors are in the quad-core "Nehalem-EP" family that was announced by Intel in March 2009, not the latest six-core "Westmere-EP" Xeon 5600s that debuted in March of this year. (Amazon could easily plug six-core Xeon 5600s into these machines, since they are socket compatible with the Xeon 5500s.)

This server represents an aggregate of 33.5 EC2 compute units and presents 23 GB of virtual memory to the HPC application running atop it. That is four times the extra large EC2 slice in terms of compute units, according to Amazon. The chips run in 64-bit mode, which is necessary to address more than 4 GB of memory in a node.

HPC shops are not generally keen on hypervisors, because they eat CPU cycles and add network and storage I/O latencies. But at a certain price some people will try anything and make do, and thus the Cluster Compute Instances on EC2 are based on Amazon's variant of the Xen hypervisor (called Hardware Virtual Machine, or HVM) to virtualize the server's hardware. Amazon requires that the cluster nodes be loaded with an Amazon Machine Image (AMI) stored on its Elastic Block Storage (EBS) cloud.

At the moment, Amazon is restricting the cluster size to eight instances, for a total of 64 cores. This is not a particularly large cluster, probably something on the order of 750 gigaflops of peak theoretical number-crunching oomph before you take out the overhead of virtualization. But it is more than a lot of researchers have on their workstations and PCs, and that is the point. If you want to get more oomph, you can request it.
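Here's the back-of-the-envelope arithmetic behind that 750 gigaflops figure, assuming four double-precision flops per clock per Nehalem core:

```python
# Peak theoretical flops for the eight-instance cap, assuming four
# double-precision flops per clock per core on the Xeon X5570.
instances = 8
cores_per_instance = 2 * 4          # two quad-core X5570 sockets per node
clock_ghz = 2.93
flops_per_clock = 4

peak_gflops = instances * cores_per_instance * clock_ghz * flops_per_clock
print(peak_gflops)                  # ~750 gigaflops, before virtualization overhead
```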

Clearly, larger configurations will not only be available, but will be necessary. In the announcement, Lawrence Berkeley National Laboratory, which had been testing HPC applications on the EC2 cloud, said that the new Cluster Compute Instances delivered 8.5 times better performance than the other EC2 instances it had been testing. While LBNL was not specific, presumably it was using slow Gigabit Ethernet and perhaps less impressive iron. (Amazon had better hope that was the case.)

Peter De Santis, general manager of the EC2 service at Amazon, said that an 880-server sub-cluster was configured to run the Linpack Fortran benchmark used to rank supercomputer power, and was able to deliver 41.82 teraflops (presumably sustained performance, not peak). If by "server" De Santis meant a physical server, then roughly half of the peak flops in the machines are going up the chimney on the EC2 slices.

That sounds pretty awful, but sift through the latest Top 500 rankings for an x64 cluster using 10 Gigabit Ethernet interconnects and you'll see the fattest one is the "Coates" cluster at Purdue University. It is based on quad-core Opterons running at 2.5 GHz, has 7,944 cores, and is rated at a peak of 79.44 teraflops, but on the Linpack test it delivers only 52.2 teraflops. So 34 per cent of the flops on the unvirtualized cluster go up the chimney.
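The efficiency arithmetic for the two clusters works out like this, assuming four flops per clock per core on both the Xeons and the Opterons:

```python
# Linpack efficiency for the two clusters, assuming four flops per clock
# per core on both the Xeon and Opteron machines.
def peak_tflops(cores, clock_ghz, flops_per_clock=4):
    return cores * clock_ghz * flops_per_clock / 1000.0

# Amazon's 880-server run: eight cores per server at 2.93 GHz.
ec2_peak = peak_tflops(880 * 8, 2.93)       # ~82.5 teraflops peak
print(41.82 / ec2_peak)                     # ~0.51, so roughly half is lost

# Purdue's Coates cluster: 7,944 Opteron cores at 2.5 GHz.
coates_peak = peak_tflops(7944, 2.5)        # ~79.4 teraflops peak
print(52.2 / coates_peak)                   # ~0.66, so about 34 per cent is lost
```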

InfiniBand networks deliver a much better ratio because of their higher bandwidth and lower latency, which is why HPC shops prefer them and why Amazon will eventually have to offer InfiniBand too, if it wants serious HPC business. Eventually, Amazon will have to offer GPU co-processors as well, because codes are being adapted to use their relatively cheap teraflops.

As you can see from Amazon's EC2 price list, the Cluster Compute Instances cost $1.60 per hour for on-demand slices, which is actually quite a bit less than the $2.40 per hour Amazon is charging for generic quadruple extra large instances with fat memory. So it looks like Amazon understands that HPC shops are cheapskates compared to other kinds of IT organizations. If you want to reserve an HPC instance for a whole year, you're talking $4,290 up front; for three years, it's $6,590, plus 56 cents per hour of usage.
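For the cheapskates doing the sums, here is a quick break-even sketch using the prices above; it assumes the 56-cent hourly usage fee applies to both reservation terms and that the on-demand rate stays put:

```python
# Break-even point for reserving an HPC slice versus paying on demand,
# assuming the 56-cent usage fee applies to both reservation terms.
on_demand = 1.60        # dollars per hour, on demand
reserved_rate = 0.56    # dollars per hour once reserved
one_year_fee = 4290.0   # up-front fee for a one-year reservation
three_year_fee = 6590.0 # up-front fee for a three-year reservation

hourly_saving = on_demand - reserved_rate
print(one_year_fee / hourly_saving)    # ~4,125 hours, roughly 172 days of solid use
print(three_year_fee / hourly_saving)  # ~6,337 hours, well under a year of solid use
```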

The HPC slices on EC2 are available running Linux operating systems and are for the moment restricted to the Northern Virginia region of Amazon's distributed data centers in the United States. (Right next to good old Uncle Sam.) There's no word on when the other US regions get HPC slices, or when they will be available in other geographies. Amazon had not, as El Reg went to press, returned calls seeking more insight into how the service will be rolled out in its Northern California, Ireland, and Singapore data centers. ®
