Original URL: http://www.theregister.co.uk/2011/09/22/appro_sdsc_gordon_supercomputer/

Work begins on radical Gordon super flash-computer

One of Intel's secret Xeon E5 testbeds, perhaps?

By Timothy Prickett Morgan

Posted in HPC, 22nd September 2011 16:28 GMT

Supercomputer maker Appro International has finally begun building the "Gordon" flash-heavy supercomputer at the San Diego Supercomputer Center. The machine, which was funded by a $20m grant from the National Science Foundation nearly two years ago, is a testbed to analyze what happens when you bring the I/O and floating point operations in a parallel supercomputer into balance.

Gordon has been waiting for Intel's "Sandy Bridge-EP" Xeon E5 processors and its "Lyndonville" Series 710 solid state drives. The latter was announced at Intel Developer Forum in San Francisco last week, while the former was not. Sort of.

Intel has said that the Xeon E5 processors, which sport on-chip PCI-Express peripheral controllers and Advanced Vector Extensions (AVX) for processing up to 8 floating point operations per clock, are actually shipping to selected cloud and HPC customers but will not be formally announced until next year.

It looks like SDSC might be getting early access to the Xeon E5 chips, since Allan Snavely, associate director at SDSC and co-principal investigator for the Gordon system, said that the SDSC hopes to have it up and running by January 2012.

Bang for the IOPS

The bulk of the flash storage in the Gordon system is not going into the server processing nodes, says Snavely, but rather in dedicated I/O nodes that use current two-socket "Westmere-EP" Xeon 5600 motherboards and el cheapo LSI disk controllers to play traffic cop for the 710 SSDs.

SDSC has picked the 300GB version of the Series 710 SSDs, which come in a 2.5-inch form factor and which link to the system board in the I/O node through a 3Gb/sec SATA interface. The 710 SSDs are based on 25 nanometer MLC flash and deliver 2,700 random write I/O operations per second (IOPS) and 38,500 random read IOPS using 4KB data chunks; they cost $1,929 each when bought in 1,000-unit quantities.
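
The list price and IOPS ratings quoted here are enough for a rough price/performance check of the kind Snavely describes later. What follows is only a back-of-the-envelope sketch: it uses the 1,000-unit list price, not whatever SDSC actually pays under its Intel partnership, and the function and variable names are ours.

```python
# Figures quoted for the 300GB Intel Series 710 SSD.
LIST_PRICE_USD = 1_929        # 1,000-unit list price, not SDSC's negotiated price
CAPACITY_GB = 300
RANDOM_READ_IOPS = 38_500     # 4KB random reads
RANDOM_WRITE_IOPS = 2_700     # 4KB random writes


def price_performance(price_usd=LIST_PRICE_USD):
    """Return dollars per GB and dollars per IOPS for one drive."""
    return {
        "usd_per_gb": price_usd / CAPACITY_GB,
        "usd_per_read_iops": price_usd / RANDOM_READ_IOPS,
        "usd_per_write_iops": price_usd / RANDOM_WRITE_IOPS,
    }


for metric, value in price_performance().items():
    print(f"{metric}: ${value:.3f}")
```

At list price that works out to about $6.43 per GB, roughly five cents per random read IOPS, and about 71 cents per random write IOPS.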

The Gordon machine will have a total of 64 I/O nodes, each of which is based on a 3U rack server from Appro. Each node has over 4,200GB of flash capacity (14 flash drives, for 539,000 aggregate IOPS) and a number of controllers to link the flash through the motherboard to ConnectX-3 host adapters from Mellanox Technologies, exposing the drives to the 3D torus interconnect of the supercomputer.
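
The per-node figures follow directly from the drive specs. Here is a minimal sketch of that arithmetic in Python; the names are ours, and the simple multiplication assumes the controllers and motherboard add no bottleneck, which is exactly what the testbed is meant to find out.

```python
# Per-drive figures quoted for the 300GB Intel Series 710 SSD.
DRIVE_CAPACITY_GB = 300
DRIVE_RANDOM_READ_IOPS = 38_500   # 4KB random reads
DRIVES_PER_IO_NODE = 14


def io_node_totals(drives=DRIVES_PER_IO_NODE):
    """Return (flash capacity in GB, aggregate random read IOPS) for one I/O node."""
    capacity_gb = drives * DRIVE_CAPACITY_GB
    read_iops = drives * DRIVE_RANDOM_READ_IOPS
    return capacity_gb, read_iops


capacity_gb, read_iops = io_node_totals()
print(f"{capacity_gb:,}GB of flash, {read_iops:,} random read IOPS per I/O node")
```

Fourteen drives give 4,200GB and 539,000 random read IOPS, matching the numbers quoted for each node.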

The flash drives are aggregated on I/O nodes so you can create a larger file stored on the I/O node than you would be able to on a single server, and the latency from the server out over the InfiniBand network to external storage is not all that large thanks to the peppiness of flash storage.

This is thanks to using Quad Data Rate (QDR) InfiniBand, which runs at 40Gb/sec, as well as having the InfiniBand cards plug into the on-chip PCI-Express controllers in the Xeon E5 processors, which deliver up to 80GB/sec of bandwidth into each processor socket, according to specs El Reg published earlier this year on the not-yet-announced processors.

Interestingly, Snavely says that SDSC started out thinking it would have to buy high-end SAS or SATA controllers with their own processing to handle the I/O bandwidth on all of that flash. But it soon discovered that even if these are not the peppiest SSDs on the market, the Intel 710s can quickly choke even a high-end SATA controller and that the best thing to do was put a dumb controller into the server and let the Xeon 5600s handle the I/Os coming into and out of the I/O nodes. Snavely says that the 1.1 petabytes of write endurance of the 710 SSDs is more than enough to outlast the expected three to four year life of the Gordon super.
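
The endurance claim can be sanity-checked against the drive's rated write speed. The sketch below is our own rough estimate, not SDSC's analysis, and it leans on several assumptions: that the 1.1PB figure applies per drive, that a petabyte is 10^15 bytes, that "4KB" means 4,096 bytes, and that the only writes are 4KB random writes at the rated 2,700 IOPS.

```python
PB = 10**15                           # assumption: decimal petabyte
WRITE_ENDURANCE_BYTES = 1.1 * PB      # assumption: the 1.1PB rating is per drive
RANDOM_WRITE_IOPS = 2_700             # rated 4KB random write IOPS
BLOCK_BYTES = 4 * 1024                # assumption: "4KB" means 4,096 bytes
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_wear_out(duty_cycle=1.0):
    """Years until the endurance rating is used up.

    duty_cycle is the fraction of time the drive spends writing
    at its full rated random write speed (hypothetical parameter).
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    bytes_per_second = RANDOM_WRITE_IOPS * BLOCK_BYTES * duty_cycle
    return WRITE_ENDURANCE_BYTES / bytes_per_second / SECONDS_PER_YEAR


for duty in (1.0, 0.5, 0.25):
    print(f"duty cycle {duty:.0%}: {years_to_wear_out(duty):.1f} years")
```

Under these assumptions a drive written flat out around the clock would last a little over three years, so any realistic workload that mixes in reads and idle time should stretch comfortably past the machine's expected life.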

As El Reg pointed out in its coverage of the 710 SSD launch last week, these flash drives are not speed demons by comparison with some of the other enterprise-grade SSDs out there on the market in PCI-Express and drive bay form factors. But it's tough to beat Intel on dollars per GB or dollars per IOPS, explained Snavely, one of whose PhD students just graduated with a thesis that included a thorough price/performance analysis of flash technologies.

"We did an extremely comprehensive flash analysis," says Snavely. "There are some good ones out there, but looking at price per IOPS, Intel SSDs were among the best. And we also have a partnership with Intel, which gives us a good price."

Bang for the flops

The 1,024 server nodes in the Gordon supercomputer will also have flash and will be based on a future GreenBlade blade server from Appro packaged up in its Xtreme-X chassis. Each node gets an Intel X25-M2 SSD, which is used to store the operating system image for the node. Those servers are configured with a fairly modest 64GB of main memory.

All told, the Gordon machine will have well over the contracted 200 teraflops of floating point performance as well as 35 million aggregate IOPS of random flash read performance across its I/O nodes. The cluster will run SDSC's own derivative of Red Hat Enterprise Linux and its open source Rocks cluster management software.
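
Those machine-wide totals can be rebuilt from the per-node numbers. The sketch below is our own arithmetic with our own names; it assumes every I/O node delivers its full 539,000 random read IOPS at once and spreads the contracted flops evenly across the server nodes.

```python
IO_NODES = 64
READ_IOPS_PER_IO_NODE = 539_000   # 14 drives x 38,500 random read IOPS
SERVER_NODES = 1_024
CONTRACTED_TFLOPS = 200           # contracted floor, not measured performance

# Aggregate random read IOPS across all I/O nodes.
aggregate_read_iops = IO_NODES * READ_IOPS_PER_IO_NODE

# Minimum gigaflops each server node must supply to hit the contract.
min_gflops_per_node = CONTRACTED_TFLOPS * 1_000 / SERVER_NODES

print(f"aggregate random read IOPS: {aggregate_read_iops:,}")
print(f"minimum gigaflops per server node: {min_gflops_per_node:.1f}")
```

The IOPS total comes to 34,496,000, which rounds to the 35 million quoted, and each server node needs to deliver a bit more than 195 gigaflops to meet the contracted figure.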

Rather than use a fat-tree network, Gordon uses a 3D torus interconnect that will, in theory, allow it to be extended more easily, but there is no indication at this time that the machine will be upgraded. The InfiniBand network has two rails for redundancy and multipathing between the server and I/O nodes.

SDSC is going to use ScaleMP's vSMP virtual symmetric multiprocessing to create fat memory nodes with 2TB of main memory on the fly when workloads call for it. The machine will link to an external 4PB Lustre disk file system that can pump data into Gordon at a sustained 100GB/sec data rate.
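
Two quick calculations put those numbers in perspective. This is only a sketch under our own assumptions: a fat node's memory comes entirely from 64GB server nodes, 1TB is 1,024GB, 1PB is 1,000,000GB, vSMP's own memory overhead is ignored, and the Lustre file system sustains its full rated speed.

```python
import math

NODE_MEMORY_GB = 64
FAT_NODE_MEMORY_GB = 2 * 1024          # assumption: 1TB = 1,024GB
LUSTRE_CAPACITY_GB = 4 * 1_000_000     # assumption: 1PB = 1,000,000GB
LUSTRE_RATE_GB_PER_SEC = 100           # sustained rate quoted for the file system

# Server nodes that must be ganged together to reach 2TB of memory.
nodes_per_fat_node = math.ceil(FAT_NODE_MEMORY_GB / NODE_MEMORY_GB)

# Time to stream the whole file system into Gordon at the sustained rate.
hours_to_stream_all = LUSTRE_CAPACITY_GB / LUSTRE_RATE_GB_PER_SEC / 3600

print(f"server nodes per 2TB fat node: {nodes_per_fat_node}")
print(f"hours to stream 4PB at 100GB/sec: {hours_to_stream_all:.1f}")
```

On these assumptions a 2TB fat node spans 32 server nodes, and reading the entire 4PB file system at full speed would take about 11 hours.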

So a lot of new ideas are being tested in the Gordon machine all at the same time. That is what NSF grants to academia are all about. ®