HP sprinkles GPU chips on new cookie sheet servers
Building block for 2.4 petaflops Tsubame 2 super
Hewlett-Packard jumps into the CPU-GPU fray today from Barcelona, Spain, where it is launching its second generation of cookie sheet servers, the SL6500 Scalable System.
The cookie sheet servers are designed by HP for hyperscale customers who need more density than standard rack-based servers can deliver, but who are also working on tight budgets and certainly do not want to pay a premium for the density of commercialised blade servers.
In response, server makers including HP, Dell, Silicon Graphics, IBM, Super Micro and a few others have created designs that pay homage to the minimalist home-grown server blueprints of Google. It was the search giant which initially slapped a motherboard on a rubber mat on a cookie sheet and said to hell with the whole server chassis thing.
Personally, in line with my own minimalist approach in my own data centre closet, I went so far as to leave the motherboard on the foam packing in which it was sent from the vendor and just slapped running servers on a shelf - the top of a half-rack enclosure, or any other flat space near a power outlet will do - leaving the disk drives to dangle off to the side as they will.
The server doesn't mind being naked at all. But you can't pack them in densely if you strew them like dirty laundry in a teenager's bedroom. And that means making some kind of chassis into which the servers can neatly slide.
HP's initial cookie sheet servers were announced last June and were called the ProLiant SL6000s. They consisted of the z6000 2U chassis, which has room for four half-width, 1U-high server trays. The servers are 31 per cent lighter, 10 per cent less expensive, and more energy efficient. This is because the tray servers share larger and very efficient fans and power supplies instead of dedicating smaller fans and supplies to individual servers.
The SL6000s initially supported three Xeon 5500-based half-width servers: the SL160z (144GB max and two 3.5-inch disks), the SL170z (128GB max and six 3.5-inch disks), and the SL2x170z (which put two half-width nodes on a single tray with 128GB of memory max, one 3.5-inch disk per server, and room for other peripherals). In November, at the SC09 supercomputing trade show, HP delivered an Opteron-based variant of the first Xeon-based tray, called the SL165z and based on Advanced Micro Devices' six-core "Istanbul" Opteron 2400 processors.
Today's SL6500 cookie sheet servers are bigger, use more modern processors, are more energy efficient (with power supplies that are rated at over 94 per cent efficiency at 50 per cent plus load), and include InfiniBand and 10 Gigabit Ethernet switching right on the server nodes - no additional adapter and PCI-Express slot required.
As with the SL6000s, the SL6500s put the networking, disks, and server trays all in the front so they can be accessed from the cold aisles in data centres. Fans and power supplies are still only accessible from the hot aisle. The s6500 chassis is twice as tall as the z6000, at 4U of rack space, but the server trays come in 1U and now 2U heights. (You have to do funny things to a system board to get it flatter than 1U, but as T-Platforms has shown with its T-Blade 2 blade servers, you can get quite skinny if you try.)
The server nodes all plug into a shared power supply, and customers can pick from units rated at 460, 750, and 1,200 watts. There is room for redundant power supplies in the chassis, but the assumption at many hyperscale sites where the SL6500s will be sold is that server nodes are disposable and the high availability is built into the software stack. (Not so in a lot of HPC workloads, where a server crash might cause a delay in processing or perhaps even force a rollback to a checkpoint in a job that could run for weeks or months.) The s6500 chassis costs $1,099.
The HP ProLiant SL6500 Scalable System
The SL6500 system has three different half-width server nodes. The first one is the bare-bones two-socket tray, the SL170s G6. (And no, this is not exactly the same as the SL170z tray server used in the earlier Easy Bake chassis.) The SL170s G6 tray server is based on Intel's 5520 chipset and supports either the quad-core Xeon 5500 or six-core Xeon 5600 processors.
It has 16 DDR3 memory slots, for a maximum of 128GB using 8 GB memory sticks or 192GB using 16GB sticks. (The Xeon 5500/5600 memory controller tops out at 96GB per socket, so you can't boost memory to 128GB per socket using 16GB sticks.) The server has an embedded HP Smart Array B110i RAID disk controller, and room for two 3.5-inch or four 2.5-inch SAS or SATA drives. The tray server comes with a single PCI-Express 2.0 x16 peripheral slot, a dual-port Gigabit Ethernet controller on the system board, and the bare-bones Lights Out 100i management controller.
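The memory arithmetic above can be sketched in a few lines of Python. This is only an illustration of the sums in the text: the function name and its parameters are made up for the example, and the 96GB-per-socket cap is simply the figure quoted above.

```python
def max_memory_gb(slots: int, stick_gb: int,
                  sockets: int = 2, per_socket_cap_gb: int = 96) -> int:
    """Usable memory: what the slots can hold, capped by the memory controllers.

    The defaults (two sockets, 96GB per socket) are the figures quoted in
    the article; the function itself is a hypothetical helper.
    """
    installed = slots * stick_gb                    # what physically fits
    controller_limit = sockets * per_socket_cap_gb  # what the CPUs can address
    return min(installed, controller_limit)


if __name__ == "__main__":
    # SL170s G6: 16 DDR3 slots, two sockets
    for stick in (8, 16):
        print(f"{stick}GB sticks: {max_memory_gb(16, stick)}GB")
```

With 8GB sticks the slots are the limit (16 x 8 = 128GB); with 16GB sticks the sticks add up to 256GB, but the two memory controllers cap the box at 192GB.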
With a single 2.4 GHz, four-core Xeon E5620, 6 GB of memory, and no disk, the SL170s G6 server costs $1,559. In a beefier configuration with two six-core Xeon X5670 processors (running at 2.93 GHz) and 24GB of memory, the SL170s G6 server costs $5,979.
The next new tray server is the 1U high version of the SL390s G7. This tray is based on the same Intel chipsets and processors, but has only a dozen memory slots for a maximum of 96GB using 8GB sticks and 192GB using 16GB sticks. The SL390s G7 cookie sheet server has the same disk controller and disk options as the bare-bones SL170s G6 machine as well as the dual-port Gigabit Ethernet NIC, and the single x16 peripheral slot. However, this tray server also has HP's Integrated Lights Out 3 (iLO 3) service processor and more sophisticated DCMI 1.0/IPMI 2.0 management tools.
The HP ProLiant SL390s G7 1U server tray
And finally, this tray has one other interesting option: an integrated ConnectX-2 VPI adapter from Mellanox on the system board, which allows for the board to support a single 40 Gb/sec InfiniBand link, two 10 Gigabit Ethernet links, or a mix of one IB and one 10GE port.
The ports are reconfigurable at server boot time, so customers can change which ports they use as workloads change. (They will obviously have to do some recabling in some cases because IB and Ethernet use different ports.) Ed Turkel, manager of worldwide HPC marketing for HP's Enterprise Servers, Storage, and Networking group, says that the on-board Mellanox chip is far less costly than a PCI-Express InfiniBand or 10 Gigabit Ethernet card.
The base SL390s G7 tray server, which comes with the 2.4 GHz, quad-core Xeon E5620 and 6GB of memory, costs $2,259. With two Xeon X5677 (six-core chips that run at 3.46 GHz) and 24GB of memory, you're talking $7,279 per tray for the SL390s G7.
Now for the ceepie-geepie
The third configuration of server tray available for the SL6500 Scalable System is the SL390s G7 2U tray, which adds room for three of Nvidia's fanless M2050 or M2070 GPU co-processors and three PCI-Express 2.0 x16 links directly from the GPUs to the server system board. With the GPU option, you can put four two-socket servers (using six-core processors) and a dozen GPU co-processors into a 4U chassis. Pricing was not available for the server trays with the GPUs.
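The density sums here are easy to check. The sketch below assumes only what the article states: a 4U chassis, half-width trays (so two sit side by side), and three GPUs per 2U tray. The constant and function names are invented for the illustration.

```python
CHASSIS_HEIGHT_U = 4   # s6500 chassis height, from the article
TRAYS_PER_ROW = 2      # trays are half-width, so two fit side by side


def trays_per_chassis(tray_height_u: int) -> int:
    """Number of half-width trays of a given height that fill one chassis."""
    if tray_height_u <= 0 or CHASSIS_HEIGHT_U % tray_height_u != 0:
        raise ValueError("tray height must divide the chassis height evenly")
    return (CHASSIS_HEIGHT_U // tray_height_u) * TRAYS_PER_ROW


def gpus_per_chassis(tray_height_u: int = 2, gpus_per_tray: int = 3) -> int:
    """GPU count for a chassis filled with identical GPU trays."""
    return trays_per_chassis(tray_height_u) * gpus_per_tray


if __name__ == "__main__":
    print("2U GPU trays per chassis:", trays_per_chassis(2))
    print("GPUs per chassis:", gpus_per_chassis())
```

Two rows of two 2U trays gives the four servers and dozen GPUs quoted above. The same sum would put eight 1U trays in a chassis, but that is an inference from the dimensions rather than a figure HP has given here.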
HP did not announce tray servers based on AMD Opterons or its FireStream GPU co-processors, but Turkel said that "we love all of our children" and that while he was not pre-announcing any products, it was reasonable to expect AMD options over time.
Tsubame 2 super detailed
Turkel says this is exactly what the Tokyo Institute of Technology has done to build up its Tsubame 2 CPU-GPU hybrid supercomputer, announced in May. What we didn't know when this machine was launched is that it will consist of slightly over 1,400 of the SL390s 2U nodes with three GPUs per server packed into the cookie sheet chassis. Turkel says that TiTech had some pretty tough space and thermal limitations - only 200 square meters of space and 1.8 megawatts of power to create a petaflops-class super - and that is why HP designed the SL6500s with the GPU options.
When it is fully configured, Tsubame 2 will have an aggregate of 2.4 petaflops of processing power. It will be using the integrated InfiniBand adapter on the server nodes, plus an extra PCI-Express x16 adapter for a second InfiniBand network. Voltaire is supplying the InfiniBand switches linking the nodes together, and DataDirect Networks is supplying 7 PB of Lustre file system storage for the nodes.
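For a rough sense of scale, the Tsubame 2 figures quoted above can be multiplied out. This is a back-of-the-envelope sketch only: it rounds "slightly over 1,400" nodes down to 1,400, treats the 1.8 megawatts as a budget rather than a measured draw, and assumes (which the article does not say) that the whole 2.4 petaflops comes from these nodes. All names in the code are invented for the example.

```python
import math

NODES = 1_400            # "slightly over 1,400" in the article, rounded down
GPUS_PER_NODE = 3        # GPUs per SL390s G7 2U tray
NODES_PER_CHASSIS = 4    # 2U half-width trays in a 4U s6500 chassis
PEAK_PFLOPS = 2.4        # aggregate figure quoted for the full machine
FLOOR_M2 = 200           # floor space limit
POWER_MW = 1.8           # power limit (a budget, not a measurement)


def tsubame2_estimates(nodes: int = NODES) -> dict:
    """Rough estimates derived only from figures quoted in the article."""
    return {
        "gpus": nodes * GPUS_PER_NODE,
        "chassis": math.ceil(nodes / NODES_PER_CHASSIS),
        # Assumes all of the aggregate comes from these nodes.
        "tflops_per_node": PEAK_PFLOPS * 1000 / nodes,
        "kw_per_m2": POWER_MW * 1000 / FLOOR_M2,
    }


if __name__ == "__main__":
    for name, value in tsubame2_estimates().items():
        print(f"{name}: {value:,.2f}")
```

On those assumptions the machine works out to about 4,200 GPUs in about 350 chassis, a little over 1.7 teraflops per node, and a power budget of 9 kilowatts per square meter of floor.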
The Tsubame 2 super is in the final stages of construction now, and you can bet that HP will be working hard with TiTech to get Linpack running on it and certified so the machine can be near the top of the Top 500 supercomputer ranking due at SC10 in November. ®