Dell crafts mother of all graphics cards
16 GPUs jammed in 3U chassis
Here's just what you need to play Crysis. The bespoke server division of Dell, called Data Center Solutions and accounting for a sizeable portion of Dell's quarterly server volumes, has cooked up a PCI-Express expansion chassis that can house up to 16 GPU co-processors.
The new expansion chassis is part of the PowerEdge C class of machines that Dell launched in March. These PowerEdge C machines fall somewhere between the general purpose PowerEdge boxes Dell sells directly and through channel partners and the truly custom boxes made by DCS; they are not sold directly over the web, but rather through a sales engagement.
Generally speaking, the PowerEdge C machines are aimed at hyperscale computing jobs and focus on density and power efficiency. The C1100 is a 1U box and the C2100 is a 2U box, both with two sockets designed to run Intel's latest six-core Xeon 5600 processors; the C6100 puts four half-width two-socket Xeon 5600 systems into a 2U rack chassis, and is the densest rack server that Dell sells - with the possible exception of custom boxes no one but the customer and Dell know about.
The PowerEdge C410x is a PCI-Express expansion chassis that is meant to link to servers over PCI-Express links. According to Joe Sekel, a systems architect for the DCS unit at Dell, the 3U chassis comes with fan-outs and PCI-Express switches that let from one to four GPUs in the chassis be allocated to a single server. Each GPU co-processor is wrapped in a metal skin that Sekel called a "taco," which directs the airflow around the GPU cards so they don't cause each other to overheat.
The C410x chassis puts ten GPUs in the front, where the cold aisle is in a data center, and six more in the back, where the hot aisle is. There is room for eight PCI-Express links and four power supplies (with 3+1 redundancy). The unit has eight fans that are all hot pluggable (7+1 redundancy), and the GPU co-processors themselves are also hot-pluggable once they are encased in the taco enclosure.
Basically, this is a GPU blade server, as you can see below:
Dell's PowerEdge C410x GPU enclosure.
The PowerEdge C410x chassis is certified to use the older Nvidia Tesla M1060 fanless GPU co-processors or the new Tesla M2050 fanless units, which debuted in May using the latest Fermi GPUs.
The fanless design of the M1060 and M2050 relies on big ole heat sinks to take the heat off the GPU, and on the cooling fans of the server - or in this case, PCI expansion chassis - to keep the units from melting spectacularly. Dell has not certified the fanless FireStream GPU co-processors from AMD, launched in June, for the C410x enclosures.
The M2050 GPU co-processor is rated at 515 gigaflops of double-precision and 1.03 teraflops of single-precision floating point performance, which means you can pack up to 16.48 teraflops of single-precision oomph into a single C410x chassis. The M2070 will have the same flops ratings, but will include 6GB of GDDR5 memory against the M2050's 3GB; the M2070 is not yet certified for the C410x because it is not yet shipping from Nvidia.
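For anyone who wants to check the sums, here is a minimal sketch of the chassis-level arithmetic. The per-card ratings and the 16-slot count are the ones quoted above; the function name is made up for illustration, and the double-precision total is derived here rather than quoted by Dell or Nvidia.

```python
# Per-card peak ratings for the Tesla M2050, as quoted above (gigaflops).
M2050_SP_GFLOPS = 1030   # 1.03 teraflops single precision
M2050_DP_GFLOPS = 515    # 515 gigaflops double precision
SLOTS = 16               # GPU slots in one C410x chassis


def chassis_teraflops(per_card_gflops, cards=SLOTS):
    """Peak aggregate rating in teraflops for a chassis full of identical cards."""
    return per_card_gflops * cards / 1000


sp = chassis_teraflops(M2050_SP_GFLOPS)
dp = chassis_teraflops(M2050_DP_GFLOPS)

print(f"Single precision: {sp:.2f} teraflops")  # 16.48
print(f"Double precision: {dp:.2f} teraflops")  # 8.24 (derived, not a quoted figure)
```

These are peak ratings multiplied out, not measured throughput on any real workload.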
The AMD FireStream 9350 fanless GPU co-processor is rated at two teraflops SP and 400 gigaflops DP, and is something some customers probably want. The FireStream 9370 is a double-wide card rated at 2.64 teraflops SP and 528 gigaflops DP, and has 4GB of GDDR5 memory - twice that of the FireStream 9350.
At the moment, Dell is pitching the C410x as a companion to the C6100 four-server rack mounted machine, which would allow four GPUs to be linked to four servers, or one per Xeon CPU socket. Sekel says that the reason Dell created the C410x is that an oil and gas customer doing seismic processing wasn't sure if it needed two, three, or four GPUs per physical server to run its workloads. So Dell built a box that could dynamically allocate GPUs to machines.
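To make the allocation idea concrete, here is a toy sketch that models only the two limits given in this article: 16 slots in the chassis and between one and four GPUs per server. The function name, the server names, and the sequential slot numbering are all hypothetical, and nothing here reflects how Dell's PCI-Express switches or management software actually assign devices.

```python
SLOTS = 16                # GPU slots in the chassis, per the article
MAX_GPUS_PER_SERVER = 4   # upper limit per server, per the article


def allocate(requests):
    """Map each server to a list of slot numbers.

    `requests` maps a (hypothetical) server name to the number of GPUs it
    wants. Slot numbers 1..16 are an illustrative convention only.
    """
    for server, count in requests.items():
        if not 1 <= count <= MAX_GPUS_PER_SERVER:
            raise ValueError(f"{server}: wants {count} GPUs, allowed range is 1-4")
    if sum(requests.values()) > SLOTS:
        raise ValueError("requests exceed the 16 slots in the chassis")

    allocation = {}
    next_slot = 1
    for server, count in requests.items():
        # Hand out the next free run of slots to this server.
        allocation[server] = list(range(next_slot, next_slot + count))
        next_slot += count
    return allocation


# The seismic shop's dilemma: two, three, or four GPUs per server.
plan = allocate({"node1": 2, "node2": 3, "node3": 4, "node4": 4})
for server, slots in plan.items():
    print(server, slots)
```

Changing the mix is just a matter of calling `allocate` again with different counts, which is the flexibility the oil and gas customer was after.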
The C410x is not restricted to GPU co-processors. Any PCI-Express 2.0 x16 device can, in theory, be wrapped in the metal taco, made hot-pluggable, and slid into the chassis if customers need it. Sekel said one obvious candidate was a Fusion-io PCI-Express flash drive, but any number of other PCI-Express peripherals would do.
The only requirement is that each PCI-Express module has to burn 225 watts or less. The C410x uses 1,400 watt power supplies and has a maximum draw of 3,600 watts from the PCI-Express slots. With three of the four supplies carrying the load and the fourth held as a spare, that is 4,200 watts of supply. That's a pretty tight power budget, with only 600 watts of headroom to spare for inefficiency, switching, fans, and der blinken lights.
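The power budget works out as follows. This is a back-of-the-envelope sketch using only the figures quoted in this article, and it assumes the redundant supply in the 3+1 arrangement is held in reserve and not counted toward usable capacity.

```python
PSU_WATTS = 1400             # rating of each power supply
PSUS_INSTALLED = 4           # four supplies in the chassis
PSUS_REDUNDANT = 1           # 3+1 redundancy: one is a spare (assumption: not counted)
SLOTS = 16                   # PCI-Express module slots
MAX_WATTS_PER_MODULE = 225   # per-module ceiling

usable_watts = PSU_WATTS * (PSUS_INSTALLED - PSUS_REDUNDANT)
slot_watts = SLOTS * MAX_WATTS_PER_MODULE
headroom_watts = usable_watts - slot_watts

print(f"Usable supply: {usable_watts} W")    # 4200 W
print(f"Slots at max:  {slot_watts} W")      # 3600 W
print(f"Headroom:      {headroom_watts} W")  # 600 W
```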
The PowerEdge C410x chassis is available now. Dell does not give out pricing for any of the PowerEdge C class iron. ®