Cray flogs X1 supercomputer
Aims for HPC Glory
If there is one thing that Seattle-based Cray Inc wants to do besides make a lot more money in the supercomputing market, it is to live up to the engineering genius of Seymour Cray, arguably the best HPC computer designer and visionary the world has ever seen, writes Timothy Prickett Morgan.
A Cray computer used to be a serious sign of status among spook agencies, government research facilities, and those few commercial entities that could afford such exotic gear. The world has changed a lot since the heyday of the vector supercomputer two decades ago, but the corporate and national pride that comes from having a unique approach to supercomputing, one that gives a company an edge over others, is one of the constants in the computer business. If the Cray X1, which was announced on schedule yesterday, performs as designed - and the early indications seem to be that it indeed does - then the battered company that is an amalgam of the former Cray and Tera Computer will once again be able to stand tall, chase hundreds of millions of dollars of HPC budgets, and live up to the high expectations that Cray, the man, always had for Cray, the company.
To be sure, Cray does not have all of its eggs in one basket. It is designing a 40 teraflops Linux cluster called "Red Storm" for Sandia National Laboratories, based on AMD 64-bit Hammer processors, that is supposed to be able to scale to 100 teraflops. It also has a reseller agreement with Japanese rival NEC Corp to sell its SX-6 vector machines in North America, a reseller agreement with Dell Computer Corp to peddle Linux-based clusters based on Dell's PowerEdge servers, and has the Tera MTA transputer and the Cray T3E massively parallel supercomputer as alternative platforms. But the X1, formerly known as the SV2, is the crown jewel, and it is the platform on which Cray is staking its reputation as it promises to build a system capable of delivering petaflops - that's millions of gigaflops - of aggregate processing power by 2010.
The Cray X1 is a massively parallel supercomputer that is based on a variant of the vector processors that Cray is famous for. The X1 marries a MultiStreaming Processor (MSP) - Cray's name for a collection of CMOS-based vector processors that are linked to create a virtual and much more powerful vector processor - to a distributed-shared memory architecture. Each X1 cabinet has four MSP nodes, each comprising four 800MHz processors, and from 64GB to 256GB of memory that is distributed to each of the processors. This memory is actually Rambus DRAM, manufactured by Samsung. Each 800MHz processor in the MSP is rated at 12.8 gigaflops, which suggests that each processor can process a peak 16 floating point operations per clock cycle. The I/O subsystem in the X1 is based on Sun Microsystems Inc's Sun Fire 6800 servers, oddly enough, and that I/O subsystem was tested more than a year ago at the Ohio Supercomputing Center at Ohio State University in the States. Each cabinet has a rating of about 205 gigaflops, and Cray says pricing starts at $2.5m for this base machine. A typical X1 configuration is expected to cost from $5m to $40m.
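The per-clock and per-cabinet figures above follow from simple arithmetic, which can be sketched as below. This is only an illustration using the numbers quoted in the article; the constant and function names are made up for the example and do not come from Cray.

```python
# Figures as quoted in the article; all names here are illustrative only.
CLOCK_HZ = 800e6                   # 800MHz processor clock
PEAK_FLOPS_PER_PROCESSOR = 12.8e9  # 12.8 gigaflops peak per processor
NODES_PER_CABINET = 4              # four MSP nodes per base cabinet
PROCESSORS_PER_NODE = 4            # four processors per node


def flops_per_cycle(peak_flops: float, clock_hz: float) -> float:
    """Peak floating point operations completed per clock cycle."""
    return peak_flops / clock_hz


def cabinet_peak_gigaflops() -> float:
    """Peak rating of one base cabinet, in gigaflops."""
    processors = NODES_PER_CABINET * PROCESSORS_PER_NODE
    return processors * PEAK_FLOPS_PER_PROCESSOR / 1e9


if __name__ == "__main__":
    per_cycle = flops_per_cycle(PEAK_FLOPS_PER_PROCESSOR, CLOCK_HZ)
    print(f"Peak operations per clock: {per_cycle:.0f}")
    print(f"Base cabinet peak: {cabinet_peak_gigaflops():.1f} gigaflops")
```

Sixteen processors at 12.8 gigaflops apiece is where the "about 205 gigaflops" cabinet rating comes from.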
A fully loaded X1 machine has 64 such cabinets, 1,024 MSPs, and 4,096 processors, with shared memory ranging from 16TB to 64TB. Such a monstrous Cray X1 would be rated at a whopping 52.4 teraflops of computing power. If such a machine were built today, it would be the fastest supercomputer in the world in terms of peak processing power, and probably in terms of actual processing power as well, since Cray's whole point with the X1 was to design a parallel supercomputer that was not as inefficient as RISC/Unix machines lashed together with high-speed switches. Cray says that its system interconnect is faster than the alternatives in the Unix MPP market, but did not provide specifics on how it accomplished this feat. What it does say is that the interconnect is based on a modified 3D torus topology that has 400GB/sec of aggregate bandwidth on a 16-node, 64-processor X1 configuration. Each MSP is equipped with a system port channel, which has a 1.2GB/sec peak I/O bandwidth. Peak memory bandwidth is 38.4GB/sec and peak processor cache bandwidth is double that at 76.8GB/sec.
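As a rough check on the headline number, the 52.4 teraflops peak can be reproduced from the processor count and the 12.8 gigaflops per-processor rating given earlier. The sketch below takes the 4,096-processor count as given rather than multiplying up from the base cabinet rating, and the names in it are illustrative, not Cray's.

```python
# Figures as quoted in the article; all names here are illustrative only.
PEAK_GFLOPS_PER_PROCESSOR = 12.8  # per-processor peak rating
MAX_PROCESSORS = 4096             # processors in a fully loaded X1
MEMORY_BW_GB_S = 38.4             # quoted peak memory bandwidth
CACHE_BW_GB_S = 76.8              # quoted peak processor cache bandwidth


def system_peak_teraflops(processors: int, gflops_each: float) -> float:
    """Aggregate peak rating in teraflops (1 teraflops = 1,000 gigaflops)."""
    return processors * gflops_each / 1000.0


def bandwidth_ratio(cache_gb_s: float, memory_gb_s: float) -> float:
    """How many times faster the cache path is than the memory path."""
    return cache_gb_s / memory_gb_s


if __name__ == "__main__":
    peak = system_peak_teraflops(MAX_PROCESSORS, PEAK_GFLOPS_PER_PROCESSOR)
    ratio = bandwidth_ratio(CACHE_BW_GB_S, MEMORY_BW_GB_S)
    print(f"Fully loaded peak: {peak:.1f} teraflops")
    print(f"Cache-to-memory bandwidth ratio: {ratio:.1f}x")
```

These are peak figures only; they say nothing about the sustained performance Cray is counting on to justify the machine.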
A fully loaded X1 machine would reportedly cost in the range of $200m to $300m, which is a premium compared to Unix-based MPP machines like IBM Corp's pSeries 690 and HP's AlphaServers, but Cray is clearly expecting the actual performance of the X1 to justify the higher price.
Five X1 machines have already been tested by various customers, including the U.S. government. Just last week, Cray won an $8.4m multi-year order from Spain's National Institute of Meteorology (INM) for a Cray X1, which will increase that country's weather forecasting processing capabilities by a factor of 255 when the X1 machine is installed in mid-2003. In the meantime, INM is taking a placeholder SV1. Cray says that it will ship the X1 machines to customers before year's end, and that it expects the machine to contribute mightily to its 2003 financial results.