Original URL: http://www.theregister.co.uk/2004/10/05/cray_xd1/
Cray comes to market with XD1
Mid-range slugger conquers cluster
Cray yesterday announced the general availability of a new family of AMD Opteron-based supercomputers. Cray XD1 mini-supercomputer systems are priced from under $100,000 to about $2m (US list price), placing them in the mid-range system category. XD1s run Linux but can outperform similarly priced Linux clusters, thanks to a "superior parallel-processing architecture", Cray says.
The XD1, based on technology acquired by Cray when it bought start-up OctigaBay, is designed to help Cray compete at the lower end of the supercomputer market against companies such as IBM and HP. Cray pitches the system as suitable for a wide range of high-performance computing (HPC) applications.
Early customers include the Pacific Northwest National Laboratory (PNNL), Germany's Helmut Schmidt University, the Saha Institute of Nuclear Physics (Calcutta, India) and the US Department of Agriculture Forest Service. The Forest Service, for example, will use the number-crunching power of the XD1 to work out the chemical composition of smoke plumes.
The Cray XD1 features a "direct connect processor (DCP) architecture, which removes PCI bottlenecks and memory contention to deliver superior sustained performance". According to HPC Challenge benchmarks, the Cray XD1 has the lowest latency of any HPC system, with MPI latency of 1.8 microseconds and random ring latency of 1.3 microseconds.
Tests conducted by the Ohio Supercomputer Center show that the Cray XD1 ships messages with MPI latency four times lower than common cluster interconnects such as InfiniBand, and 30 times lower than Gigabit Ethernet as used in the lowest-cost clusters. The Cray XD1's interconnect delivers twice the bandwidth of 4X InfiniBand for messages up to 1 KB and 60 percent higher throughput for very large messages.
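To put those ratios in absolute terms, here is a rough back-of-the-envelope sketch. It assumes, and this is our assumption rather than anything Cray or the Ohio Supercomputer Center stated, that the 1.8-microsecond HPC Challenge figure can be used as the baseline for the Ohio ratios, even though the two numbers come from different tests. The function and variable names are ours.

```python
# Baseline MPI latency quoted from the HPC Challenge results (microseconds).
XD1_MPI_LATENCY_US = 1.8

def implied_latency_us(xd1_latency_us, times_lower):
    """Latency of a rival interconnect if the XD1 is `times_lower` times lower."""
    return xd1_latency_us * times_lower

# Ratios quoted from the Ohio Supercomputer Center tests.
infiniband_us = implied_latency_us(XD1_MPI_LATENCY_US, 4)
gigabit_ethernet_us = implied_latency_us(XD1_MPI_LATENCY_US, 30)

print(f"InfiniBand (implied):       {infiniband_us:.1f} microseconds")
print(f"Gigabit Ethernet (implied): {gigabit_ethernet_us:.1f} microseconds")
```

On those assumptions the rivals would sit at roughly 7 microseconds and 54 microseconds respectively. Treat these as illustrative only, not as measured results.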
The Linux/Opteron system runs 32-bit and 64-bit x86 codes. Field programmable gate arrays (FPGAs) are available to accelerate applications. A single chassis can house up to 12 processors, delivering 58 peak gigaflops, 96 GB/second aggregate switching capacity, 1.8-microsecond MPI interprocessor latency, 84 GB maximum memory and 1.5 TB maximum disk storage. A 12-chassis rack provides 144 compute processors, 691 peak gigaflops, 1 TB/second aggregate switching capacity, two-microsecond MPI interprocessor latency, 922 GB/second aggregate memory bandwidth, 1 TB maximum memory and 18 TB maximum disk storage. ®
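The rack figures are, for the most part, the chassis figures multiplied by 12. A minimal sketch of that arithmetic follows; the dictionary keys and the function name are ours, and the inputs are simply the per-chassis numbers quoted above.

```python
CHASSIS_PER_RACK = 12

# Per-chassis maximums as quoted in the article.
chassis = {
    "processors": 12,
    "peak_gflops": 58,
    "switching_gb_per_s": 96,
    "memory_gb": 84,
    "disk_tb": 1.5,
}

def scale_to_rack(per_chassis, count=CHASSIS_PER_RACK):
    """Multiply every per-chassis figure by the number of chassis in a rack."""
    return {name: value * count for name, value in per_chassis.items()}

rack = scale_to_rack(chassis)
for name, value in rack.items():
    print(f"{name}: {value}")
```

Straight multiplication gives 144 processors, 1,152 GB/second of switching capacity (about 1 TB/second), 1,008 GB of memory (about 1 TB) and 18 TB of disk, all in line with the quoted rack figures. Peak gigaflops come out at 696 rather than 691, which suggests the per-chassis 58 is itself rounded: 691 divided by 144 is roughly 4.8 gigaflops per processor, or about 57.6 per chassis.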