Tokyo Tech dumps Sun super iron for HP, NEC
Embraces GPUs for Tsubame 2.0
The Tokyo Institute of Technology, which put Sun Microsystems back on the HPC map, along with floating point accelerator maker ClearSpeed, in the summer of 2006 with the 87 teraflops Tsubame 1.0 supercomputing cluster, has decided to go with different vendors and technologies for its next-generation 2.4 petaflops Tsubame 2.0 machine.
Japanese server maker NEC, which has backed away from making massively parallel vector processors, is still the general contractor on the new Tsubame system, as it was on the one built four years ago using Sun Opteron servers fitted with ClearSpeed accelerators. But this time around, according to an announcement that Tokyo Tech made this week, it's not using Sun or ClearSpeed.
Instead of going with Oracle/Sun iron, Tokyo Tech's Global Scientific Information and Computing Center has, according to a translation of the announcement from Japanese into English compliments of Google, chosen servers made by Hewlett-Packard for a fair portion of the boxes that make up the Tsubame 2.0 cluster.
NEC is not a volume x64 rack or blade server player, but it does make vector supers and large SMP boxes based on Intel's Itanium and now "Nehalem-EX" Xeon 7500 processors. When its HPC customers want x64 boxes, therefore, NEC needs to partner to deliver the goods. That is why NEC tag-teamed with Sun for Tsubame 1.0 and why it is partnering with HP for Tsubame 2.0.
Tokyo Tech says that the Tsubame 2.0 super will have 1,400 compute nodes using six-core "Westmere-EP" Xeon 5600 processors in two-socket machines as well as eight-core Xeon 7500 processors in what everyone assumes will be large memory nodes. Tokyo Tech is also plunking 4,200 of Nvidia's latest embedded M2050 graphics co-processors into some of the nodes.
If you assume that the M2050s are going into the Xeon 5600 machines with four per box, that means there will be 1,050 two-socket nodes, leaving another 350 nodes for large-memory Xeon 7500 nodes. (Tokyo Tech did not provide the exact feeds and speeds of the machine, but will do so in mid-June.) The announcement says that the Tsubame 2.0 machine will have approximately 17,000 cores, and the configuration above would mean 12,600 of them (1,050 nodes times two sockets times six cores) come from Xeon 5600 machines, leaving the remainder to be in 350 nodes using Nehalem-EX processors.
Assuming NEC wants to deploy two-socket servers using the special HPC variants of the Nehalem-EX chips, called the Xeon 6500s, and picks the six-core versions of those chips, that's another 4,200 cores (350 nodes times two sockets times six cores). It works out to a total of 16,800 cores. The point of using either Xeon 7500 or Xeon 6500 chips even in a two-socket server is that each node can address a lot more memory than is possible with the Xeon 5600 processors.
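For anyone who wants to check the math, here is a minimal Python sketch of the estimate above. The node and GPU totals come from the announcement; the four-GPUs-per-box figure and the two-socket, six-core layout for every node are this article's guesses, not confirmed specs, and the function and variable names are made up for illustration.

```python
# Back-of-the-envelope core count for the guessed Tsubame 2.0 configuration.
TOTAL_NODES = 1_400      # from the announcement
TOTAL_GPUS = 4_200       # from the announcement
GPUS_PER_NODE = 4        # assumption: four M2050s per Xeon 5600 box
SOCKETS_PER_NODE = 2     # assumption: every node is a two-socket machine
CORES_PER_SOCKET = 6     # assumption: six-core chips in both node types


def estimate_cores(total_nodes, total_gpus, gpus_per_node,
                   sockets_per_node, cores_per_socket):
    """Split the nodes into GPU and large-memory groups and count cores."""
    gpu_nodes = total_gpus // gpus_per_node
    big_memory_nodes = total_nodes - gpu_nodes
    cores_per_node = sockets_per_node * cores_per_socket
    return {
        "gpu_nodes": gpu_nodes,
        "big_memory_nodes": big_memory_nodes,
        "gpu_node_cores": gpu_nodes * cores_per_node,
        "big_memory_cores": big_memory_nodes * cores_per_node,
        "total_cores": total_nodes * cores_per_node,
    }


estimate = estimate_cores(TOTAL_NODES, TOTAL_GPUS, GPUS_PER_NODE,
                          SOCKETS_PER_NODE, CORES_PER_SOCKET)
for name, value in estimate.items():
    print(f"{name}: {value:,}")
```

The result depends heavily on the GPU density guess: setting `GPUS_PER_NODE` to three would use up all 1,400 nodes and leave none for the large-memory boxes.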
As El Reg reported last fall, NEC and Intel had partnered to try to bring a special six-core HPC variant of the Nehalem-EX chip to market running at a higher clock speed than standard parts. Such a chip has not yet been announced - the Xeon 6500s run at 2 GHz with six cores or eight cores, with thermal profiles of 105 watts and 130 watts, respectively. Intel might have wanted a higher clock speed, and it may yet deliver one this fall when Tsubame 2.0 is built, but the generic Xeon 7500 tops out at 2.26 GHz with eight cores and 2.66 GHz with six cores.
The Xeon 6500 chips differ from the Xeon 7500s in that they can only be used in two-socket boxes, and because they are aimed at HPC customers, they have somewhat lower prices (between 9.8 and 13.5 per cent lower, depending on the chips you compare).
The Tsubame 2.0 will use storage arrays from DataDirect Networks (most likely the S2A9900 arrays and Lustre parallel file system that HP just certified for its HPC clusters this week), and it will use Voltaire's quad-data rate (40 Gb/sec) InfiniBand switches to lash the server nodes together. The supercomputer will also have solid state disks, but exactly where these will be embedded in the Tsubame 2.0 system has not been divulged.
Tokyo Tech says that the Tsubame 2.0 cluster will employ server virtualization and support both Linux and Windows HPC Server 2008. ®