The Register® — Biting the hand that feeds IT

Explicit shots: China's gorgeous flop-tastic Tianhe-2 supercomputer

The Sky River sequel has Intel – and a whole lot of noodles – Inside

Pics Some of the feeds and speeds of the Chinese government's Tianhe-2 massively parallel ceepie-phibie supercomputer leaked out in May, and even more came out a week later, ahead of the planned big splash at the International Supercomputing Conference shindig in Leipzig, Germany. But El Reg has some juicy pics of some of the key components for you to ogle.

Many of the details about the machine – particularly relating to the upgraded "Arch" TH Express-2 interconnect that lashes the 16,000 compute nodes in the machine together – remain obscure. But feast your eyes on these.

First, here is a picture of the Tianhe-2 chassis. As we previously explained, based on a report of the machine put together by Jack Dongarra, a professor at the University of Tennessee and one of the stewards of the Linpack supercomputer benchmark, the Chinese government's National University of Defense Technology has done a bit of integrating with the updated "Sky River" machine. (Sky River is what Tianhe means when translated to English, and it is what we in the West call the Milky Way when we look to the night sky.)

The Tianhe-2 server chassis

With Tianhe-2, two Arch-2 network interface chips and two "Ivy Bridge-EP" Xeon E5 compute nodes (each with two processor sockets) sit on a single circuit board, even though the nodes are logically distinct. These two compute nodes plus one Xeon Phi coprocessor share the left half of the chassis, and five Xeon Phis share the right half. The two halves can be electrically separated and pulled out separately for maintenance.
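The layout described above is easy to sanity-check with a bit of arithmetic. The sketch below only uses figures quoted in this article (16,000 nodes, two sockets per node, two nodes per board, one Xeon Phi on one half and five on the other); the constant and variable names are made up for illustration, and the totals are derived values rather than official counts.

```python
# Figures quoted in the article; the names are illustrative only.
TOTAL_NODES = 16_000       # compute nodes in the whole machine
SOCKETS_PER_NODE = 2       # Xeon E5 sockets per compute node
NODES_PER_BOARD = 2        # compute nodes sharing one circuit board
PHIS_LEFT_HALF = 1         # Xeon Phi cards alongside the CPUs
PHIS_RIGHT_HALF = 5        # Xeon Phi cards on the other half

# Six coprocessors spread over two nodes gives the per-node count.
phis_per_board = PHIS_LEFT_HALF + PHIS_RIGHT_HALF
phis_per_node = phis_per_board // NODES_PER_BOARD

# Scale up to the full machine.
total_boards = TOTAL_NODES // NODES_PER_BOARD
total_xeon_sockets = TOTAL_NODES * SOCKETS_PER_NODE
total_xeon_phis = TOTAL_NODES * phis_per_node

print(f"Xeon Phis per node: {phis_per_node}")
print(f"Boards:             {total_boards}")
print(f"Xeon E5 sockets:    {total_xeon_sockets}")
print(f"Xeon Phi cards:     {total_xeon_phis}")
```

The per-node result lines up with the three Xeon Phi coprocessors per node mentioned elsewhere in this story.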

The Arch-2 NICs link to the Xeon E5 chipset through PCI-Express 2.0 ports on the NIC, which is unfortunate given the doubling of bandwidth with the move to PCI-Express 3.0 slots. (Maybe that is coming with the Arch-3 interconnect, if there is one on the whiteboard at NUDT?) There's one Arch-2 NIC per compute node; the three Xeon Phi coprocessors for each node link over three PCI-Express 3.0 x16 ports to the CPUs. Yup, the Xeon Phis can talk faster to the CPU than the CPU can talk to the Arch-2 interface. It is unknown how this imbalance might affect the performance of Tianhe-2.
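To put rough numbers on that imbalance, here is a back-of-the-envelope sketch using the published PCI-Express signalling rates and line encodings (5 GT/sec with 8b/10b for 2.0, 8 GT/sec with 128b/130b for 3.0). The x16 width for the Xeon Phi links comes from the text above; the lane count on the Arch-2 NIC is not stated anywhere here, so `ASSUMED_NIC_LANES` is purely an assumption, and protocol overhead beyond line encoding is ignored.

```python
def pcie_lane_gbytes_per_sec(gigatransfers: float, payload_bits: int, line_bits: int) -> float:
    """Peak one-way bandwidth of one PCIe lane in GB/sec, after line encoding."""
    return gigatransfers * payload_bits / line_bits / 8

# PCIe 2.0: 5 GT/sec per lane, 8b/10b encoding.
PCIE2_LANE = pcie_lane_gbytes_per_sec(5.0, 8, 10)
# PCIe 3.0: 8 GT/sec per lane, 128b/130b encoding.
PCIE3_LANE = pcie_lane_gbytes_per_sec(8.0, 128, 130)

# Each Xeon Phi hangs off a PCIe 3.0 x16 port (from the article).
phi_link_gbs = 16 * PCIE3_LANE

# ASSUMPTION: the article does not give the NIC's lane count.
# x16 is used here only so the two links can be compared lane for lane.
ASSUMED_NIC_LANES = 16
nic_link_gbs = ASSUMED_NIC_LANES * PCIE2_LANE

print(f"PCIe 2.0 per lane: {PCIE2_LANE:.3f} GB/sec")
print(f"PCIe 3.0 per lane: {PCIE3_LANE:.3f} GB/sec")
print(f"Xeon Phi x16 link: {phi_link_gbs:.2f} GB/sec")
print(f"NIC link (assumed x{ASSUMED_NIC_LANES}): {nic_link_gbs:.2f} GB/sec")
print(f"Per-lane gen3/gen2 ratio: {PCIE3_LANE / PCIE2_LANE:.2f}")
```

The per-lane ratio works out to just under two, which is the "doubling" referred to above. If the NIC uses fewer lanes than assumed, the gap widens accordingly.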

Take a gander at the massive switch backplane circuit board for Tianhe-2:

The switch backplane for the Tianhe-2 supercomputer

This Arch-2 switch backplane has ports on both sides and it has signals that run at multiple-gigahertz speeds. The ports on the Arch-2 NICs can run at 10Gb/sec or 14Gb/sec. The shiny ports on the Switch RSW Blades below slot into the black ports on the switch backplane and comprise the local Arch-2 interconnect for a group of nodes in the rack.

The RSW switch blade for Tianhe-2

One set of RSW switches is rotated 90 degrees in parts of the system for reasons that don't make sense to me – yet. But here is how the components plug together:

How the compute nodes, switch, and backplane come together in Tianhe-2

Eight of the ports on the RSW Switch Blade link to four compute drawers (with a total of eight Arch-2 ports) and it looks like the remaining four ports are used to link out to the 576-port switches that represent the aggregation layer in the Arch-2 network. The blades that implement this aggregation layer are called the Switch LSW Blade, and this is what they look like:

The LSW switch blade for Tianhe-2

The Arch-2 interconnect has thirteen of these 576-port monsters, which appear to be made from many of these Switch LSW Blades. These switches use an optoelectronic transport technology developed by NUDT as well as a proprietary network protocol. And like all supercomputers, the switching gets a bit messy, particularly when you are linking together 16,000 nodes.
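For anyone keeping score on the port counts, the figures quoted above can be tallied as follows. This is a rough sketch, not a description of the actual Arch-2 topology: the four uplinks per RSW blade are this story's best guess from the photos, the ratio assumes every port runs at the same speed, and all names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the article; the names are illustrative only.
RSW_DOWNLINK_PORTS = 8      # RSW ports facing the compute drawers
RSW_UPLINK_PORTS = 4        # RSW ports apparently facing the aggregation layer (a guess)
DRAWERS_PER_RSW = 4         # compute drawers served by one RSW blade
AGG_SWITCHES = 13           # 576-port aggregation switches
PORTS_PER_AGG_SWITCH = 576

# Each drawer carries two compute nodes, each with its own Arch-2 NIC.
arch2_ports_per_drawer = RSW_DOWNLINK_PORTS // DRAWERS_PER_RSW

# Total ports on one RSW blade, as described.
rsw_ports_total = RSW_DOWNLINK_PORTS + RSW_UPLINK_PORTS

# Downlink-to-uplink ratio, ASSUMING all ports run at the same speed.
down_to_up = Fraction(RSW_DOWNLINK_PORTS, RSW_UPLINK_PORTS)

# Ports available across the whole aggregation layer.
agg_ports_total = AGG_SWITCHES * PORTS_PER_AGG_SWITCH

print(f"Arch-2 ports per drawer: {arch2_ports_per_drawer}")
print(f"Ports per RSW blade:     {rsw_ports_total}")
print(f"Downlink:uplink ratio:   {down_to_up.numerator}:{down_to_up.denominator}")
print(f"Aggregation-layer ports: {agg_ports_total}")
```

How those aggregation ports are actually split between RSW uplinks and links to other switches is not something the available details settle.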

In China, presumably they call a tangle of cables noodles, not spaghetti

I just want the sales commission on the cable sales. ®
